Data (Abstract) Examples in Math

Start with the recap, study the fully worked examples, then use the practice problems to check your understanding of Data (Abstract).

This page combines explanation, solved examples, and follow-up practice so you can move from recognition to confident problem-solving in Math.

Concept Recap

Data is a collection of recorded observations or measurements used to describe, analyze, or make inferences about a phenomenon or population.

Data is raw material for understanding—numbers, words, or categories we collect to answer questions.


How to Use These Examples

  • Read the first worked example with the solution open so the structure is clear.
  • Try the practice problems before revealing each solution.
  • Use the related concepts and background knowledge badges if you feel stuck.

What to Focus On

Core idea: Data is the collection of observations or measurements you gather to answer a question about a group.

Common stuck point: Working with data is the easy part; the trap is treating a summary, such as the mean, as if it were the data itself. Asking "Is this the raw set of recorded observations, before any summary or conclusion?" first keeps a correct-looking calculation from being attached to the wrong concept.

Sense of Study hint: Ask: Is this the raw set of recorded observations, before any summary or conclusion?
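
To make the stuck point concrete, here is a minimal Python sketch. The two groups of quiz scores are invented for illustration and do not come from any real data set. It shows that two different raw data sets can share the same mean, so the summary alone cannot stand in for the data.

```python
from statistics import mean

# Hypothetical quiz scores for two small groups (invented values).
group_a = [70, 70, 70, 70]
group_b = [40, 60, 80, 100]

# The mean is a summary computed FROM the data; it is not the data.
mean_a = mean(group_a)
mean_b = mean(group_b)

# Same summary, different raw observations.
same_summary = mean_a == mean_b
same_data = group_a == group_b

print("means equal:", same_summary)
print("raw data equal:", same_data)
```

If only the mean were reported, the spread in the second group would be invisible, which is why the raw observations are worth checking first.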

Worked Examples

Example 1

easy
A researcher records the following about 5 students: name, age, GPA, and favorite color. Classify each variable as quantitative or categorical, and explain the difference.

Answer

Quantitative: age, GPA. Categorical: name, favorite color.

First step

  1. Name: categorical; labels, not numeric quantities

Full solution

  2. Age: quantitative; numeric, can be averaged (e.g., mean age = 17.4)
  3. GPA: quantitative; numeric, arithmetic operations are meaningful
  4. Favorite color: categorical; labels with no inherent numeric order
  5. Key distinction: quantitative variables measure amounts; categorical variables classify into groups

Data abstraction begins with identifying the type of each variable. Quantitative data supports arithmetic operations; categorical data supports counting and proportions. Applying the wrong analysis to the wrong type leads to meaningless results.
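
The distinction can be sketched in Python. The five student records below are hypothetical, chosen only so that the mean age matches the 17.4 used as an illustration in the solution. The sketch averages a quantitative variable and counts a categorical one, which is one reasonable way to match the analysis to the variable type.

```python
from collections import Counter
from statistics import mean

# Hypothetical records for five students (all values invented).
students = [
    {"name": "Ana", "age": 17, "gpa": 3.6, "favorite_color": "blue"},
    {"name": "Ben", "age": 18, "gpa": 3.1, "favorite_color": "green"},
    {"name": "Cho", "age": 17, "gpa": 3.9, "favorite_color": "blue"},
    {"name": "Dev", "age": 18, "gpa": 2.8, "favorite_color": "red"},
    {"name": "Eli", "age": 17, "gpa": 3.4, "favorite_color": "blue"},
]

# Quantitative variable: arithmetic such as a mean is meaningful.
ages = [s["age"] for s in students]
mean_age = mean(ages)

# Categorical variable: count each label, then convert counts to proportions.
color_counts = Counter(s["favorite_color"] for s in students)
color_proportions = {
    color: count / len(students) for color, count in color_counts.items()
}

print("mean age:", mean_age)
print("color counts:", dict(color_counts))
print("color proportions:", color_proportions)
```

Trying to average the favorite_color column would fail or produce nonsense, which mirrors the warning about applying the wrong analysis to the wrong type.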

Example 2

medium
A survey asks: (1) What is your ZIP code? (2) How many hours do you sleep per night? (3) Rate your satisfaction 1–5. Classify each and identify potential misclassification pitfalls.

Example 3

easy
Look at three columns of a spreadsheet: 'age', 'eye color', 'shoe size'. Which are numerical and which are categorical?

Example 4

medium
A spreadsheet has 200 rows and 5 columns. How many data values does it hold, and what do rows and columns represent?

Example 5

hard
Pick the best summary for each variable: (a) eye color across 50 students, (b) heights of 50 students, (c) satisfaction (1-5) of 50 students.

Example 6

challenge
You want to study TV-watching habits in a city. Compare these two data-collection plans: (Plan A) survey 100 people leaving a movie theater; (Plan B) randomly call 100 phone numbers in the city. Which is better, and why?

Practice Problems

Try these problems on your own first, then open the solution to compare your method.

Example 1

easy
Classify each variable: (a) blood type (A, B, AB, O), (b) temperature in Celsius, (c) jersey number, (d) number of goals scored.

Example 2

medium
A data set contains 1000 rows and 8 columns. Explain what a row and a column represent, and define the terms 'observation', 'variable', and 'case' in statistical context.

Example 3

easy
A survey records each student's favorite color (red, blue, green). Is this data numerical or categorical?

Example 4

easy
Heights of students in centimeters are recorded. Is this numerical or categorical data?

Example 5

easy
Data is best described as what?

Example 6

easy
Yes/no answers to 'Do you own a pet?' are collected. What type of data is this?

Example 7

easy
A pollster collects 1,000,000 responses, but only from one biased website. Does the large size guarantee good conclusions?

Example 8

easy
Before analyzing a data set, what crucial thing should you understand about it?

Example 9

easy
Test scores of 85, 90, 78 are recorded. What type of data are these values?

Example 10

easy
ZIP codes are stored as numbers like 90210. Should they be treated as numerical data for averaging?

Example 11

medium
A researcher wants to study average commute time. They survey only people leaving a train station at 8 a.m. Name the data problem and its effect.

Example 12

medium
Classify each: (a) eye color, (b) number of siblings, (c) shirt size labeled S/M/L. Which is numerical?

Example 13

medium
A dataset of customer reviews contains both star ratings (1-5) and written comments. Which part is numerical and which is text/categorical?

Example 14

medium
A study claims a strong link between ice cream sales and drowning. The data was observational. Can it conclude ice cream causes drowning?

Example 15

medium
A spreadsheet lists temperatures but some cells are blank. What should an analyst do before computing the mean?

Example 16

medium
Population vs sample: a researcher measures all 30 students in one class to learn about the class. Is this a population or a sample?

Example 17

medium
A dataset records survey responses on a 1-5 agreement scale. Why is reporting the median often safer than the mean here?

Example 18

medium
Two datasets answer 'How tall are students?': one lists exact heights, the other lists 'short/medium/tall'. Which supports computing a mean height?

Example 19

challenge
A company reports its 'average' customer age as 35 from sign-up forms, but 40% of customers left the age field blank. Explain why the reported average may be untrustworthy.

Example 20

challenge
Classify the variable 'temperature in Celsius' and explain why ratios like '20 degrees is twice as warm as 10 degrees' are invalid.

Example 21

challenge
A dataset combines heights measured in inches (US branch) and centimeters (EU branch) in one column without labels. Why is this dataset unusable as-is, and what fixes it?

Example 22

medium
A dataset stores 'number of children' (0,1,2,...) and 'marital status' (single/married). Which variable can have a meaningful mean?

Example 23

easy
A class records each student's favorite fruit (apple, banana, orange). Is this numerical or categorical data?

Example 24

easy
True or false: a list of telephone numbers is numerical data you can average.

Example 25

easy
A pollster surveys 20 students out of 600. The 20 are a ___ and the 600 are the ___ .

Example 26

easy
A survey asks 'Are you a vegetarian?' and records yes/no. What type of data is this?

Example 27

medium
A teacher claims 'students who eat breakfast score higher on tests'. This conclusion came from observing one classroom. Why is this conclusion weak?

Example 28

medium
A dataset lists 'shirt size' as S, M, L, XL. Is this categorical or numerical? Can you order the categories?

Example 29

medium
A column lists ice cream sales (numeric) per day. Another lists 'season' (winter/spring/summer/fall). Which can you compute an average of?

Example 30

medium
A class collects everyone's height in inches but writes some as feet+inches. Why is the dataset hard to analyze before cleaning?

Example 31

medium
A 'satisfaction' question on a 1-5 scale: which is safer to report, the mean or the median?

Example 32

medium
A study finds higher cell-phone use correlates with lower test scores in 7th graders. Does the data show that phones cause lower scores?

Example 33

medium
Sort these into numerical or categorical: (a) ZIP code, (b) number of pets, (c) brand of phone, (d) heart rate (bpm).

Example 34

hard
A dataset has these 'age' entries: 12, 13, 14, 11, 999. Why might 999 be problematic and what should you do?

Example 35

hard
A poll reports 'Average household has 2.4 cars' from a sample of 100 homes in a wealthy suburb. Why is generalizing this to the whole country a mistake?

Example 36

hard
A spreadsheet of student records mixes upper- and lowercase ('Apple', 'apple', 'APPLE') in the favorite-fruit column. What problem does this cause for analysis?

Example 37

challenge
A teacher's gradebook has columns 'student', 'attendance %', 'exam score', 'preferred name'. Classify each variable type.