Statistics Practice

4,298 problems across 78 concepts. Free to try; Family unlocks the full worked solutions.

Center, Spread, and Distributions

Data Variability

Two archery targets both have average hits at the bullseye. But one archer's arrows are scattered all over, while the other's are clustered tightly. Same average, very different consistency. That difference is variability.

Distribution Shape

50 Q

If you make a histogram, what shape emerges? A bell curve? A slope leaning one way? Two peaks? The shape tells you about what's typical and what's unusual in your data.

Empirical Rule

50 Q

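Most data clusters near the center of a bell curve: about 68% of values fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3. The further from the mean, the rarer the value.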
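
As a rough check of how tightly a bell curve clusters, the sketch below uses Python's standard-library `NormalDist` to compute the share of a normal distribution within 1, 2, and 3 standard deviations of the mean. The helper name `share_within` is made up for this illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
```

The three results come out near 0.68, 0.95, and 0.997, which is where the "68-95-99.7" shorthand comes from.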

Interquartile Range (IQR)

50 Q

IQR focuses on where most of the data lives, ignoring the extremes. If regular range is how far the outliers stretched, IQR is how wide the main crowd is. More resistant to outliers than range.
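
To see that resistance in action, here is a small sketch comparing range and IQR on two made-up data sets that differ only in one extreme value. It assumes the default quartile convention of Python's `statistics.quantiles`; textbooks use several slightly different conventions, so hand-computed quartiles may differ a little.

```python
from statistics import quantiles

def iqr(values):
    """Width of the middle 50% of the data: Q3 minus Q1."""
    q1, _q2, q3 = quantiles(values, n=4)
    return q3 - q1

def full_range(values):
    """Distance from the smallest value to the largest."""
    return max(values) - min(values)

# Hypothetical data: identical except that the largest value becomes an outlier.
typical = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000]

print(full_range(typical), iqr(typical))            # 10 6.0
print(full_range(with_outlier), iqr(with_outlier))  # 999 6.0
```

The range jumps from 10 to 999, while the IQR does not move.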

Mean Absolute Deviation (MAD)

76 Q

Find how far each number is from the mean (ignoring +/-), then average those distances. It tells you: on average, how far is a typical value from the center?
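
The two steps above can be sketched in a few lines. The data set (2, 4, 9) is only an example chosen for this illustration.

```python
def mean_absolute_deviation(values):
    """Average distance of each value from the mean."""
    center = sum(values) / len(values)
    distances = [abs(v - center) for v in values]  # ignore +/- by taking absolute values
    return sum(distances) / len(distances)

data = [2, 4, 9]  # mean is 5, distances are 3, 1, 4
print(mean_absolute_deviation(data))  # 8/3, about 2.67
```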

Mean as Fair Share

50 Q

Imagine 3 friends have 2, 4, and 9 candies. If they pool all candies (15 total) and share equally, each gets 5. That's the mean! It's the 'fair share' - what everyone would have if things were perfectly even.

Mean vs Median

50 Q

Imagine a room with 10 people earning \$50,000 each. Mean and median are both \$50,000. Now a billionaire walks in. Mean jumps to \$91 million! But median stays around \$50,000. Mean is a pushover that gets bullied by extremes; median stands firm.
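
A quick way to check the room example is sketched below. It assumes the billionaire's income is exactly \$1,000,000,000, a figure the example does not state.

```python
from statistics import mean, median

salaries = [50_000] * 10                        # ten people earning $50,000 each
with_billionaire = salaries + [1_000_000_000]   # assumed income for the newcomer

print(mean(salaries), median(salaries))          # both 50000
print(round(mean(with_billionaire)))             # 90954545, about $91 million
print(median(with_billionaire))                  # still 50000
```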

Median

50 Q

If you lined up your whole class by height, the median height is the person standing exactly in the middle. It's not affected by whether the tallest kid is 5'5" or 7 feet - the middle person stays the same.

Mode

50 Q

The mode is the most popular value - the one that shows up the most. If 5 kids pick pizza, 3 pick tacos, and 2 pick burgers, pizza is the mode because it's the favorite.

Normal Distribution

50 Q

Heights, test scores, measurement errors - many real phenomena cluster around an average with decreasing frequency toward extremes. The bell curve captures this pattern: most values are 'average,' few are extreme.

Outlier Detection

50 Q

Outliers are data points that don't fit the pattern. A 7-foot student in a class of average heights, or a \$10 million house in a neighborhood of \$300k homes. They may be errors or genuinely unusual.

Percentiles

50 Q

Being in the 90th percentile means you scored better than 90% of people. It's not about your raw score - it's about your position relative to everyone else.

Quartiles

50 Q

If you line up 100 people by height and divide into 4 equal groups, quartiles mark the dividing points. $Q_1$ is where the shortest 25% ends, $Q_2$ is the middle, $Q_3$ is where the tallest 25% begins.

Range

50 Q

Range tells you how spread out your data is from end to end. If the tallest kid is 5 feet and the shortest is 4 feet, the range is 1 foot - that's the 'stretch' of heights.

Skewness

50 Q

A right-skewed distribution has a long tail to the right (a few very large values); left-skewed has a long tail to the left.

Spread vs Center

50 Q

Two pizza delivery services both average 30-minute delivery (same center). But Service A ranges 28-32 minutes, while Service B ranges 10-50 minutes. Same center, wildly different spread. You'd trust A for consistent timing.

Standard Deviation

50 Q

If the mean is 'home base,' standard deviation tells you how far data points typically wander from home. Small SD = data clusters close to the mean (like a tight group of friends). Large SD = data is scattered (friends spread all over town).
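
As an illustration, the sketch below compares two made-up data sets that share the same mean (5) but wander different distances from it. It uses the population standard deviation (`pstdev`); the sample version (`stdev`) divides by n - 1 and gives slightly larger values.

```python
from statistics import mean, pstdev

# Hypothetical data sets with the same "home base".
tight = [4, 5, 5, 5, 6]
scattered = [1, 3, 5, 7, 9]

print(mean(tight), mean(scattered))  # both 5
print(pstdev(tight))                 # about 0.63
print(pstdev(scattered))             # about 2.83
```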

Weighted Average

50 Q

Your final grade: exams count 60%, homework 40%. Each score is multiplied by its weight before adding, because not every assignment counts equally.
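
A minimal sketch of the 60/40 grade calculation follows. The scores (80 on exams, 90 on homework) are invented for the example.

```python
def weighted_average(scores, weights):
    """Multiply each score by its weight, add, and divide by the total weight."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weights[name] for name in weights) / total_weight

weights = {"exams": 0.6, "homework": 0.4}
scores = {"exams": 80, "homework": 90}   # hypothetical scores

final_grade = weighted_average(scores, weights)
plain_average = sum(scores.values()) / len(scores)
print(final_grade, plain_average)
```

The weighted grade is about 84, lower than the plain average of 85, because the weaker exam score carries more weight.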

Z-Score (Standard Score)

50 Q

Z-scores put everything on the same scale. A z-score of +2 means 'two standard deviations above average' - unusually high. A z-score of -1 means 'one SD below average' - somewhat low but normal.
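
The calculation behind a z-score is short enough to sketch directly. The mean of 100 and standard deviation of 15 below are assumed values for illustration.

```python
def z_score(value, mean, sd):
    """How many standard deviations a value sits above (+) or below (-) the mean."""
    return (value - mean) / sd

# Hypothetical test with mean 100 and standard deviation 15.
print(z_score(130, mean=100, sd=15))  # 2.0: two SDs above average
print(z_score(85, mean=100, sd=15))   # -1.0: one SD below average
```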

Data Collection And Displays

Bar Graph

50 Q

Think of bar graphs as a competition where categories show off their numbers by how tall their bars stand. The taller the bar, the bigger the number. You can instantly see the winner without doing any math!

Box Plot

50 Q

A box plot is like an X-ray of your data's skeleton. The box shows where the middle 50% of data lives. The line inside is the median. The whiskers stretch to the extremes. You instantly see the center, spread, and any unusual values.

Categorical Data

50 Q

Categorical data puts things in boxes by type, not by how much. Your favorite color, pet type, or sport are categories - you can't average them, but you can count how many in each group.

Data Collection

50 Q

Imagine you want to know your class's favorite ice cream flavor. You can't just guess - you need to actually ask everyone and write down their answers. That's data collection! It's like being a detective who gathers clues before solving a mystery.

Data Representation

50 Q

Raw data is like puzzle pieces scattered on a table - hard to make sense of. When you organize it into charts, graphs, or tables, the picture becomes clear. A bar chart of ice cream preferences instantly shows which flavor wins, while a list of 100 names wouldn't.

Dot Plot

50 Q

Like a line plot, but dots instead of X's. Each dot is one data point stacked above its value. The height of the stack shows frequency. Great for seeing clusters and gaps.

Frequency Table

50 Q

A frequency table is an organized list that answers 'how many?' for each category. Instead of a messy list of responses, you get a clean summary: Pizza-12, Tacos-8, Burgers-5.

Histogram

50 Q

Unlike bar graphs for categories (red, blue, green), histograms are for numbers grouped into ranges. Test scores 60-70, 70-80, 80-90... The bars touch because the data is continuous - there's no gap between 69.9 and 70.0.

Line Graph

50 Q

Line graphs are like following a hiking trail on a map - they show the journey of a number over time. Going up means increasing, going down means decreasing. The steeper the line, the faster the change.

Line Plot (Dot Plot)

50 Q

Imagine a number line where every time someone picks a number, you stack an X above it. Taller stacks mean more people chose that number. You can quickly see which values are popular.

Misleading Graphs

50 Q

A bar that looks $3\times$ taller might only represent 10% more data if the axis doesn't start at zero. It's like taking a photo from a weird angle to make someone look taller. The data is true, but the picture lies.

Pictograph

50 Q

Instead of boring bars, pictographs use fun pictures to show data. If each smiley face means 2 students, and you see 5 smiley faces, that's 10 students! It makes data feel more real.

Pie Chart

50 Q

A pie chart works best when you want to ask “how much of the whole belongs to each group?” The whole circle stands for 100%, and each slice shows one part of that whole.

Statistical Question

50 Q

'How old is my teacher?' has ONE answer - not statistical. 'How old are teachers at my school?' will have DIFFERENT answers for each teacher - that's statistical! The key: do you expect variation?

Stem-and-Leaf Plot

50 Q

A stem-and-leaf plot is like a sorted list and a graph at the same time. You can see clusters, gaps, and repeated values without losing the exact numbers.

Tally Chart

50 Q

Tally charts are like counting on your fingers, but on paper. Every time something happens, you draw a line. Cross every fifth line to make counting by 5s easy - like bundling sticks.

Probability And Chance

Addition Rule

50 Q

If you want “A or B,” start by adding A and B. Then fix the double-counting by removing the part that belongs to both events.
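
Here is a small sketch of that double-counting fix on one roll of a fair die, using exact fractions. The events "even" and "greater than 3" are chosen only for illustration.

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event on one roll of a fair die."""
    return Fraction(len(event & die), len(die))

even = {2, 4, 6}
high = {4, 5, 6}   # "greater than 3"

# Add the two events, then remove the overlap {4, 6} that was counted twice.
p_or = prob(even) + prob(high) - prob(even & high)
print(p_or)                # 2/3
print(prob(even | high))   # 2/3, counting the union {2, 4, 5, 6} directly
```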

Basic Probability

50 Q

Probability is a way of putting a number on chance. Flipping heads? That's $0.5$ (half the time). Rolling a 6 on a die? That's $\frac{1}{6}$ (one out of six possible outcomes). It's like asking 'if we did this many times, what fraction would this outcome happen?'

Compound Events

50 Q

Simple event: rolling a 6. Compound event: rolling a 6 AND then flipping heads. For 'and,' multiply probabilities (this works when the events are independent, like a die roll and a coin flip). For 'or,' add them (but subtract overlap if any).

Conditional Probability

80 Q

Once you know event B happened, you no longer look at every outcome. You only look at the part of the sample space where B is true, then ask how much of that smaller space also satisfies A.
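
The "smaller sample space" idea can be sketched by counting outcomes on a fair die. The events used here (A: rolling a 6, B: rolling an even number) are examples picked for this illustration.

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}

def conditional(a, b):
    """P(A given B): the share of B's outcomes that also satisfy A."""
    return Fraction(len(a & b), len(b))

six = {6}
even = {2, 4, 6}

print(Fraction(len(six), len(die)))  # 1/6 before knowing anything
print(conditional(six, even))        # 1/3 once we know the roll is even
```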

Expected Value

50 Q

If you played a game forever, expected value is your average result per play. Positive EV = profitable long-term. Negative EV = you'll lose over time. It's the mathematical way to evaluate risky decisions.
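
As a sketch, consider a made-up game: roll a fair die, win \$4 on a 6, lose \$1 on anything else. Expected value weights each payoff by its probability.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Sum of payoff times probability over all outcomes."""
    return sum(payoff * p for payoff, p in outcomes)

# Hypothetical game: (payoff in dollars, probability).
game = [(4, Fraction(1, 6)), (-1, Fraction(5, 6))]

ev = expected_value(game)
print(ev)  # -1/6: on average you lose about 17 cents per play
```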

Experimental Probability

50 Q

You flip a coin 100 times and get 53 heads. Your experimental probability is $\frac{53}{100} = 0.53$. It's based on what DID happen, not what should happen theoretically.

Independent Events

80 Q

Independence means “no update.” If learning B happened leaves the chance of A exactly the same, then the events are independent.

Law of Large Numbers

50 Q

Flip a coin 10 times: maybe 7 heads (70%). Flip 100 times: closer to 50%. Flip 10,000 times: very close to 50%. More trials = more reliable averages. Short-run luck evens out.
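
This effect is easy to try with a simulation. The sketch below flips a simulated fair coin using Python's `random` module; the seed is arbitrary and only makes the run repeatable, and the exact fractions will differ with other seeds.

```python
import random

def heads_fraction(n_flips, rng):
    """Flip a simulated fair coin n_flips times and return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

rng = random.Random(0)  # arbitrary seed for repeatability
for n in (10, 100, 10_000, 100_000):
    print(n, heads_fraction(n, rng))
```

Small runs can land far from 0.5, while the largest runs should sit very close to it.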

Multiplication Rule

50 Q

For an “and” problem, move through the events in sequence. Take the chance of the first step, then update for the second step based on what is already known.
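
A classic instance, used here purely as an illustration, is drawing two aces in a row from a standard 52-card deck without putting the first card back. The second probability is updated for what the first draw removed.

```python
from fractions import Fraction

p_first_ace = Fraction(4, 52)    # 4 aces among 52 cards
p_second_ace = Fraction(3, 51)   # 3 aces left among 51 cards, given the first was an ace

p_both = p_first_ace * p_second_ace
print(p_both)  # 1/221
```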

Sample Space

50 Q

Before calculating probability, list every possible outcome. For a die: $\{1, 2, 3, 4, 5, 6\}$. For two coins: $\{HH, HT, TH, TT\}$. That's your sample space - the complete menu of what could happen.
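
Listing outcomes by hand gets error-prone as problems grow, so here is a sketch that builds the two-coin sample space with `itertools.product` and then counts outcomes for one event. The event "at least one head" is an example added for illustration.

```python
from fractions import Fraction
from itertools import product

# Every ordered outcome of flipping two coins.
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]
print(sample_space)  # ['HH', 'HT', 'TH', 'TT']

at_least_one_head = [outcome for outcome in sample_space if "H" in outcome]
p = Fraction(len(at_least_one_head), len(sample_space))
print(p)  # 3/4
```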

Statistical Simulation

50 Q

Can't calculate the probability mathematically? Simulate it! Run the scenario thousands of times with random numbers and see what fraction of outcomes match your event. It's like conducting experiments without real resources.

Theoretical Probability

50 Q

For a fair coin, you KNOW heads is $\frac{1}{2}$ without flipping. You calculate based on logic: 1 favorable outcome (heads) out of 2 possible outcomes. That's theoretical - it's what SHOULD happen.

Tree Diagram

50 Q

A tree diagram prevents you from losing cases when a probability problem unfolds in stages. Instead of guessing the outcomes, you build them step by step.

Relationships And Regression

Conditional Relative Frequency

50 Q

A two-way table becomes much more informative once you stop reading raw counts and start reading percentages within the relevant group.

Correlation

50 Q

When one thing goes up and another tends to go up with it (like study time and test scores), that's positive correlation. When one goes up and the other goes down (like TV time and exercise), that's negative correlation. They 'move together' in some pattern.

Correlation Coefficient

50 Q

$r = 1$ means a perfect positive line, $r = -1$ means a perfect negative line, $r = 0$ means no linear pattern.
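
The three landmark values can be reproduced with a hand-rolled Pearson correlation. This is a minimal sketch on made-up data; the function name `pearson_r` is chosen for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4]
print(pearson_r(xs, [2, 4, 6, 8]))    # perfect positive line
print(pearson_r(xs, [8, 6, 4, 2]))    # perfect negative line
print(pearson_r(xs, [1, -1, -1, 1]))  # U-shaped pattern: no linear trend
```

The last case shows that $r = 0$ rules out a linear pattern only; the points still follow a clear curve.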

Correlation vs Causation

50 Q

Ice cream sales and drowning deaths both increase in summer. Are ice creams deadly? No! A third factor (hot weather) causes both. This is why 'correlation $\neq$ causation' - just because things happen together doesn't mean one causes the other.

Line of Best Fit

50 Q

If you stretched a rubber band through a scatter plot to be as close to all points as possible, that's the line of best fit. It captures the overall trend.

Linear Regression

50 Q

Given scattered points, draw the 'best' line through them. 'Best' means the line that's closest to all points on average. This line lets you predict Y from X.
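
One common way to make "closest" precise is least squares, which minimizes the sum of squared vertical distances from the points to the line. The sketch below assumes that definition and uses made-up points.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical scatter of five points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
prediction = intercept + slope * 6   # predict Y at X = 6
print(slope, intercept, prediction)
```

For these points the fitted line is roughly y = 2.2 + 0.6x, so the prediction at x = 6 is about 5.8.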

R-Squared (Coefficient of Determination)

76 Q

$R^2 = 0.80$ means the model explains 80% of the variation in the $Y$ values. The other 20% is unexplained variation. Higher $R^2$ = the model fits the data more closely.
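
A sketch of the calculation for a least-squares line on made-up data: $R^2$ is one minus the ratio of leftover (residual) variation to total variation. The data and function name are assumptions for this example.

```python
def r_squared(xs, ys):
    """R^2 of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * x for x in xs]
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # unexplained
    ss_total = sum((y - mean_y) ** 2 for y in ys)                   # all variation in Y
    return 1 - ss_residual / ss_total

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(r_squared(xs, ys))  # about 0.6: the line accounts for 60% of the variation in Y
```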

Relative Frequency

50 Q

Instead of saying '15 students picked pizza,' you say '15 out of 50' or '30%.' Relative frequency compares to the whole, making different-sized groups comparable.

Residuals

76 Q

If your model predicts 80 but the actual value is 85, the residual is +5. Residuals are 'leftovers' - what the model couldn't explain. Patterns in residuals reveal model problems.

Scatter Plot

50 Q

Each dot is a person (or item) plotted by TWO measurements - like height on one axis and weight on the other. Patterns in the dots reveal relationships: do taller people weigh more? The scatter tells the story.

Two-Way Tables

76 Q

A two-way table is like a spreadsheet that shows how two questions relate. 'Do you like pizza?' and 'Are you a kid or adult?' becomes a $2 \times 2$ grid showing how many kid pizza-lovers, adult pizza-lovers, etc.

Sampling Design And Inference

Blinding

50 Q

If people know who got which treatment, they may behave differently, report differently, or evaluate differently. Blinding reduces that extra noise and bias.

Central Limit Theorem

76 Q

This is statistics' magic trick: almost no matter how weird your population looks, if you take big enough samples and average them, those averages will form an approximate bell curve. This is why normal distribution methods work so often.
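
A simulation makes this concrete. The sketch below draws samples from a strongly right-skewed population (an exponential distribution with mean 1, chosen as an assumption for this example) and looks at the sample means. The sample size, number of samples, and seed are all arbitrary.

```python
import random
from statistics import mean, stdev

def sample_means(sample_size, n_samples, rng):
    """Means of many random samples drawn from a skewed (exponential) population."""
    return [
        mean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

rng = random.Random(0)  # arbitrary seed for repeatability
means = sample_means(sample_size=50, n_samples=2000, rng=rng)

print(mean(means))   # close to the population mean, 1
print(stdev(means))  # close to 1 / sqrt(50), about 0.14
```

Plotting a histogram of `means` would show a roughly symmetric bell shape, even though the individual draws are heavily skewed.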

Confidence Interval

76 Q

Instead of saying 'the average is 50,' you say 'I'm 95% confident the average is between 47 and 53.' The interval acknowledges uncertainty from sampling.

Confounding Variables

50 Q

Ice cream sales and drowning deaths correlate. Confounding variable: hot weather. It causes both! Without recognizing confounders, you'd wrongly blame ice cream for drowning.

Control Group

50 Q

You cannot tell whether a treatment had an effect unless you know what would have happened without it. The control group gives you that comparison point.

Experimental Design

76 Q

Want to know if a new fertilizer helps plants grow? You can't just use it on some plants and see if they grow - maybe they would've grown anyway! You need similar plants: give fertilizer to some (treatment) but not others (control), and keep everything else the same.

Hypothesis Testing

76 Q

Hypothesis testing is like a courtroom trial for data. You start by assuming innocence (null hypothesis: nothing special is happening). Then you look at the evidence (data). If the evidence is strong enough to be very unlikely under the assumption of innocence, you reject it and conclude something real is happening.

Margin of Error

76 Q

When a poll says '52% $\pm$ 3%,' that 3% is the margin of error. It means the true value is probably within 3 percentage points of 52%, so between 49% and 55%.
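
Where does a number like 3% come from? A common approximation for a poll proportion at 95% confidence is $1.96\sqrt{\hat{p}(1-\hat{p})/n}$. The sketch below applies it assuming a simple random sample of 1,000 people; the poll's real sample size is not given here.

```python
from math import sqrt

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion (normal approximation)."""
    return z * sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.52   # the poll result
n = 1000       # assumed sample size, not stated in the example

moe = margin_of_error(p_hat, n)
print(f"{moe:.3f}")                               # about 0.031, i.e. roughly 3 points
print(f"{p_hat - moe:.3f} to {p_hat + moe:.3f}")  # roughly 0.489 to 0.551
```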

Observational vs Experimental Studies

76 Q

Observational: Compare smokers to non-smokers (you didn't assign smoking). Experimental: Randomly assign people to take a drug or placebo (you controlled the treatment). Only well-designed experiments can establish causation.

P-Value

76 Q

P-value answers: 'If nothing special is really happening, how surprising is my data?' A tiny p-value (like 0.01) means your results would be very rare if the null were true - so maybe the null is wrong. A large p-value means your results aren't surprising under the null.
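
For a concrete case, suppose the null hypothesis is "this coin is fair" and you see 60 heads in 100 flips. The sketch below computes a one-sided p-value exactly from the binomial distribution; the scenario and function name are invented for illustration.

```python
from math import comb

def p_value_at_least(heads, flips):
    """Chance of seeing `heads` or more heads in `flips` flips of a fair coin."""
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favorable / 2 ** flips

print(p_value_at_least(60, 100))  # small: 60+ heads is rare for a fair coin
print(p_value_at_least(53, 100))  # large: 53+ heads is unsurprising
```

The first value is under 0.05, so 60 heads would count as surprising under the null; the second is far above it.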

Placebo Effect

50 Q

Expectations can change behavior and reported outcomes. That means a study can look successful even when the treatment itself is not the true cause.

Population vs Sample

76 Q

You want to know the average height of ALL teenagers in your country (population), but you can't measure everyone. So you measure 1000 teenagers (sample) and use that to estimate the whole.

Random Assignment

50 Q

Random sampling helps you generalize to a population. Random assignment helps you compare treatments fairly inside an experiment.

Random Sampling

50 Q

Drawing names from a hat where all names are equally likely to be picked. No favoritism, no convenience, just pure chance. This is how we help the sample represent the whole population, not just the easy-to-reach people.

Sampling Bias

50 Q

Asking only your friends about favorite music doesn't tell you what the whole school thinks - your friends probably have similar tastes! That's bias. A good sample is like a well-shuffled deck: everyone has an equal chance of being picked.

Sampling Distribution

76 Q

If you took 1000 different random samples and calculated the mean of each, those 1000 means would form a distribution. That's the sampling distribution - it shows how sample statistics vary.

Sampling Variability

50 Q

If you take two honest random samples, they can still disagree a little. That disagreement is not necessarily bias or a mistake; it is part of how sampling works.

Standard Error

50 Q

Standard error tells you how much your sample estimate might be 'off' from the true value. Larger samples have smaller SE because they're more precise - like asking 1000 people vs 10.
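
For a sample mean, the usual formula is the standard deviation divided by the square root of the sample size. The sketch below assumes a standard deviation of 10 purely for illustration.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / sqrt(n)

sd = 10  # assumed standard deviation of individual answers
print(standard_error(sd, 10))    # about 3.16 when asking 10 people
print(standard_error(sd, 1000))  # about 0.32 when asking 1000 people
```

A sample 100 times larger shrinks the standard error by a factor of 10, not 100, because of the square root.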

Statistical Significance

50 Q

Statistical significance is a decision rule: before looking at data, you set a threshold (usually 5%). If your p-value is below this threshold, you declare the result 'significant' - meaning unlikely to be just random noise. It's not about importance; it's about confidence that something real is happening.

Every problem ships with a self-check answer. The full worked solution, plus our 5-part recognition coaching (Setup ▸ Key insight ▸ Why it works ▸ Common pitfall ▸ Connection), is part of Family.