Statistics Practice
4,298 problems across 78 concepts. Free to try; Family unlocks the full worked solutions.
Center Spread And Distributions
Data Variability
50 QTwo archery targets both have average hits at the bullseye. But one archer's arrows are scattered all over, while the other's are clustered tightly. Same average, very different consistency. That difference is variability.
Distribution Shape
50 QIf you make a histogram, what shape emerges? A bell curve? A slope leaning one way? Two peaks? The shape tells you about what's typical and what's unusual in your data.
Empirical Rule
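50 QMost data clusters near the center of a bell curve: about 68% of values fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3. The further from the mean, the rarer the value.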
Interquartile Range (IQR)
50 QIQR focuses on where most of the data lives, ignoring the extremes. If regular range is how far the outliers stretched, IQR is how wide the main crowd is. More resistant to outliers than range.
Mean Absolute Deviation (MAD)
76 QFind how far each number is from the mean (ignoring +/-), then average those distances. It tells you: on average, how far is a typical value from the center?
Mean as Fair Share
50 QImagine 3 friends have 2, 4, and 9 candies. If they pool all candies (15 total) and share equally, each gets 5. That's the mean! It's the 'fair share' - what everyone would have if things were perfectly even.
Mean vs Median
50 QImagine a room with 10 people earning \$50,000 each. Mean and median are both \$50,000. Now a billionaire walks in. Mean jumps to \$91 million! But median stays around \$50,000. Mean is a pushover that gets bullied by extremes; median stands firm.
Median
50 QIf you lined up your whole class by height, the median height is the person standing exactly in the middle. It's not affected by whether the tallest kid is 5'5" or 7 feet - the middle person stays the same.
Mode
50 QThe mode is the most popular value - the one that shows up the most. If 5 kids pick pizza, 3 pick tacos, and 2 pick burgers, pizza is the mode because it's the favorite.
Normal Distribution
50 QHeights, test scores, measurement errors - many real phenomena cluster around an average with decreasing frequency toward extremes. The bell curve captures this pattern: most values are 'average,' few are extreme.
Outlier Detection
50 QOutliers are data points that don't fit the pattern. A 7-foot student in a class of average heights, or a \$10 million house in a neighborhood of \$300k homes. They may be errors or genuinely unusual.
Percentiles
50 QBeing in the 90th percentile means you scored better than 90% of people. It's not about your raw score - it's about your position relative to everyone else.
Quartiles
50 QIf you line up 100 people by height and divide into 4 equal groups, quartiles mark the dividing points. $Q_1$ is where the shortest 25% ends, $Q_2$ is the middle, $Q_3$ is where the tallest 25% begins.
Range
50 QRange tells you how spread out your data is from end to end. If the tallest kid is 5 feet and the shortest is 4 feet, the range is 1 foot - that's the 'stretch' of heights.
Skewness
50 QA right-skewed distribution has a long tail to the right (a few very large values); left-skewed has a long tail to the left.
Spread vs Center
50 QTwo pizza delivery services both average 30-minute delivery (same center). But Service A ranges 28-32 minutes, while Service B ranges 10-50 minutes. Same center, wildly different spread. You'd trust A for consistent timing.
Standard Deviation
50 QIf the mean is 'home base,' standard deviation tells you how far data points typically wander from home. Small SD = data clusters close to the mean (like a tight group of friends). Large SD = data is scattered (friends spread all over town).
Weighted Average
50 QYour final grade: exams count 60%, homework 40% - not every assignment counts equally.
Z-Score (Standard Score)
50 QZ-scores put everything on the same scale. A z-score of +2 means 'two standard deviations above average' - unusually high. A z-score of -1 means 'one SD below average' - somewhat low but normal.
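A few of the ideas in this group can be checked with a short Python sketch. It reuses the billionaire and candy examples from the descriptions above; the exam and homework scores (80 and 90) and the z-score inputs (value 85, mean 75, SD 5) are made-up numbers for illustration only.

```python
from statistics import mean, median

# Mean vs median: ten people earning $50,000, then a billionaire walks in.
incomes = [50_000] * 10 + [1_000_000_000]
mean_income = mean(incomes)      # dragged far upward by one extreme value
median_income = median(incomes)  # still the middle person's income

# Mean as fair share, then mean absolute deviation (MAD), for 2, 4, and 9 candies.
candies = [2, 4, 9]
fair_share = mean(candies)
mad = mean([abs(x - fair_share) for x in candies])

# Weighted average: exams count 60%, homework 40% (scores are hypothetical).
exam_score, homework_score = 80, 90
final_grade = 0.6 * exam_score + 0.4 * homework_score

# Z-score: how many standard deviations a value sits from the mean (hypothetical inputs).
def z_score(value, mu, sd):
    return (value - mu) / sd

z = z_score(85, 75, 5)

print(f"mean income:   {mean_income:,.0f}")
print(f"median income: {median_income:,.0f}")
print(f"fair share: {fair_share}, MAD: {mad:.2f}")
print(f"final grade: {final_grade:.1f}, z-score: {z:+.1f}")
```

The mean lands near \$91 million while the median stays at \$50,000, which is the contrast the Mean vs Median entry describes.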
Data Collection And Displays
Bar Graph
50 QThink of bar graphs as a competition where categories show off their numbers by how tall their bars stand. The taller the bar, the bigger the number. You can instantly see the winner without doing any math!
Box Plot
50 QA box plot is like an X-ray of your data's skeleton. The box shows where the middle 50% of data lives. The line inside is the median. The whiskers stretch to the extremes. You instantly see the center, spread, and any unusual values.
Categorical Data
50 QCategorical data puts things in boxes by type, not by how much. Your favorite color, pet type, or sport are categories - you can't average them, but you can count how many in each group.
Data Collection
50 QImagine you want to know your class's favorite ice cream flavor. You can't just guess - you need to actually ask everyone and write down their answers. That's data collection! It's like being a detective who gathers clues before solving a mystery.
Data Representation
50 QRaw data is like puzzle pieces scattered on a table - hard to make sense of. When you organize it into charts, graphs, or tables, the picture becomes clear. A bar chart of ice cream preferences instantly shows which flavor wins, while a list of 100 names wouldn't.
Dot Plot
50 QLike a line plot, but dots instead of X's. Each dot is one data point stacked above its value. The height of the stack shows frequency. Great for seeing clusters and gaps.
Frequency Table
50 QA frequency table is an organized list that answers 'how many?' for each category. Instead of a messy list of responses, you get a clean summary: Pizza-12, Tacos-8, Burgers-5.
Histogram
50 QUnlike bar graphs for categories (red, blue, green), histograms are for numbers grouped into ranges. Test scores 60-70, 70-80, 80-90... The bars touch because the data is continuous - there's no gap between 69.9 and 70.0.
Line Graph
50 QLine graphs are like following a hiking trail on a map - they show the journey of a number over time. Going up means increasing, going down means decreasing. The steeper the line, the faster the change.
Line Plot (Dot Plot)
50 QImagine a number line where every time someone picks a number, you stack an X above it. Taller stacks mean more people chose that number. You can quickly see which values are popular.
Misleading Graphs
50 QA bar that looks $3\times$ taller might only represent 10% more data if the axis doesn't start at zero. It's like taking a photo from a weird angle to make someone look taller. The data is true, but the picture lies.
Pictograph
50 QInstead of boring bars, pictographs use fun pictures to show data. If each smiley face means 2 students, and you see 5 smiley faces, that's 10 students! It makes data feel more real.
Pie Chart
50 QA pie chart works best when you want to ask "how much of the whole belongs to each group?" The whole circle stands for 100%, and each slice shows one part of that whole.
Statistical Question
50 Q'How old is my teacher?' has ONE answer - not statistical. 'How old are teachers at my school?' will have DIFFERENT answers for each teacher - that's statistical! The key: do you expect variation?
Stem-and-Leaf Plot
50 QA stem-and-leaf plot is like a sorted list and a graph at the same time. You can see clusters, gaps, and repeated values without losing the exact numbers.
Tally Chart
50 QTally charts are like counting on your fingers, but on paper. Every time something happens, you draw a line. Cross every fifth line to make counting by 5s easy - like bundling sticks.
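As a rough illustration of the frequency table and tally chart entries above, here is a minimal Python sketch. It uses the Pizza-12, Tacos-8, Burgers-5 counts from the frequency table description; the `tally` helper and its `||||/` text rendering of a crossed bundle of five are assumptions made for this example.

```python
from collections import Counter

# Raw survey responses, matching the counts in the frequency table example.
responses = ["Pizza"] * 12 + ["Tacos"] * 8 + ["Burgers"] * 5

# A frequency table answers "how many?" for each category.
frequency = Counter(responses)

def tally(count):
    """Render a count as text tally marks, bundled in groups of five."""
    groups, rest = divmod(count, 5)
    parts = ["||||/"] * groups
    if rest:
        parts.append("|" * rest)
    return " ".join(parts)

# Print categories from most to least common.
for food, count in frequency.most_common():
    print(f"{food:<8}{count:>3}  {tally(count)}")
```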
Probability And Chance
Addition Rule
50 QIf you want "A or B," start by adding A and B. Then fix the double-counting by removing the part that belongs to both events.
Basic Probability
50 QProbability is a way of putting a number on chance. Flipping heads? That's $0.5$ (half the time). Rolling a 6 on a die? That's $\frac{1}{6}$ (one out of six possible outcomes). It's like asking 'if we did this many times, what fraction would this outcome happen?'
Compound Events
50 QSimple event: rolling a 6. Compound event: rolling a 6 AND then flipping heads. For 'and,' multiply probabilities (when the events are independent). For 'or,' add them (but subtract overlap if any).
Conditional Probability
80 QOnce you know event B happened, you no longer look at every outcome. You only look at the part of the sample space where B is true, then ask how much of that smaller space also satisfies A.
Expected Value
50 QIf you played a game forever, expected value is your average result per play. Positive EV = profitable long-term. Negative EV = you'll lose over time. It's the mathematical way to evaluate risky decisions.
Experimental Probability
50 QYou flip a coin 100 times and get 53 heads. Your experimental probability is $\frac{53}{100} = 0.53$. It's based on what DID happen, not what should happen theoretically.
Independent Events
Law of Large Numbers
50 QFlip a coin 10 times: maybe 7 heads (70%). Flip 100 times: closer to 50%. Flip 10,000 times: very close to 50%. More trials = more reliable averages. Short-run luck evens out.
Multiplication Rule
50 QFor an "and" problem, move through the events in sequence. Take the chance of the first step, then update for the second step based on what is already known.
Sample Space
50 QBefore calculating probability, list every possible outcome. For a die: $\{1, 2, 3, 4, 5, 6\}$. For two coins: $\{HH, HT, TH, TT\}$. That's your sample space - the complete menu of what could happen.
Statistical Simulation
50 QCan't calculate the probability mathematically? Simulate it! Run the scenario thousands of times with random numbers and see what fraction of outcomes match your event. It's like conducting experiments without real resources.
Theoretical Probability
50 QFor a fair coin, you KNOW heads is $\frac{1}{2}$ without flipping. You calculate based on logic: 1 favorable outcome (heads) out of 2 possible outcomes. That's theoretical - it's what SHOULD happen.
Tree Diagram
50 QA tree diagram prevents you from losing cases when a probability problem unfolds in stages. Instead of guessing the outcomes, you build them step by step.
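The probability rules in this group can be tried out by listing a sample space and counting. The sketch below is one way to do that in Python; the helper names `prob` and `heads_fraction`, and the specific events chosen (even rolls, rolls above 4), are assumptions for illustration rather than part of any problem set.

```python
from fractions import Fraction
from itertools import product
import random

die = range(1, 7)  # sample space for one fair die

def prob(event, space):
    """Theoretical probability: favorable outcomes over equally likely outcomes."""
    outcomes = list(space)
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

# Addition rule: P(even or >4) = P(even) + P(>4) - P(even and >4).
p_even = prob(lambda d: d % 2 == 0, die)
p_big = prob(lambda d: d > 4, die)
p_both = prob(lambda d: d % 2 == 0 and d > 4, die)
p_either = p_even + p_big - p_both

# Compound event: roll a 6 AND flip heads (the two steps are independent).
die_and_coin = list(product(die, "HT"))
p_six_and_heads = prob(lambda o: o == (6, "H"), die_and_coin)

# Conditional probability: shrink the sample space to the outcomes where B is true.
evens = [d for d in die if d % 2 == 0]
p_six_given_even = prob(lambda d: d == 6, evens)

def heads_fraction(n, rng):
    """Experimental probability of heads from n simulated fair coin flips."""
    return sum(rng.random() < 0.5 for _ in range(n)) / n

# Law of large numbers: the fraction of heads tends to settle near 0.5 as n grows.
rng = random.Random(0)
for n in (10, 100, 10_000):
    print(n, heads_fraction(n, rng))

print(p_either, p_six_and_heads, p_six_given_even)
```

Using `Fraction` keeps the theoretical answers exact, so they can be compared directly with hand calculations.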
Relationships And Regression
Conditional Relative Frequency
50 QA two-way table becomes much more informative once you stop reading raw counts and start reading percentages within the relevant group.
Correlation
50 QWhen one thing goes up and another tends to go up with it (like study time and test scores), that's positive correlation. When one goes up and the other goes down (like TV time and exercise), that's negative correlation. They 'move together' in some pattern.
Correlation Coefficient
50 Qr = 1 means a perfect positive line, r = -1 means a perfect negative line, r = 0 means no linear pattern.
Correlation vs Causation
50 QIce cream sales and drowning deaths both increase in summer. Are ice creams deadly? No! A third factor (hot weather) causes both. This is why 'correlation $\neq$ causation' - just because things happen together doesn't mean one causes the other.
Line of Best Fit
50 QIf you stretched a rubber band through a scatter plot to be as close to all points as possible, that's the line of best fit. It captures the overall trend.
Linear Regression
50 QGiven scattered points, draw the 'best' line through them. 'Best' means the line that's closest to all points on average. This line lets you predict Y from X.
R-Squared (Coefficient of Determination)
76 Q$R^2 = 0.80$ means the model explains 80% of why $Y$ values differ. The other 20% is unexplained variation. Higher $R^2$ = better predictions.
Relative Frequency
50 QInstead of saying '15 students picked pizza,' you say '15 out of 50' or '30%.' Relative frequency compares to the whole, making different-sized groups comparable.
Residuals
76 QIf your model predicts 80 but the actual value is 85, the residual is +5. Residuals are 'leftovers' - what the model couldn't explain. Patterns in residuals reveal model problems.
Scatter Plot
50 QEach dot is a person (or item) plotted by TWO measurements - like height on one axis and weight on the other. Patterns in the dots reveal relationships: do taller people weigh more? The scatter tells the story.
Two-Way Tables
76 QA two-way table is like a spreadsheet that shows how two questions relate. 'Do you like pizza?' and 'Are you a kid or adult?' becomes a $2 \times 2$ grid showing how many kid pizza-lovers, adult pizza-lovers, etc.
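To make the line of best fit, correlation coefficient, R-squared, and residual entries above concrete, here is a minimal least-squares sketch in plain Python. The five data points and the function name `fit_line` are invented for illustration. It assumes simple linear regression with one predictor, where R-squared equals the square of r.

```python
from math import sqrt

def fit_line(xs, ys):
    """Least-squares slope and intercept, plus the correlation coefficient r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical scatter plot data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept, r = fit_line(xs, ys)
r_squared = r ** 2  # share of the variation in y explained by the line

# Residual = actual value minus predicted value.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(f"y = {intercept:.2f} + {slope:.2f}x, r = {r:.3f}, R^2 = {r_squared:.2f}")
print([round(e, 2) for e in residuals])
```

The residuals of a least-squares line with an intercept sum to zero, which is a quick sanity check on the fit.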
Sampling Design And Inference
Blinding
50 QIf people know who got which treatment, they may behave differently, report differently, or evaluate differently. Blinding reduces that extra noise and bias.
Central Limit Theorem
76 QThis is statistics' magic trick: no matter how weird your population looks, if you take big enough samples and average them, those averages will form a bell curve. This is why normal distribution methods work so often.
Confidence Interval
76 QInstead of saying 'the average is 50,' you say 'I'm 95% confident the average is between 47 and 53.' The interval acknowledges uncertainty from sampling.
Confounding Variables
50 QIce cream sales and drowning deaths correlate. Confounding variable: hot weather. It causes both! Without recognizing confounders, you'd wrongly blame ice cream for drowning.
Control Group
50 QYou cannot tell whether a treatment had an effect unless you know what would have happened without it. The control group gives you that comparison point.
Experimental Design
76 QWant to know if a new fertilizer helps plants grow? You can't just use it on some plants and see if they grow - maybe they would've grown anyway! You need identical plants, give fertilizer to some (treatment) but not others (control), and keep everything else the same.
Hypothesis Testing
76 QHypothesis testing is like a courtroom trial for data. You start by assuming innocence (null hypothesis: nothing special is happening). Then you look at the evidence (data). If the evidence is strong enough to be very unlikely under the assumption of innocence, you reject it and conclude something real is happening.
Margin of Error
76 QWhen a poll says '52% $\pm$ 3%,' that 3% is the margin of error. It means the true value is probably within 3 percentage points of 52%, so between 49% and 55%.
Observational vs Experimental Studies
76 QObservational: Compare smokers to non-smokers (you didn't assign smoking). Experimental: Randomly assign people to take a drug or placebo (you controlled the treatment). Only a well-designed experiment can establish causation.
P-Value
76 QP-value answers: 'If nothing special is really happening, how surprising is my data?' A tiny p-value (like 0.01) means your results would be very rare if the null were true - so maybe the null is wrong. A large p-value means your results aren't surprising under the null.
Placebo Effect
50 QExpectations can change behavior and reported outcomes. That means a study can look successful even when the treatment itself is not the true cause.
Population vs Sample
76 QYou want to know the average height of ALL teenagers in your country (population), but you can't measure everyone. So you measure 1000 teenagers (sample) and use that to estimate the whole.
Random Assignment
50 QRandom sampling helps you generalize to a population. Random assignment helps you compare treatments fairly inside an experiment.
Random Sampling
50 QDrawing names from a hat where all names are equally likely to be picked. No favoritism, no convenience, just pure chance. This is how we ensure the sample represents the whole population, not just the easy-to-reach people.
Sampling Bias
50 QAsking only your friends about favorite music doesn't tell you what the whole school thinks - your friends probably have similar tastes! That's bias. A good sample is like a well-shuffled deck: everyone has an equal chance of being picked.
Sampling Distribution
76 QIf you took 1000 different random samples and calculated the mean of each, those 1000 means would form a distribution. That's the sampling distribution - it shows how sample statistics vary.
Sampling Variability
50 QIf you take two honest random samples, they can still disagree a little. That disagreement is not necessarily bias or a mistake; it is part of how sampling works.
Standard Error
50 QStandard error tells you how much your sample estimate might be 'off' from the true value. Larger samples have smaller SE because they're more precise - like asking 1000 people vs 10.
Statistical Significance
50 QStatistical significance is a decision rule: before looking at data, you set a threshold (usually 5%). If your p-value is below this threshold, you declare the result 'significant' - meaning unlikely to be just random noise. It's not about importance; it's about confidence that something real is happening.
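The standard error, margin of error, and confidence interval entries above fit together in a few lines of arithmetic. The sketch below assumes the usual normal approximation with a 1.96 multiplier for roughly 95% confidence. The poll size of 1,067 respondents and the sample SD of 10 are assumptions chosen for illustration; the 52% figure comes from the margin of error example.

```python
from math import sqrt

Z_95 = 1.96  # normal-approximation multiplier for about 95% confidence

def standard_error_of_mean(sample_sd, n):
    """Typical distance between a sample mean and the true mean."""
    return sample_sd / sqrt(n)

def proportion_margin_of_error(p_hat, n, z=Z_95):
    """Margin of error for a sample proportion (normal approximation)."""
    return z * sqrt(p_hat * (1 - p_hat) / n)

def confidence_interval(estimate, margin):
    """Interval of plausible values: estimate plus or minus the margin."""
    return estimate - margin, estimate + margin

# A poll reporting 52% with an assumed sample of 1,067 people.
moe = proportion_margin_of_error(0.52, 1067)
low, high = confidence_interval(0.52, moe)

# Larger samples give smaller standard errors: asking 1000 people vs 10.
se_small = standard_error_of_mean(10, 10)
se_large = standard_error_of_mean(10, 1000)

print(f"margin of error: {moe:.3f}, interval: {low:.3f} to {high:.3f}")
print(f"SE with n=10: {se_small:.2f}, SE with n=1000: {se_large:.2f}")
```

With these inputs the margin comes out close to 3 percentage points, giving an interval of roughly 49% to 55%.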
Every problem ships with a self-check answer. The full worked solution, plus our 5-part recognition coaching (Setup ▸ Key insight ▸ Why it works ▸ Common pitfall ▸ Connection), is part of Family.