Statistics Explorer

Search and explore 65 statistics concepts

Browse 65 statistics concepts covering data collection, distributions, probability, inference, and hypothesis testing. Each concept includes a plain-language definition, an intuition for when and why the idea is useful, common pitfalls, and connections to prerequisite and follow-on topics - so you can build statistical reasoning step by step.

Data Collection

Imagine you want to know your class's favorite ice cream flavor. You can't just guess - you need to actually ask everyone and write down their answers. That's data collection! It's like being a detective who gathers clues before solving a mystery.

Data Representation

Raw data is like puzzle pieces scattered on a table - hard to make sense of. When you organize it into charts, graphs, or tables, the picture becomes clear. A bar chart of ice cream preferences instantly shows which flavor wins, while a list of 100 names wouldn't.

Bar Graph

Think of bar graphs as a competition where categories show off their numbers by how tall their bars stand. The taller the bar, the bigger the number. You can instantly see the winner without doing any math!

Line Graph

Line graphs are like following a hiking trail on a map - they show the journey of a number over time. Going up means increasing, going down means decreasing. The steeper the line, the faster the change.

Mean as Fair Share

Imagine 3 friends have 2, 4, and 9 candies. If they pool all candies (15 total) and share equally, each gets 5. That's the mean! It's the 'fair share' - what everyone would have if things were perfectly even.
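The pooling-and-sharing idea can be sketched directly in Python with the standard library (using the candy counts from the example above):

```python
from statistics import mean

candies = [2, 4, 9]                 # three friends' candies
total = sum(candies)                # pool everything: 15
fair_share = total / len(candies)   # share equally

print(fair_share)       # 5.0 - the 'fair share'
print(mean(candies))    # 5, the same thing via the stdlib
```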

Data Variability

Two archery targets both have average hits at the bullseye. But one archer's arrows are scattered all over, while the other's are clustered tightly. Same average, very different consistency. That difference is variability.

Pictograph

Instead of boring bars, pictographs use fun pictures to show data. If each smiley face means 2 students, and you see 5 smiley faces, that's 10 students! It makes data feel more real.

Line Plot (Dot Plot)

Imagine a number line where every time someone picks a number, you stack an X above it. Taller stacks mean more people chose that number. You can quickly see which values are popular.

Mode

The mode is the most popular value - the one that shows up the most. If 5 kids pick pizza, 3 pick tacos, and 2 pick burgers, pizza is the mode because it's the favorite.

Range

Range tells you how spread out your data is from end to end. If the tallest kid is 5 feet and the shortest is 4 feet, the range is 1 foot - that's the 'stretch' of heights.

Tally Chart

Tally charts are like counting on your fingers, but on paper. Every time something happens, you draw a line. Cross every fifth line to make counting by 5s easy - like bundling sticks.

Frequency Table

A frequency table is an organized list that answers 'how many?' for each category. Instead of a messy list of responses, you get a clean summary: Pizza-12, Tacos-8, Burgers-5.

Categorical Data

Categorical data puts things in boxes by type, not by how much. Your favorite color, pet type, or sport are categories - you can't average them, but you can count how many in each group.

Statistical Question

'How old is my teacher?' has ONE answer - not statistical. 'How old are teachers at my school?' will have DIFFERENT answers for each teacher - that's statistical! The key: do you expect variation?

Median

If you lined up your whole class by height, the median height is the person standing exactly in the middle. It's not affected by whether the tallest kid is 5'5" or 7 feet - the middle person stays the same.

Mean vs Median

Imagine a room with 10 people earning \$50,000 each. Mean and median are both \$50,000. Now a billionaire walks in. Mean jumps to \$91 million! But median stays around \$50,000. Mean is a pushover that gets bullied by extremes; median stands firm.
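A quick sketch of the salary example, assuming the billionaire earns exactly \$1 billion:

```python
from statistics import mean, median

salaries = [50_000] * 10                 # ten people earning $50k
print(mean(salaries), median(salaries))  # 50000 50000

salaries.append(1_000_000_000)           # a billionaire walks in
print(f"{mean(salaries):,.0f}")          # roughly 90,954,545 - the mean balloons
print(median(salaries))                  # 50000 - the median stands firm
```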

Spread vs Center

Two pizza delivery services both average 30-minute delivery (same center). But Service A ranges 28-32 minutes, while Service B ranges 10-50 minutes. Same center, wildly different spread. You'd trust A for consistent timing.

Correlation

When one thing goes up and another tends to go up with it (like study time and test scores), that's positive correlation. When one goes up and the other goes down (like TV time and exercise), that's negative correlation. They 'move together' in some pattern.

Correlation vs Causation

Ice cream sales and drowning deaths both increase in summer. Are ice creams deadly? No! A third factor (hot weather) causes both. This is why 'correlation $\neq$ causation' - just because things happen together doesn't mean one causes the other.

Sampling Bias

Asking only your friends about favorite music doesn't tell you what the whole school thinks - your friends probably have similar tastes! That's bias. A good sample is like a well-shuffled deck: everyone has an equal chance of being picked.

Basic Probability

Probability is a way of putting a number on chance. Flipping heads? That's $0.5$ (half the time). Rolling a 6 on a die? That's $\frac{1}{6}$ (one out of six possible outcomes). It's like asking 'if we did this many times, what fraction of the time would this outcome happen?'

Histogram

Unlike bar graphs for categories (red, blue, green), histograms are for numbers grouped into ranges. Test scores 60-70, 70-80, 80-90... The bars touch because the data is continuous - there's no gap between 69.9 and 70.0.

Box Plot

A box plot is like an X-ray of your data's skeleton. The box shows where the middle 50% of data lives. The line inside is the median. The whiskers stretch to the extremes. You instantly see the center, spread, and any unusual values.

Standard Deviation

If the mean is 'home base,' standard deviation tells you how far data points typically wander from home. Small SD = data clusters close to the mean (like a tight group of friends). Large SD = data is scattered (friends spread all over town).
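Two small, made-up data sets with the same mean but very different wander can make this concrete (a sketch using the stdlib's population standard deviation):

```python
from statistics import mean, pstdev

tight = [48, 50, 52, 50, 50]       # clustered near the mean
spread = [10, 30, 50, 70, 90]      # scattered widely

# Same home base...
print(mean(tight), mean(spread))   # 50 50

# ...very different typical distance from it
print(round(pstdev(tight), 2))     # 1.26  - small SD, tight cluster
print(round(pstdev(spread), 2))    # 28.28 - large SD, scattered
```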

Misleading Graphs

A bar that looks $3\times$ taller might only represent 10% more data if the axis doesn't start at zero. It's like taking a photo from a weird angle to make someone look taller. The data is true, but the picture lies.

Experimental Design

Want to know if a new fertilizer helps plants grow? You can't just use it on some plants and see if they grow - maybe they would've grown anyway! You need identical plants, give fertilizer to some (treatment) but not others (control), and keep everything else the same.

Two-Way Tables

A two-way table is like a spreadsheet that shows how two questions relate. 'Do you like pizza?' and 'Are you a kid or adult?' becomes a $2 \times 2$ grid showing how many kid pizza-lovers, adult pizza-lovers, etc.

Random Sampling

Drawing names from a hat where all names are equally likely to be picked. No favoritism, no convenience, just pure chance. This is how we ensure the sample represents the whole population, not just the easy-to-reach people.

Dot Plot

Like a line plot, but dots instead of X's. Each dot is one data point stacked above its value. The height of the stack shows frequency. Great for seeing clusters and gaps.

Quartiles

If you line up 100 people by height and divide into 4 equal groups, quartiles mark the dividing points. $Q_1$ is where the shortest 25% ends, $Q_2$ is the middle, $Q_3$ is where the tallest 25% begins.

Interquartile Range (IQR)

IQR focuses on where most of the data lives, ignoring the extremes. If the regular range measures how far the extremes stretch, the IQR measures how wide the main crowd is. It's more resistant to outliers than the range.
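A sketch with hypothetical data showing how an outlier stretches the range but not the IQR. Note that Python's `statistics.quantiles` uses the 'exclusive' interpolation method by default, so its cut points can differ slightly from hand-computed quartiles:

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]    # one extreme outlier

q1, q2, q3 = quantiles(data, n=4)          # the three quartile cut points
iqr = q3 - q1
full_range = max(data) - min(data)

print(q1, q2, q3)    # 2.75 5.5 8.25
print(iqr)           # 5.5 - the width of the main crowd
print(full_range)    # 99  - stretched by the single outlier
```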

Mean Absolute Deviation (MAD)

Find how far each number is from the mean (ignoring +/-), then average those distances. It tells you: on average, how far is a typical value from the center?
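The recipe above translates almost line for line into Python (reusing the candy counts from the mean example):

```python
from statistics import mean

data = [2, 4, 9]
m = mean(data)                              # center: 5

# distance of each point from the mean, sign ignored
distances = [abs(x - m) for x in data]      # [3, 1, 4]
mad = mean(distances)

print(round(mad, 2))   # 2.67 - a typical value sits about 2.67 from the center
```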

Scatter Plot

Each dot is a person (or item) plotted by TWO measurements - like height on one axis and weight on the other. Patterns in the dots reveal relationships: do taller people weigh more? The scatter tells the story.

Distribution Shape

If you make a histogram, what shape emerges? A bell curve? A slope leaning one way? Two peaks? The shape tells you about what's typical and what's unusual in your data.

Population vs Sample

You want to know the average height of ALL teenagers in your country (population), but you can't measure everyone. So you measure 1000 teenagers (sample) and use that to estimate the whole.

Theoretical Probability

For a fair coin, you KNOW heads is $\frac{1}{2}$ without flipping. You calculate based on logic: 1 favorable outcome (heads) out of 2 possible outcomes. That's theoretical - it's what SHOULD happen.

Experimental Probability

You flip a coin 100 times and get 53 heads. Your experimental probability is $\frac{53}{100} = 0.53$. It's based on what DID happen, not what should happen theoretically.
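You can run this coin experiment in code rather than by hand. A minimal sketch with a fixed seed so the run is repeatable:

```python
import random

random.seed(0)   # fixed seed so the experiment is repeatable

flips = 10_000
heads = sum(random.random() < 0.5 for _ in range(flips))

experimental = heads / flips
print(experimental)   # close to the theoretical 0.5, but rarely exactly
```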

Sample Space

Before calculating probability, list every possible outcome. For a die: $\{1, 2, 3, 4, 5, 6\}$. For two coins: $\{HH, HT, TH, TT\}$. That's your sample space - the complete menu of what could happen.

Compound Events

Simple event: rolling a 6. Compound event: rolling a 6 AND then flipping heads. For 'and' with independent events, multiply probabilities. For 'or,' add them (but subtract any overlap).
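The die-and-coin example worked through with exact fractions (the two events are independent, so the multiplication rule applies):

```python
from fractions import Fraction

p_six   = Fraction(1, 6)   # rolling a 6
p_heads = Fraction(1, 2)   # flipping heads

# 'AND' for independent events: multiply
p_both = p_six * p_heads
print(p_both)              # 1/12

# 'OR': add, then subtract the overlap
p_either = p_six + p_heads - p_both
print(p_either)            # 7/12
```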

Relative Frequency

Instead of saying '15 students picked pizza,' you say '15 out of 50' or '30%.' Relative frequency compares to the whole, making different-sized groups comparable.

Normal Distribution

Heights, test scores, measurement errors - many real phenomena cluster around an average with decreasing frequency toward extremes. The bell curve captures this pattern: most values are 'average,' few are extreme.

Z-Score (Standard Score)

Z-scores put everything on the same scale. A z-score of +2 means 'two standard deviations above average' - unusually high. A z-score of -1 means 'one SD below average' - somewhat low but normal.
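A sketch with hypothetical test scores, chosen so the mean and SD come out round:

```python
from statistics import mean, pstdev

scores = [60, 70, 75, 80, 90, 85, 70, 90]
mu, sigma = mean(scores), pstdev(scores)    # 77.5 and 10.0

def z_score(x):
    """How many standard deviations x sits from the mean."""
    return (x - mu) / sigma

print(z_score(90))   # 1.25  - above average, positive z
print(z_score(60))   # -1.75 - below average, negative z
```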

Percentiles

Being in the 90th percentile means you scored better than 90% of people. It's not about your raw score - it's about your position relative to everyone else.

Sampling Distribution

If you took 1000 different random samples and calculated the mean of each, those 1000 means would form a distribution. That's the sampling distribution - it shows how sample statistics vary.

Central Limit Theorem

This is statistics' magic trick: no matter how weird your population looks, if you take big enough samples and average them, those averages will form a bell curve. This is why normal distribution methods work so often.

Standard Error

Standard error tells you how much your sample estimate might be 'off' from the true value. Larger samples have smaller SE because they're more precise - like asking 1000 people vs 10.

Confidence Interval

Instead of saying 'the average is 50,' you say 'I'm 95% confident the average is between 47 and 53.' The interval acknowledges uncertainty from sampling.
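A rough sketch of how such an interval is computed, using made-up sample data and the large-sample normal critical value 1.96 (for a sample this small, a t critical value would be more appropriate):

```python
from statistics import mean, stdev
from math import sqrt

sample = [47, 52, 49, 53, 50, 48, 51, 50, 52, 48]   # hypothetical sample
n = len(sample)

m = mean(sample)
se = stdev(sample) / sqrt(n)     # standard error of the mean

# Approximate 95% interval: estimate +/- 1.96 standard errors
low, high = m - 1.96 * se, m + 1.96 * se
print(f"95% CI: ({low:.1f}, {high:.1f})")
```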

Margin of Error

When a poll says '52% $\pm$ 3%,' that 3% is the margin of error. It means the true value is probably within 3 percentage points of 52%, so between 49% and 55%.

Linear Regression

Given scattered points, draw the 'best' line through them. 'Best' means the line that's closest to all points on average. This line lets you predict Y from X.
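The least-squares line can be computed by hand from the formulas slope $= S_{xy} / S_{xx}$ and intercept $= \bar{y} - \text{slope} \cdot \bar{x}$. A sketch with hypothetical study-time data:

```python
from statistics import mean

hours  = [1, 2, 3, 4, 5]            # hypothetical study hours
scores = [52, 55, 61, 64, 68]       # hypothetical test scores

mx, my = mean(hours), mean(scores)

# least-squares slope: Sxy / Sxx
sxy = sum((x - mx) * (y - my) for x, y in zip(hours, scores))
sxx = sum((x - mx) ** 2 for x in hours)
slope = sxy / sxx                   # ~4.1 points per extra hour
intercept = my - slope * mx         # ~47.7

# Predict Y from X: score expected after 6 hours of study
predicted = slope * 6 + intercept   # ~72.3
print(round(predicted, 1))
```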

Line of Best Fit

If you stretched a rubber band through a scatter plot to be as close to all points as possible, that's the line of best fit. It captures the overall trend.

Residuals

If your model predicts 80 but the actual value is 85, the residual is +5. Residuals are 'leftovers' - what the model couldn't explain. Patterns in residuals reveal model problems.

R-Squared (Coefficient of Determination)

$R^2 = 0.80$ means the model explains 80% of why $Y$ values differ. The other 20% is unexplained variation. Higher $R^2$ = better predictions.

Outlier Detection

Outliers are data points that don't fit the pattern. A 7-foot student in a class of average heights, or a \$10 million house in a neighborhood of \$300k homes. They may be errors or genuinely unusual.

Observational vs Experimental Studies

Observational: Compare smokers to non-smokers (you didn't assign smoking). Experimental: Randomly assign people to take a drug or placebo (you controlled the treatment). Only experiments can establish causation.

Confounding Variables

Ice cream sales and drowning deaths correlate. Confounding variable: hot weather. It causes both! Without recognizing confounders, you'd wrongly blame ice cream for drowning.

Statistical Simulation

Can't calculate the probability mathematically? Simulate it! Run the scenario thousands of times with random numbers and see what fraction of outcomes match your event. It's like conducting experiments without real resources.

Law of Large Numbers

Flip a coin 10 times: maybe 7 heads (70%). Flip 100 times: closer to 50%. Flip 10,000 times: very close to 50%. More trials = more reliable averages. Short-run luck evens out.
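You can watch the long-run proportion settle down in a simulation (seeded so the run is repeatable; the exact fractions will vary with the seed):

```python
import random

random.seed(1)

for flips in (10, 100, 10_000):
    heads = sum(random.random() < 0.5 for _ in range(flips))
    # small runs wobble; the big run hugs 0.5
    print(f"{flips:>6} flips: {heads / flips:.3f} heads")
```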

Expected Value

If you played a game forever, expected value is your average result per play. Positive EV = profitable long-term. Negative EV = you'll lose over time. It's the mathematical way to evaluate risky decisions.
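A worked example with a hypothetical game: pay \$1 to roll a die, and rolling a 6 pays out \$5. Exact fractions keep the arithmetic clean:

```python
from fractions import Fraction

# Hypothetical game: $1 to play; a 6 pays $5, anything else pays nothing.
outcomes = {
    Fraction(1, 6): 5 - 1,    # win: net +$4
    Fraction(5, 6): 0 - 1,    # lose: net -$1
}

ev = sum(p * payoff for p, payoff in outcomes.items())
print(ev)   # -1/6 -> negative EV: you lose about 17 cents per play long-term
```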

Hypothesis Testing

Hypothesis testing is like a courtroom trial for data. You start by assuming innocence (null hypothesis: nothing special is happening). Then you look at the evidence (data). If the evidence is strong enough to be very unlikely under the assumption of innocence, you reject it and conclude something real is happening.

P-Value

P-value answers: 'If nothing special is really happening, how surprising is my data?' A tiny p-value (like 0.01) means your results would be very rare if the null were true - so maybe the null is wrong. A large p-value means your results aren't surprising under the null.

Statistical Significance

Statistical significance is a decision rule: before looking at data, you set a threshold (usually 5%). If your p-value is below this threshold, you declare the result 'significant' - meaning unlikely to be just random noise. It's not about importance; it's about confidence that something real is happening.

Correlation Coefficient

$r = 1$ means a perfect positive line, $r = -1$ means a perfect negative line, and $r = 0$ means no linear pattern.
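Pearson's $r$ can be computed directly from its definition. A sketch with a hypothetical data set that is perfectly linear, so $r$ comes out exactly 1:

```python
from statistics import mean
from math import sqrt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]     # exactly 2*x: a perfect positive line

mx, my = mean(x), mean(y)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
r = sxy / sqrt(sum((a - mx) ** 2 for a in x) *
               sum((b - my) ** 2 for b in y))
print(r)   # 1.0
```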

Weighted Average

Your final grade: exams count 60%, homework 40% - not every assignment counts equally.
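A one-line calculation, assuming hypothetical averages of 85 on exams and 95 on homework:

```python
exam_avg, hw_avg = 85, 95                      # hypothetical scores
weights = {"exams": 0.60, "homework": 0.40}    # weights must sum to 1

final = weights["exams"] * exam_avg + weights["homework"] * hw_avg
print(final)   # 89.0 - pulled toward the exam score because exams weigh more
```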

Empirical Rule

Most data clusters near the center of a bell curve; the further from the mean, the rarer the value.

Skewness

A right-skewed distribution has a long tail to the right (a few very large values); left-skewed has a long tail to the left.

65 concepts available