
Center, Spread, and Distributions

19 concepts in Statistics

Center, spread, and distributions help students move from “What are the data values?” to “What is this dataset like overall?” This topic covers measures of center such as mean, median, and mode; measures of spread such as range, mean absolute deviation, interquartile range, and standard deviation; and the language used to describe distributions, including shape, outliers, percentiles, z-scores, skewness, and the normal distribution. Students learn that no single summary is enough by itself: a mean without spread can hide instability, and a graph without context can hide what is typical. These ideas are the backbone of descriptive statistics and the bridge into probability models and inference.

Suggested learning path: Begin with mean, median, mode, and range, then study variability and quartiles before moving into standard deviation, distribution shape, percentiles, z-scores, and normal-distribution ideas.

Mean as Fair Share

The mean (average) represents what each person would get if the total were divided equally among everyone. It is calculated by adding all values and dividing by the count, giving a single number that summarizes the center of the data.
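The fair-share idea can be sketched in a few lines of Python. The function name `fair_share_mean` and the cookie counts are made up for illustration.

```python
def fair_share_mean(values):
    """Return the mean: the total divided equally among all entries."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)

# Hypothetical example: cookies held by four friends.
cookies = [2, 5, 6, 7]
share = fair_share_mean(cookies)  # total of 20 shared among 4 people
print(share)  # 5.0
```

If the friends pooled their cookies and split them evenly, each would hold 5, which is exactly the mean.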

Median

The median is the middle value when all data points are arranged in order from smallest to largest, so at least half the values are at or below it and at least half are at or above it. For an odd number of values, the median is the single middle value; for an even number of values, it is the average of the two middle values.

mean fair share
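A minimal sketch of the median rule, covering both the odd and even cases. The helper name `middle_value` and the sample data are illustrative assumptions.

```python
def middle_value(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)          # arrange from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the single middle value
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = middle_value([7, 1, 3])       # ordered: 1, 3, 7
even_case = middle_value([7, 1, 3, 9])   # ordered: 1, 3, 7, 9
print(odd_case, even_case)  # 3 5.0
```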

Mode

The mode is the value that appears most often in a data set. A set can have no mode (all values appear equally often), one mode (unimodal), or multiple modes (bimodal or multimodal). Of mean, median, and mode, it is the only measure of center that works for unordered categorical data, such as favorite colors.

spread vs center
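The three cases (no mode, one mode, several modes) can be sketched as follows. The function `modes` is a hypothetical helper, and returning an empty list when every value ties is one convention among several; Python's built-in `statistics.multimode` would instead return all tied values.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or [] if every value ties."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    winners = [v for v, c in counts.items() if c == top]
    # Assumed convention: if all distinct values tie, report "no mode".
    if len(winners) == len(counts) and len(counts) > 1:
        return []
    return winners

print(modes([1, 2, 2, 3]))           # [2]     one mode
print(modes([1, 1, 2, 2, 3]))        # [1, 2]  two modes (bimodal)
print(modes([1, 2, 3]))              # []      no mode
print(modes(["cat", "dog", "cat"]))  # ['cat'] works for categories
```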

Range

The range is the difference between the maximum and minimum values in a data set, giving the simplest measure of overall spread. It tells you the total span of the data from lowest to highest in a single number.

spread vs center
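As a small illustration of the range, here is a sketch with made-up temperature readings. The name `data_range` is chosen to avoid clashing with Python's built-in `range`.

```python
def data_range(values):
    """Return maximum minus minimum for a non-empty data set."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)

# Hypothetical daily temperatures.
temps = [12, 7, 3, 18, 9]
span = data_range(temps)  # 18 - 3
print(span)  # 15
```

Because the range uses only the two most extreme values, a single unusual reading can change it a great deal.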

Mean vs Median

Mean and median are both measures of center but respond differently to extreme values (outliers). The mean is pulled toward outliers because it uses every value in its calculation, while the median is resistant to outliers because it depends only on the middle position.

mean fair share

outlier detection
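The different responses to an outlier can be seen with a short experiment. The salary figures below are invented for illustration.

```python
from statistics import mean, median

# Hypothetical salaries, in thousands of dollars.
salaries = [30, 32, 35, 38, 40]
with_outlier = salaries + [400]   # add one extreme value

print(mean(salaries), median(salaries))          # both are 35
print(mean(with_outlier), median(with_outlier))  # mean is about 95.8, median is 36.5
```

One extreme salary nearly triples the mean, while the median moves only from 35 to 36.5.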

Spread vs Center

Center describes where the "middle" of the data lies; spread describes how far the values extend from that center. Two data sets can share the same center yet differ greatly in spread.

mean fair share

variability intro
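To see center and spread as separate ideas, compare two invented data sets that have the same mean but very different ranges. The names `steady` and `scattered` are illustrative only.

```python
from statistics import mean

steady = [48, 49, 50, 51, 52]      # values close to the center
scattered = [10, 30, 50, 70, 90]   # values far from the center

for name, data in (("steady", steady), ("scattered", scattered)):
    center = mean(data)
    spread = max(data) - min(data)  # range used as a simple spread measure
    print(name, "mean:", center, "range:", spread)
```

Both sets have a mean of 50, but their ranges are 4 and 80, so the mean alone would hide how different they are.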

Data Variability

Data variability describes how much the values in a data set are spread out or clustered together around the center. High variability means values are widely scattered; low variability means they are tightly grouped near the average.

mean fair share

Quartiles

Quartiles are values that divide ordered data into four equal parts: $Q_1$ (25th percentile) marks the boundary below which 25% of data falls, $Q_2$ (the median, 50th percentile) splits the data in half, and $Q_3$ (75th percentile) marks the boundary below which 75% falls.
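A quick sketch of computing quartiles with Python's `statistics.quantiles`. Textbooks and software use several slightly different quartile conventions; the `"inclusive"` method here is an assumption, and other methods can give different answers on the same data.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 asks for the three cut points that split the data into four parts.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
print(q1, q2, q3)  # 3.0 5.0 7.0
```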

Interquartile Range (IQR)

The interquartile range (IQR) is the range of the middle 50% of data, calculated as $Q_3 - Q_1$. It measures spread while ignoring the top and bottom 25% of values, making it resistant to outliers.
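The resistance of the IQR to extreme values can be checked with a small sketch. The helper `iqr` is hypothetical, and it assumes the `"inclusive"` quartile convention of `statistics.quantiles`.

```python
from statistics import quantiles

def iqr(values):
    """Return Q3 - Q1 using the 'inclusive' quartile method (an assumption)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_extreme = [1, 2, 3, 4, 5, 6, 7, 8, 90]  # largest value replaced by 90

print(iqr(clean), max(clean) - min(clean))                  # 4.0 8
print(iqr(with_extreme), max(with_extreme) - min(with_extreme))  # 4.0 89
```

The range jumps from 8 to 89, but the IQR stays at 4 because the middle half of the data did not change.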

Mean Absolute Deviation (MAD)

The Mean Absolute Deviation (MAD) is the average of the absolute distances between each data point and the mean of the dataset. It measures how spread out data values are from the center, with larger MAD values indicating more variability.

mean fair share
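The MAD calculation follows directly from its definition. The function name and sample data below are illustrative.

```python
def mean_absolute_deviation(values):
    """Average absolute distance of each value from the mean."""
    if not values:
        raise ValueError("MAD of an empty data set is undefined")
    center = sum(values) / len(values)
    return sum(abs(v - center) for v in values) / len(values)

wide = [2, 4, 6, 8]    # mean 5, distances 3, 1, 1, 3
tight = [4, 5, 5, 6]   # mean 5, distances 1, 0, 0, 1

print(mean_absolute_deviation(wide))   # 2.0
print(mean_absolute_deviation(tight))  # 0.5
```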

Standard Deviation

Standard deviation measures how spread out data values are around the mean. It is the square root of the average squared deviation from the mean, which makes it roughly the typical distance of a data point from the average. A small standard deviation means the data cluster tightly around the mean; a large one means the data are widely spread.

mean fair share

variability intro
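A sketch of the standard deviation computed step by step, checked against Python's `statistics` module. It assumes the population form (dividing by the number of values); the sample form divides by one less than that and gives a slightly larger result.

```python
import math
from statistics import pstdev, stdev

def population_sd(values):
    """Square root of the average squared deviation from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("standard deviation of an empty data set is undefined")
    mu = sum(values) / n
    return math.sqrt(sum((v - mu) ** 2 for v in values) / n)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5, squared deviations sum to 32

print(population_sd(data))  # 2.0
print(pstdev(data))         # library population version, same result
print(stdev(data))          # sample version, about 2.14
```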

Distribution Shape

Distribution shape describes the overall pattern of how data values are spread when displayed in a histogram or dot plot. Common shapes include symmetric (such as a bell curve), skewed left, skewed right, uniform (all values equally common), and bimodal (two peaks).

Outlier Detection

Outlier detection is the process of identifying data points that are unusually far from the rest of the dataset, using techniques like the IQR rule, z-scores, or visual inspection of box plots and scatter plots. These anomalous values may indicate measurement errors, data entry mistakes, or genuinely extreme observations.

stat interquartile range
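One common form of the IQR rule flags any value more than 1.5 IQRs below $Q_1$ or above $Q_3$. The sketch below assumes that 1.5 multiplier and the `"inclusive"` quartile method; the function name and readings are hypothetical.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low_fence = q1 - k * spread
    high_fence = q3 + k * spread
    return [v for v in values if v < low_fence or v > high_fence]

readings = [12, 14, 15, 15, 16, 17, 18, 19, 45]
# Q1 = 15, Q3 = 18, IQR = 3, so the fences are 10.5 and 22.5.
print(iqr_outliers(readings))  # [45]
```

A flagged value is only a candidate: whether 45 is an error or a genuine extreme still has to be judged from context.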

Percentiles

Percentiles are values that divide a ranked distribution into 100 equal parts. The $n$th percentile is the value below which $n\%$ of the data falls, telling you where a specific observation stands relative to the entire dataset.
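To find where one observation stands, a percentile rank can be computed by counting the values below it. This sketch uses the "strictly below" convention from the definition above; other conventions count ties differently. The scores are invented.

```python
def percentile_rank(values, x):
    """Percent of values in the data set that fall strictly below x."""
    if not values:
        raise ValueError("percentile rank needs a non-empty data set")
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
rank = percentile_rank(scores, 85)  # 6 of the 10 scores are below 85
print(rank)  # 60.0
```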

Normal Distribution

The normal distribution (bell curve) is a symmetric, bell-shaped probability distribution where most data clusters around the mean, with probabilities decreasing symmetrically toward the tails. It is defined by two parameters: the mean and the standard deviation.

distribution shape

standard deviation intro
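Python's `statistics.NormalDist` can illustrate how the two parameters define the curve. The mean of 100 and standard deviation of 15 are example values only.

```python
from statistics import NormalDist

curve = NormalDist(mu=100, sigma=15)  # example parameters

peak = curve.pdf(100)   # density at the mean (the highest point)
left = curve.pdf(85)    # one standard deviation below the mean
right = curve.pdf(115)  # one standard deviation above the mean
half = curve.cdf(100)   # proportion of the distribution below the mean

print(left == right or abs(left - right) < 1e-12)  # symmetric about the mean
print(peak > left)                                  # density falls toward the tails
print(round(half, 3))                               # 0.5
```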

Z-Score (Standard Score)

A z-score tells you how many standard deviations a value is from the mean, calculated as $z = \frac{x - \mu}{\sigma}$. Positive z-scores are above the mean; negative z-scores are below. Z-scores allow comparison of values from different distributions.

standard deviation intro

mean fair share
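The formula can be used to compare scores from two different tests. The test means, standard deviations, and scores below are hypothetical.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical results: 85 on a math test (mean 75, SD 5)
# and 90 on a reading test (mean 80, SD 10).
math_z = z_score(85, mu=75, sigma=5)
reading_z = z_score(90, mu=80, sigma=10)

print(math_z, reading_z)  # 2.0 1.0
```

Although 90 is the higher raw score, the math score is more unusual for its test: it sits two standard deviations above the mean rather than one.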

Weighted Average

A weighted average is an average in which different values contribute unequally based on their assigned weights, reflecting the relative importance or frequency of each value. Unlike a simple average where all values count equally, a weighted average gives more influence to values with larger weights.

mean fair share

stat expected value
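A minimal sketch of a weighted average, using made-up course scores where an exam counts more than homework. The function name and the weights are assumptions for illustration.

```python
def weighted_average(values, weights):
    """Sum of value*weight divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

scores = [90, 80, 70]   # exam, project, homework
weights = [5, 3, 2]     # the exam counts most

weighted = weighted_average(scores, weights)  # (450 + 240 + 140) / 10
simple = sum(scores) / len(scores)
print(weighted, simple)  # 83.0 80.0
```

The weighted result is higher than the simple average because the largest weight sits on the highest score.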

Empirical Rule

The empirical rule (also called the 68-95-99.7 rule) states that for a normal distribution, approximately 68% of data falls within one standard deviation of the mean, about 95% falls within two standard deviations, and roughly 99.7% falls within three standard deviations.

stat normal distribution
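The three percentages can be checked numerically with the standard normal distribution in Python's `statistics` module. This is a verification sketch, not a derivation.

```python
from statistics import NormalDist

standard = NormalDist()  # mean 0, standard deviation 1

# Proportion of the distribution within k standard deviations of the mean.
within = {k: standard.cdf(k) - standard.cdf(-k) for k in (1, 2, 3)}

for k, proportion in within.items():
    print(f"within {k} SD: {proportion:.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```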

Skewness

Skewness is a measure of how asymmetric a distribution is around its mean. With positive skew, the longer tail extends to the right; with negative skew, the longer tail extends to the left.

distribution shape
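One way to put a number on skewness is the moment coefficient, the average cubed deviation divided by the cube of the population standard deviation. That choice of formula is an assumption here, since several skewness measures exist; the data sets are invented.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    if n == 0:
        raise ValueError("skewness of an empty data set is undefined")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # average squared deviation
    m3 = sum((v - mean) ** 3 for v in values) / n  # average cubed deviation
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5

right_tailed = [1, 2, 2, 3, 10]   # long tail to the right
left_tailed = [1, 8, 9, 9, 10]    # long tail to the left
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed) > 0)  # True
print(skewness(left_tailed) < 0)   # True
print(skewness(symmetric))         # 0.0
```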
