Central Limit Theorem

Statistics
principle

Also known as: CLT

Grade 9-12

Definition

For a sufficiently large sample size (n \geq 30 is a common rule of thumb), the sampling distribution of the sample mean is approximately normal with mean \mu and standard deviation \frac{\sigma}{\sqrt{n}}, where \mu and \sigma are the population mean and standard deviation. This holds regardless of the shape of the population distribution, provided \sigma is finite.

💡 Intuition

Roll a single die and the outcomes are flat (uniform). But average the rolls of 30 dice, repeat that many times, and the histogram of the averages looks like a bell curve. However odd the original data looks (skewed, bimodal, flat), the averages of large enough samples settle into an approximately normal shape, as long as the population variance is finite. It's one of the most surprising facts in all of mathematics.
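The dice experiment can be tried directly. The sketch below is one possible simulation, not a prescribed method: the function name dice_means, the seed, and the choice of 10,000 repeats are illustrative assumptions. It averages 30 fair-die rolls many times, then prints the center and spread of those averages along with a rough text histogram.

```python
import random
import statistics
from collections import Counter


def dice_means(n_dice=30, n_repeats=10_000, seed=0):
    """Return the average of n_dice fair-die rolls, repeated n_repeats times."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(n_dice))
        for _ in range(n_repeats)
    ]


means = dice_means()
center = statistics.fmean(means)
spread = statistics.stdev(means)

# For a normal curve, roughly 68% of values lie within one SD of the center.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"center of the averages: {center:.3f}")  # theory: 3.5
print(f"spread of the averages: {spread:.3f}")  # theory: 1.708 / sqrt(30), about 0.312
print(f"share within one SD:    {within_one_sd:.3f}")

# Rough text histogram: averages rounded to one decimal place.
counts = Counter(round(m, 1) for m in means)
for value in sorted(counts):
    print(f"{value:4.1f} {'#' * (counts[value] // 25)}")
```

The histogram should rise to a peak near 3.5 and fall away roughly symmetrically on both sides, even though a single roll is flat.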

🎯 Core Idea

The CLT is why the normal distribution dominates statistics: it guarantees that sample means are approximately normal for large n, giving us a universal framework for inference.

Example

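Population of die rolls: uniform on \{1, 2, 3, 4, 5, 6\}, with \mu = 3.5 and \sigma = \sqrt{35/12} \approx 1.71. Take samples of n = 35 rolls and compute \bar{x} each time. Then, approximately, \bar{X} \sim N\left(3.5,\; \frac{1.71}{\sqrt{35}}\right) \approx N(3.5,\; 0.289), where the second parameter is the standard deviation of \bar{X}, not its variance.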
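The numbers in this example can be checked by computing the population mean and standard deviation of a fair die directly. This is a minimal sketch; the variable names are illustrative.

```python
import math

faces = [1, 2, 3, 4, 5, 6]

# Population mean and variance of one fair-die roll (each face has probability 1/6).
mu = sum(faces) / len(faces)
variance = sum((x - mu) ** 2 for x in faces) / len(faces)  # equals 35/12
sigma = math.sqrt(variance)

n = 35
standard_error = sigma / math.sqrt(n)

print(round(mu, 3), round(sigma, 3), round(standard_error, 3))
# prints: 3.5 1.708 0.289
```

With n = 35 the standard error simplifies to \sqrt{35/12} / \sqrt{35} = 1/\sqrt{12}, which is where 0.289 comes from.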

Formula

\bar{X} \sim N\left(\mu,\; \frac{\sigma}{\sqrt{n}}\right) approximately, for large n (the second parameter is the standard deviation)

Notation

\frac{\sigma}{\sqrt{n}} is called the standard error of the mean.
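To see how the standard error responds to sample size, here is a small sketch. The population SD of 10 is a made-up value for illustration, and standard_error is a hypothetical helper name.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean for population SD sigma and sample size n."""
    return sigma / math.sqrt(n)


sigma = 10.0  # hypothetical population SD, chosen only for illustration
errors = {n: standard_error(sigma, n) for n in (25, 100, 400)}

for n, se in errors.items():
    print(n, se)
# prints:
# 25 2.0
# 100 1.0
# 400 0.5
```

Each fourfold increase in n halves the standard error, because the sample size enters through its square root.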

🌟 Why It Matters

Without the CLT, we'd need to know the exact population distribution before doing any inference. The CLT lets us use normal-based methods (z-tests, confidence intervals) even when the population is non-normal.
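As an illustration of a normal-based method applied to non-normal data, the sketch below builds an approximate 95% confidence interval for a mean. Several things here are assumptions rather than anything stated above: the data are simulated exponential values with true mean 2.0, the sample SD stands in for the unknown \sigma, and mean_confidence_interval is a hypothetical helper name.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, level=0.95):
    """Approximate z-interval for the population mean, using the sample SD."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # assumption: s is used in place of sigma
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


rng = random.Random(1)
# Hypothetical right-skewed data: exponential values with true mean 2.0.
sample = [rng.expovariate(1 / 2.0) for _ in range(200)]

low, high = mean_confidence_interval(sample)
print(f"sample mean: {statistics.fmean(sample):.3f}")
print(f"approximate 95% interval: ({low:.3f}, {high:.3f})")
```

The interval formula never uses the fact that the data are exponential. It relies only on the sample mean being approximately normal at this sample size.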

Formal View

If X_1, \ldots, X_n are i.i.d. with mean \mu and finite variance \sigma^2 > 0, and \bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i, then \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1) as n \to \infty.
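Convergence in distribution can be observed numerically. The sketch below assumes an Exponential(1) population (so \mu = 1 and \sigma = 1), standardizes simulated sample means, and measures the largest gap between their empirical CDF and the N(0, 1) CDF. The population, seed, and repeat counts are illustrative choices.

```python
import math
import random
import statistics

STANDARD_NORMAL = statistics.NormalDist()


def standardized_means(n, n_repeats, rng):
    """Simulate (X_bar - mu) / (sigma / sqrt(n)) for an Exponential(1) population."""
    mu, sigma = 1.0, 1.0  # mean and SD of Exponential(1)
    return [
        (statistics.fmean(rng.expovariate(1.0) for _ in range(n)) - mu)
        / (sigma / math.sqrt(n))
        for _ in range(n_repeats)
    ]


def distance_from_normal(z_values):
    """Largest gap between the empirical CDF of z_values and the N(0, 1) CDF."""
    ordered = sorted(z_values)
    m = len(ordered)
    gap = 0.0
    for i, z in enumerate(ordered):
        phi = STANDARD_NORMAL.cdf(z)
        gap = max(gap, abs((i + 1) / m - phi), abs(i / m - phi))
    return gap


rng = random.Random(2)
distances = {
    n: distance_from_normal(standardized_means(n, 5_000, rng))
    for n in (2, 10, 100)
}
for n, d in distances.items():
    print(f"n = {n:3d}: largest CDF gap = {d:.3f}")
```

The gap should shrink as n grows, which is what the arrow \xrightarrow{d} expresses. With a finite number of simulated means it will not reach exactly zero.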

🚧 Common Stuck Point

The CLT applies to sample means (and sums), not to individual observations. A single data point from a skewed population is still skewed.
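The difference between individual observations and sample means shows up clearly in their skewness. This sketch assumes an Exponential(1) population and samples of size 50. The skewness helper is a simple moment-based estimate written for this illustration.

```python
import random
import statistics


def skewness(values):
    """Average cubed z-score: near 0 for a symmetric shape, positive for a right tail."""
    center = statistics.fmean(values)
    spread = statistics.pstdev(values)
    return statistics.fmean(((v - center) / spread) ** 3 for v in values)


rng = random.Random(3)

# 10,000 individual observations from a right-skewed population.
individuals = [rng.expovariate(1.0) for _ in range(10_000)]

# 10,000 sample means, each from a sample of 50 observations.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(50))
    for _ in range(10_000)
]

skew_individuals = skewness(individuals)
skew_means = skewness(sample_means)
print(f"skewness of individual observations: {skew_individuals:.2f}")  # theory: 2
print(f"skewness of sample means (n = 50):   {skew_means:.2f}")        # theory: about 0.28
```

The individual values stay strongly skewed no matter how many are collected. Only the distribution of the averages moves toward symmetry.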

⚠️ Common Mistakes

  • Thinking n \geq 30 is a hard rule: highly skewed populations may need larger samples for the CLT to kick in.
  • Confusing \sigma (population SD) with \frac{\sigma}{\sqrt{n}} (standard error): the spread of sample means is smaller than \sigma by a factor of \sqrt{n}.
  • Applying the CLT to individual observations rather than to the sample mean or sample sum.
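The first mistake in the list can be checked by simulation. The sketch below estimates how often a nominal 95% z-interval, built from the sample SD, actually contains the true mean at n = 30. It compares a symmetric uniform population with a heavily skewed lognormal one. The lognormal shape value of 1.5, the seed, and the repeat count are assumptions chosen to make the effect visible.

```python
import math
import random
import statistics

Z_95 = statistics.NormalDist().inv_cdf(0.975)


def coverage(draw, true_mean, n, n_repeats, rng):
    """Share of nominal 95% z-intervals (using the sample SD) that contain true_mean."""
    hits = 0
    for _ in range(n_repeats):
        sample = [draw(rng) for _ in range(n)]
        margin = Z_95 * statistics.stdev(sample) / math.sqrt(n)
        hits += abs(statistics.fmean(sample) - true_mean) <= margin
    return hits / n_repeats


rng = random.Random(4)
SHAPE = 1.5  # hypothetical lognormal shape parameter; larger means more skew
lognormal_mean = math.exp(SHAPE**2 / 2)  # mean of lognormal(0, SHAPE)

coverage_uniform = coverage(lambda r: r.random(), 0.5, 30, 2_000, rng)
coverage_skewed = coverage(
    lambda r: r.lognormvariate(0.0, SHAPE), lognormal_mean, 30, 2_000, rng
)

print(f"uniform population, n = 30:   {coverage_uniform:.3f}")
print(f"lognormal population, n = 30: {coverage_skewed:.3f}")
```

For the uniform population the coverage should land close to the nominal 95%. For the skewed population it typically falls well short, which suggests that n = 30 is not yet large enough for the normal approximation there.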

Frequently Asked Questions

What should I learn before the Central Limit Theorem?

Before studying the Central Limit Theorem, you should understand sampling distributions and the normal distribution.

How the Central Limit Theorem Connects to Other Ideas

To understand the central limit theorem, you should first be comfortable with sampling distributions and the normal distribution. Once you have a solid grasp of the central limit theorem, you can move on to confidence intervals and hypothesis testing.