Correlation Formula

The Formula

r ranges from -1 to +1

When to use: to check whether two quantities rise and fall together. r = +1 means they move perfectly together along a line; r = -1 means they move in perfectly opposite directions.

Quick Example

Height and weight: r \approx 0.7 (positive). Temperature and heating bill: r < 0 (negative).

Notation

r is the sample correlation coefficient; \rho (rho) is the population correlation coefficient.

What This Formula Means

Correlation measures the strength and direction of the linear relationship between two quantitative variables, ranging from -1 to +1.

Do two things go up and down together? r = +1 means perfectly together, r = -1 means perfectly opposite.

Formal View

r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right) where -1 \leq r \leq 1
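
As a minimal sketch of the standardized-product form above, the following Python function computes r with only the standard library. The name `pearson_r` and the error handling are assumptions made for illustration; they are not part of the formula itself.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation via the average product of z-scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations s_x and s_y (divide by n - 1).
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    # Sum the products of z-scores, then divide by n - 1.
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)
```

Points that lie exactly on a rising line, such as `pearson_r([1, 2, 3], [2, 4, 6])`, should give a value of 1 up to floating-point rounding; points on a falling line should give -1.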

Worked Examples

Example 1

Difficulty: medium
Given five data points (1,2), (2,4), (3,5), (4,4), (5,5), compute the Pearson correlation coefficient r.

Solution

  1. Compute means: \bar{x} = 3, \bar{y} = 4.
  2. Compute \sum(x_i - \bar{x})(y_i - \bar{y}): (-2)(-2) + (-1)(0) + (0)(1) + (1)(0) + (2)(1) = 4 + 0 + 0 + 0 + 2 = 6.
  3. Compute \sum(x_i - \bar{x})^2 = 4 + 1 + 0 + 1 + 4 = 10 and \sum(y_i - \bar{y})^2 = 4 + 0 + 1 + 0 + 1 = 6.
  4. r = \frac{6}{\sqrt{10 \times 6}} = \frac{6}{\sqrt{60}} = \frac{6}{7.746} \approx 0.775.

Answer

r \approx 0.775
The Pearson correlation coefficient r ranges from -1 to 1. A value of 0.775 indicates a strong positive linear relationship between x and y.
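
The arithmetic in this example can be checked with a short Python sketch. It uses the sum-of-products shortcut r = S_xy / \sqrt{S_xx S_yy}, which agrees with the z-score definition because the n - 1 factors cancel. The variable names `s_xy`, `s_xx`, and `s_yy` are labels chosen here for illustration.

```python
from math import sqrt

points = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]
xs = [x for x, _ in points]
ys = [y for _, y in points]
n = len(points)

# Step 1: means.
x_bar = sum(xs) / n  # 3.0
y_bar = sum(ys) / n  # 4.0

# Step 2: sum of products of deviations.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)  # 6.0

# Step 3: sums of squared deviations.
s_xx = sum((x - x_bar) ** 2 for x in xs)  # 10.0
s_yy = sum((y - y_bar) ** 2 for y in ys)  # 6.0

# Step 4: correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)
print(round(r, 3))  # 0.775
```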

Example 2

Difficulty: easy
A study finds r = -0.85 between hours of TV watched per day and exam scores. Interpret this value.

Common Mistakes

  • Concluding that correlation implies causation: two variables can correlate because of a lurking third variable
  • Assuming r = 0 means no relationship at all: it means no LINEAR relationship; a strong curved relationship can have r \approx 0
  • Interpreting r = 0.5 as 'halfway to perfect correlation': r^2 = 0.25, so only 25\% of the variation in one variable is explained by its linear relationship with the other
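
The second and third mistakes above can be illustrated with a small Python sketch. The data are made up for illustration: y = x^2 on x values symmetric about zero is a perfect curved relationship, yet its linear correlation comes out at zero. The helper name `pearson_r` is an assumption, not something defined in the text.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r using sums of deviations from the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Illustrative data: y is completely determined by x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
r_curved = pearson_r(xs, ys)
print(abs(r_curved) < 1e-12)  # True: no linear association despite a perfect curve

# r = 0.5 corresponds to r^2 = 0.25, not to "half" of a perfect fit.
r_half = 0.5
print(r_half ** 2)  # 0.25
```

Plotting the data before computing r is the usual way to catch a curved pattern like this one.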

Why This Formula Matters

Correlation is the first tool for exploring relationships between variables, but it only captures linear association, not causation or nonlinear patterns.

Frequently Asked Questions

What is the Correlation formula?

Correlation measures the strength and direction of the linear relationship between two quantitative variables, ranging from -1 to +1.

How do you use the Correlation formula?

Use it to check whether two quantitative variables rise and fall together. r = +1 means they move perfectly together along a line; r = -1 means they move in perfectly opposite directions.

What do the symbols mean in the Correlation formula?

r is the sample correlation coefficient; \rho (rho) is the population correlation coefficient.

Why is the Correlation formula important in Math?

Correlation is the first tool for exploring relationships between variables, but it only captures linear association, not causation or nonlinear patterns.

What do students get wrong about Correlation?

Correlation does not imply causation. Ice cream sales and drownings both correlate with summer.

What should I learn before the Correlation formula?

Before studying the Correlation formula, you should understand: mean, standard deviation.