Correlation Coefficient Formula

The Formula

r = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum(x_i-\bar{x})^2 \sum(y_i-\bar{y})^2}}

How to read it: r = 1 means a perfect positive line, r = −1 means a perfect negative line, and r = 0 means no linear pattern.
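
To make the formula concrete, here is a small worked calculation. The five data points are made up purely for illustration and do not come from any real study; the steps follow the formula above term by term.

```latex
% Illustrative data (an assumption, not from the source):
% (x, y) = (1,2), (2,4), (3,5), (4,4), (5,5)
\begin{aligned}
\bar{x} &= 3, \qquad \bar{y} = 4 \\
\sum (x_i - \bar{x})(y_i - \bar{y}) &= (-2)(-2) + (-1)(0) + (0)(1) + (1)(0) + (2)(1) = 6 \\
\sum (x_i - \bar{x})^2 &= 4 + 1 + 0 + 1 + 4 = 10 \\
\sum (y_i - \bar{y})^2 &= 4 + 0 + 1 + 0 + 1 = 6 \\
r &= \frac{6}{\sqrt{10 \cdot 6}} = \frac{6}{\sqrt{60}} \approx 0.77
\end{aligned}
```

The numerator is positive because above-average x values mostly pair with above-average y values, so r comes out positive.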

Quick Example

Height and weight: r ≈ 0.7, a moderate positive correlation: taller people tend to weigh more.

What This Formula Means

The correlation coefficient (Pearson's r) is a number between −1 and 1 that measures both the strength and direction of the linear relationship between two quantitative variables. A value of 1 indicates a perfect positive linear relationship, −1 a perfect negative linear relationship, and 0 no linear relationship at all.

Formal View

For paired observations (x_i, y_i), Pearson's correlation coefficient is r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \cdot \sum_{i=1}^{n}(y_i - \bar{y})^2}}, where r \in [-1, 1].
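
The definition translates almost line for line into code. The following is a minimal sketch in plain Python, not a reference implementation; the function name `pearson_r` and the sample data are assumptions chosen for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r computed directly from the deviation form of the formula."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")

    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Numerator: sum of products of paired deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations for each variable.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)

    if sxx == 0 or syy == 0:
        # A constant variable makes the denominator zero, so r is undefined.
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Illustrative (made-up) data, not taken from any real study.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))  # 0.7746
```

The explicit check for a zero denominator is a design choice of this sketch: the formula has no value when one variable never varies, so raising an error is safer than returning a misleading number.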

Common Mistakes

  • Assuming r measures nonlinear relationships
  • Confusing correlation with causation
  • Ignoring outliers that inflate or deflate r
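
Two of these mistakes can be seen numerically. The sketch below uses made-up data and a hypothetical helper `pearson_r` (defined inline so the block runs on its own). It shows a perfect but nonlinear relationship that yields r = 0, and a single outlier that pushes a weak correlation close to 1. The causation mistake cannot be shown in code, since it concerns how r is interpreted rather than how it is computed.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the deviation form of the formula (illustrative helper)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Nonlinear relationship: y is completely determined by x (y = x^2),
# yet on x values symmetric around 0 the linear correlation is zero.
par_x = [-3, -2, -1, 0, 1, 2, 3]
par_y = [x ** 2 for x in par_x]
r_parabola = pearson_r(par_x, par_y)

# Outlier: five points with little linear pattern, then one extreme point added.
base_x = [1, 2, 3, 4, 5]
base_y = [3, 1, 4, 2, 3]
r_without = pearson_r(base_x, base_y)
r_with = pearson_r(base_x + [20], base_y + [20])

print(f"parabola:        r = {r_parabola:.2f}")
print(f"without outlier: r = {r_without:.2f}")
print(f"with outlier:    r = {r_with:.2f}")
```

Plotting the data before reporting r is the usual safeguard against both problems.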

Why This Formula Matters

The correlation coefficient is one of the most widely reported statistics in science, social science, medicine, and business, used to quantify how strongly two variables move together.

Frequently Asked Questions

What is the Correlation Coefficient formula?

The correlation coefficient (Pearson's r) is a number between −1 and 1 that measures both the strength and direction of the linear relationship between two quantitative variables. A value of 1 indicates a perfect positive linear relationship, −1 a perfect negative linear relationship, and 0 no linear relationship at all.

How do you use the Correlation Coefficient formula?

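Compute r from paired data, then read the result: r = 1 means a perfect positive line, r = −1 means a perfect negative line, and r = 0 means no linear pattern. Values closer to 0 indicate a weaker linear relationship.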

Why is the Correlation Coefficient formula important in Statistics?

The correlation coefficient is one of the most widely reported statistics in science, social science, medicine, and business, used to quantify how strongly two variables move together.

What do students get wrong about Correlation Coefficient?

Correlation does not imply causation: two variables can be correlated because both depend on a third factor, or simply by coincidence, without one causing the other.

What should I learn before the Correlation Coefficient formula?

Before studying the Correlation Coefficient formula, you should understand two topics: an introduction to correlation and the line of best fit.