Correlation Coefficient

Also known as: r, Pearson's r, r-value

Grade 9-12

Definition

The correlation coefficient (Pearson's r) is a number between −1 and 1 that measures both the strength and direction of the linear relationship between two quantitative variables. A value of 1 indicates a perfect positive linear relationship, −1 a perfect negative linear relationship, and 0 no linear relationship at all.

💡 Intuition

r = 1 means the points fall exactly on an upward-sloping line, r = −1 means they fall exactly on a downward-sloping line, and r = 0 means there is no linear pattern.

🎯 Core Idea

The correlation coefficient quantifies only linear relationships; nonlinear patterns can have r ≈ 0.
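
A minimal sketch of this idea, using made-up data and a hypothetical helper named pearson_r: y = x² on x-values symmetric around zero is a perfect nonlinear relationship, yet its r comes out as 0, while a straight line through the same x-values gives r = 1.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data only: x-values symmetric around zero.
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # exact straight line
parabola = [x ** 2 for x in xs]    # exact curve, but not a line

r_linear = pearson_r(xs, linear)
r_parabola = pearson_r(xs, parabola)
print(r_linear)    # 1.0
print(r_parabola)  # 0.0
```

The parabola is completely determined by x, so r = 0 here says only that there is no linear trend, not that the variables are unrelated.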

Example

Height and weight: r ≈ 0.7, a moderate positive correlation — taller people tend to weigh more.

Formula

r = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum(x_i-\bar{x})^2 \sum(y_i-\bar{y})^2}}
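
As a rough illustration, the formula can be evaluated piece by piece in Python. The five data pairs below are hypothetical and chosen only to keep the arithmetic easy to check by hand.

```python
import math

# Hypothetical paired data, chosen only to keep the arithmetic easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n  # x-bar = 3.0
mean_y = sum(ys) / n  # y-bar = 4.0

# Numerator: sum of products of paired deviations from the means.
numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Denominator: square root of the product of the two sums of squared deviations.
sum_sq_x = sum((x - mean_x) ** 2 for x in xs)
sum_sq_y = sum((y - mean_y) ** 2 for y in ys)
denominator = math.sqrt(sum_sq_x * sum_sq_y)

r = numerator / denominator
print(numerator, sum_sq_x, sum_sq_y)  # 6.0 10.0 6.0
print(round(r, 4))                    # 0.7746
```

Here r = 6 / √60 ≈ 0.77, a positive value because larger x-values tend to pair with larger y-values in this small data set.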

🌟 Why It Matters

The correlation coefficient is one of the most widely reported statistics in science, social science, medicine, and business, used to quantify how strongly two variables move together.

💭 Hint When Stuck

To compute Pearson's r, first standardize each variable by subtracting its mean and dividing by its sample standard deviation (the version with n − 1 in the denominator). Then multiply corresponding z-scores, sum the products, and divide by n − 1, where n is the number of paired observations. Check the sign (positive or negative) for direction and the magnitude (closer to 1 or −1 means stronger) for strength.
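
The z-score steps can be sketched in Python as follows. This assumes the sample standard deviation (n − 1 in the denominator), which is what statistics.stdev computes; the function name pearson_r_zscores and the data are hypothetical.

```python
import statistics

def pearson_r_zscores(xs, ys):
    """Pearson's r as the sum of z-score products divided by n - 1."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    # statistics.stdev is the sample standard deviation (divides by n - 1).
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    zx = [(x - mean_x) / sd_x for x in xs]
    zy = [(y - mean_y) / sd_y for y in ys]
    return sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Hypothetical paired data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r_zscores(xs, ys)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(round(r, 4), direction)  # 0.7746 positive
```

Mixing conventions, for example standardizing with the population standard deviation (dividing by n) but still dividing the final sum by n − 1, would give a value slightly different from r.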

Formal View

For paired observations (x_i, y_i), Pearson's correlation coefficient is r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \cdot \sum_{i=1}^{n}(y_i - \bar{y})^2}}, where r \in [-1, 1].
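
A short derivation, in the same notation, of why the z-score recipe from the hint agrees with this formula. It assumes s_x and s_y denote sample standard deviations (with n − 1 in the denominator); these symbols are introduced here for illustration.

```latex
s_x = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}}, \qquad
s_y = \sqrt{\frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{n-1}}

\frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)
= \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{(n-1)\, s_x s_y}
= \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \cdot \sum_{i=1}^{n}(y_i - \bar{y})^2}} = r
```

The last step uses (n − 1) s_x s_y = \sqrt{\sum(x_i - \bar{x})^2 \cdot \sum(y_i - \bar{y})^2}, so the factors of n − 1 cancel.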

🚧 Common Stuck Point

Correlation does not imply causation: two variables can be correlated because a third (lurking) variable influences both, or simply by coincidence.

⚠️ Common Mistakes

  • Assuming r measures nonlinear relationships
  • Confusing correlation with causation
  • Ignoring outliers that inflate or deflate r
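
The effect of an outlier can be seen in a small sketch with made-up data and a hypothetical helper named pearson_r. Six points with a weak negative trend are joined by one extreme point, and r jumps to a value near 1.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data only: six points with a weak negative trend.
xs = [1, 2, 3, 4, 5, 6]
ys = [3, 1, 2, 3, 1, 2]
r_without = pearson_r(xs, ys)

# Add a single extreme point far from the rest of the data.
r_with = pearson_r(xs + [20], ys + [20])

print(round(r_without, 2))  # -0.24
print(round(r_with, 2))     # 0.95
```

A scatterplot would show that the high r comes almost entirely from the one added point, which is why it helps to plot the data before interpreting r.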

Frequently Asked Questions

What is the correlation coefficient in statistics?

The correlation coefficient (Pearson's r) is a number between −1 and 1 that measures both the strength and direction of the linear relationship between two quantitative variables. A value of 1 indicates a perfect positive linear relationship, −1 a perfect negative linear relationship, and 0 no linear relationship at all.

What is the correlation coefficient formula?

r = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum(x_i-\bar{x})^2 \sum(y_i-\bar{y})^2}}

When do you use the correlation coefficient?

Use Pearson's r when you want to quantify the strength and direction of the linear relationship between two quantitative variables. To compute it, standardize each variable by subtracting its mean and dividing by its sample standard deviation, multiply corresponding z-scores, sum the products, and divide by n − 1. The sign (positive or negative) gives the direction, and the magnitude (closer to 1 or −1 means stronger) gives the strength.

How the Correlation Coefficient Connects to Other Ideas

To understand the correlation coefficient, you should first be comfortable with the introduction to correlation and the line of best fit. Once you have a solid grasp of the correlation coefficient, you can move on to linear regression.