R-Squared (Coefficient of Determination)

Model Assessment
definition

Grade 9-12

R-squared (the coefficient of determination) is the proportion of variance in the dependent variable that is explained by the independent variable(s) in a regression model. R^2 is the standard measure of how well a regression model fits data.

Definition

R-squared (the coefficient of determination) is the proportion of variance in the dependent variable that is explained by the independent variable(s) in a regression model. It ranges from 0 to 1, where 0 means the model explains none of the variability and 1 means it explains all of it.

💡 Intuition

R^2 = 0.80 means the model explains 80% of why Y values differ. The other 20% is unexplained variation. A higher R^2 means the model fits the observed data more closely.

🎯 Core Idea

R-squared is the proportion of variability in Y that is explained by the regression model. An R-squared of 0.80 means 80% of the variation is accounted for.

Example

Height explains 70% of weight variation (R^2 = 0.70). The remaining 30% is due to other factors like diet and muscle mass.

🌟 Why It Matters

R^2 is the standard measure of how well a regression model fits data. It helps compare competing models, assess prediction quality, and communicate model performance to non-technical stakeholders.

💭 Hint When Stuck

To compute R-squared, first calculate the total sum of squares SS_{\text{tot}} = \sum(y_i - \bar{y})^2 and the residual sum of squares SS_{\text{res}} = \sum(y_i - \hat{y}_i)^2. Then divide: R^2 = 1 - SS_{\text{res}} / SS_{\text{tot}}. A value near 1 means the model captures most variation; near 0 means it explains very little.
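The steps above can be sketched in a few lines of Python (the observed and predicted values below are made up purely for illustration):

```python
# Compute R^2 = 1 - SS_res / SS_tot from observed and predicted values.

def r_squared(y, y_hat):
    """Coefficient of determination."""
    y_bar = sum(y) / len(y)
    # Total sum of squares: variation of y around its mean
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    # Residual sum of squares: variation left over after the model's predictions
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1 - ss_res / ss_tot

y = [2.0, 4.0, 6.0, 8.0]        # observed values (illustrative)
y_hat = [2.2, 3.8, 6.1, 7.9]    # predictions from some fitted model (illustrative)
print(round(r_squared(y, y_hat), 4))  # → 0.995
```

Here SS_tot = 20 and SS_res = 0.10, so the model leaves only 0.5% of the variation unexplained.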

Formal View

Given observed values y_i, predicted values \hat{y}_i, and mean \bar{y}, the coefficient of determination is R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}} = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}, where 0 \leq R^2 \leq 1 for models with an intercept.

🚧 Common Stuck Point

Students think R-squared tells you if the model is correct. A high R-squared can result from overfitting or a spurious relationship; always check residuals too.
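This stuck point can be demonstrated directly: a model flexible enough to pass through every data point earns R^2 = 1 even when the data is pure noise. A minimal sketch, interpolating a cubic through four arbitrary "noise" points:

```python
# Overfitting demo: a degree-(n-1) polynomial through n points fits them
# exactly, so R^2 = 1 regardless of whether the data has any real pattern.

def lagrange_fit(xs, ys):
    """Return a function that interpolates exactly through all (x, y) points."""
    def f(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return f

def r_squared(y, y_hat):
    y_bar = sum(y) / len(y)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1 - ss_res / ss_tot

xs = [0.0, 1.0, 2.0, 3.0]
ys = [5.3, -1.7, 4.2, 0.9]              # arbitrary values with no real pattern
model = lagrange_fit(xs, ys)             # cubic through all four points
preds = [model(x) for x in xs]
print(round(r_squared(ys, preds), 6))    # → 1.0: "perfect" fit, but meaningless
```

The fit is perfect on these points yet would predict new points badly, which is why R^2 alone cannot certify a model.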

โš ๏ธ Common Mistakes

  • Thinking R^2 near 1 is always good (a near-perfect fit can signal overfitting)
  • Comparing R^2 values across models fit to different datasets
  • Confusing R^2 with the correlation coefficient r
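The last mistake can be checked directly: for simple linear regression with an intercept, R^2 equals the square of the Pearson correlation r, even though the two answer different questions (and the equality does not carry over to multiple regression). A sketch on made-up data:

```python
# Fit y = b0 + b1*x by least squares, then compare R^2 with r^2.
from math import sqrt

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]          # illustrative, roughly linear data

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
syy = sum((yi - y_bar) ** 2 for yi in y)

# Least-squares slope and intercept
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

r2 = 1 - sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat)) / syy
r = sxy / sqrt(sxx * syy)                # Pearson correlation

print(abs(r2 - r ** 2) < 1e-12)          # → True: identical for this model
```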

Frequently Asked Questions

What is R-Squared (Coefficient of Determination) in Statistics?

R-squared (the coefficient of determination) is the proportion of variance in the dependent variable that is explained by the independent variable(s) in a regression model. It ranges from 0 to 1, where 0 means the model explains none of the variability and 1 means it explains all of it.

When do you use R-Squared (Coefficient of Determination)?

Use R-squared whenever you need to judge how well a regression model fits its data or to compare competing models fit to the same dataset. To compute it, first calculate the total sum of squares SS_{\text{tot}} = \sum(y_i - \bar{y})^2 and the residual sum of squares SS_{\text{res}} = \sum(y_i - \hat{y}_i)^2. Then divide: R^2 = 1 - SS_{\text{res}} / SS_{\text{tot}}. A value near 1 means the model captures most variation; near 0 means it explains very little.

What do students usually get wrong about R-Squared (Coefficient of Determination)?

Students think R-squared tells you if the model is correct. A high R-squared can result from overfitting or a spurious relationship; always check residuals too.

How R-Squared (Coefficient of Determination) Connects to Other Ideas

To understand R-squared (the coefficient of determination), you should first be comfortable with linear regression and the standard deviation.