- R-Squared (Coefficient of Determination)
Definition
R-squared (the coefficient of determination) is the proportion of variance in the dependent variable that is explained by the independent variable(s) in a regression model. It ranges from 0 to 1, where 0 means the model explains none of the variability and 1 means it explains all of it.
💡 Intuition
R^2 = 0.80 means the model explains 80% of why Y values differ. The other 20% is unexplained variation. Higher R^2 = better predictions.
🎯 Core Idea
R-squared is the proportion of variability in Y that is explained by the regression model. An R-squared of 0.80 means 80% of the variation is accounted for.
Example
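A minimal worked sketch (the five data points are illustrative, not from any real dataset): fit a least-squares line and compute R^2 from the sums of squares.

```python
import numpy as np

# Hypothetical data for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Fit y = a*x + b by ordinary least squares
a, b = np.polyfit(x, y, 1)
y_hat = a * x + b

ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
ss_res = np.sum((y - y_hat) ** 2)     # residual sum of squares
r_squared = 1 - ss_res / ss_tot

print(round(r_squared, 4))  # 0.9976
```

Here the line explains about 99.8% of the variation in y; the remaining fraction is residual scatter around the fitted line.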
📈 Why It Matters
R^2 is the standard measure of how well a regression model fits data. It helps compare competing models, assess prediction quality, and communicate model performance to non-technical stakeholders.
💭 Hint When Stuck
To compute R-squared, first calculate the total sum of squares SS_{\text{tot}} = \sum(y_i - \bar{y})^2 and the residual sum of squares SS_{\text{res}} = \sum(y_i - \hat{y}_i)^2. Then divide: R^2 = 1 - SS_{\text{res}} / SS_{\text{tot}}. A value near 1 means the model captures most variation; near 0 means it explains very little.
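The two-step recipe above can be wrapped in a small helper (a sketch; the function name is ours). The two edge cases confirm the range: a perfect fit gives 1, and predicting the mean for every point gives 0.

```python
import numpy as np

def r_squared(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot, following the two steps above."""
    ss_tot = np.sum((y - np.mean(y)) ** 2)  # total sum of squares
    ss_res = np.sum((y - y_hat) ** 2)       # residual sum of squares
    return 1 - ss_res / ss_tot

y = np.array([1.0, 2.0, 3.0, 4.0])
print(r_squared(y, y))                          # perfect fit: 1.0
print(r_squared(y, np.full_like(y, y.mean()) ))  # mean-only model: 0.0
```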
Formal View
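Stated formally, consistent with the sums of squares defined above:

```latex
R^2 \;=\; 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}}
      \;=\; 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
```

For simple linear regression with an intercept, R^2 equals the square of the Pearson correlation coefficient r between x and y, which is why confusing the two is easy (see Common Mistakes).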
Related Concepts
See Also
Compare With Similar Concepts
🧠 Common Stuck Point
Students think R-squared tells you whether the model is correct. A high R-squared can result from overfitting or a spurious relationship, so always check residuals too.
⚠️ Common Mistakes
- Thinking R^2 = 1 is always good; a near-perfect R^2 on training data often signals overfitting
- Comparing R^2 values across different datasets or response variables
- Confusing R^2 with the correlation coefficient r
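The first pitfall can be demonstrated directly: with ordinary least squares, training-set R^2 never decreases when you add predictors, even columns of pure noise. A sketch (sample size, noise-column count, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=(n, 1))
y = 2.0 * x[:, 0] + rng.normal(size=n)  # true signal plus noise

def ols_r2(X, y):
    # Add an intercept column and fit by least squares
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

r2_signal = ols_r2(x, y)
# Append 10 columns of pure noise: training R^2 can only go up
noise = rng.normal(size=(n, 10))
r2_padded = ols_r2(np.column_stack([x, noise]), y)

print(r2_padded >= r2_signal)  # True, yet the extra predictors are meaningless
```

This is why adjusted R^2 or out-of-sample evaluation is preferred when comparing models with different numbers of predictors.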
Frequently Asked Questions
What is R-Squared (Coefficient of Determination) in Statistics?
R-squared (the coefficient of determination) is the proportion of variance in the dependent variable that is explained by the independent variable(s) in a regression model. It ranges from 0 to 1, where 0 means the model explains none of the variability and 1 means it explains all of it.
How do you compute R-Squared (Coefficient of Determination)?
To compute R-squared, first calculate the total sum of squares SS_{\text{tot}} = \sum(y_i - \bar{y})^2 and the residual sum of squares SS_{\text{res}} = \sum(y_i - \hat{y}_i)^2. Then divide: R^2 = 1 - SS_{\text{res}} / SS_{\text{tot}}. A value near 1 means the model captures most variation; near 0 means it explains very little.
What do students usually get wrong about R-Squared (Coefficient of Determination)?
Students think R-squared tells you whether the model is correct. A high R-squared can result from overfitting or a spurious relationship, so always check residuals too.
Prerequisites
How R-Squared (Coefficient of Determination) Connects to Other Ideas
To understand R-squared (the coefficient of determination), you should first be comfortable with linear regression and the standard deviation.