Coefficient of Determination Formula
The coefficient of determination, $r^2$, is the proportion of the total variation in the response variable $y$ that is explained by the linear relationship with the explanatory variable $x$.
The Formula
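The ratio described on this page can be written out as below. This is a sketch in one commonly used notation: the symbol names $SS_{\text{res}}$ (residual sum of squares) and $SS_{\text{tot}}$ (total sum of squares) are assumptions and may differ from the labels the page originally displayed. Here $\bar{y}$ is the mean of the observed $y$ values and $\hat{y}_i$ is the value the regression line predicts for observation $i$.

$$
r^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}}, \qquad
SS_{\text{tot}} = \sum_i (y_i - \bar{y})^2, \qquad
SS_{\text{res}} = \sum_i (y_i - \hat{y}_i)^2
$$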
When to use: Total variation in $y$ has two parts: what the regression line explains and what's left over (residual variation). Whatever value $r^2$ takes, the regression line accounts for that fraction of why $y$ values differ from each other, and the remaining $1 - r^2$ is unexplained. Think of $r^2$ as a report card for how well $x$ predicts $y$.
Quick Example
Notation
What This Formula Means
The proportion of the total variation in the response variable $y$ that is explained by the linear relationship with the explanatory variable $x$. It equals the square of the correlation coefficient $r$, hence the notation $r^2$.
Total variation in $y$ has two parts: what the regression line explains and what's left over (residual variation). Whatever value $r^2$ takes, the regression line accounts for that fraction of why $y$ values differ from each other, and the remaining $1 - r^2$ is unexplained. Think of $r^2$ as a report card for how well $x$ predicts $y$.
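The two-part split of variation can be checked numerically. The sketch below is a minimal illustration in plain Python, not this page's own worked example: the five data points are invented, and the names `ss_tot`, `ss_res`, and `ss_exp` are hypothetical labels for the total, residual, and explained sums of squares. It fits a least-squares line by hand, then computes $r^2$ two ways.

```python
from math import sqrt

# Hypothetical data, invented purely for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Building blocks: sums of squared and cross deviations from the means
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

# Least-squares regression line: y_hat = intercept + slope * x
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
y_hat = [intercept + slope * xi for xi in x]

ss_tot = s_yy                                             # total variation in y
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # left over (residual)
ss_exp = sum((fi - y_bar) ** 2 for fi in y_hat)           # explained by the line

r_squared = 1 - ss_res / ss_tot   # proportion of variation explained
r = s_xy / sqrt(s_xx * s_yy)      # correlation coefficient

print(f"explained + residual = {ss_exp + ss_res:.4f}, total = {ss_tot:.4f}")
print(f"r^2 from sums of squares = {r_squared:.4f}")
print(f"r^2 from squaring r      = {r ** 2:.4f}")
```

For a least-squares line with an intercept, the explained and residual parts add up to the total, and the two routes to $r^2$ agree up to floating-point rounding.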
Formal View
Common Mistakes
- Reporting $r$ when the question asks for $r^2$ - square the correlation; $r = 0.7$ gives $r^2 = 0.49$, not 0.7.
- Reading $r^2$ as causation - it measures explained variation, never that $x$ causes $y$.
- Letting $r^2$ go negative or above 1 - it's a proportion between 0 and 1, so any value outside that range is an error.
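The first and third mistakes can be seen by squaring a few correlations. This is a small sketch, assuming a hypothetical helper named `r_squared_from_r`; the correlation values in the loop are chosen only for illustration.

```python
def r_squared_from_r(r: float) -> float:
    """Square a correlation coefficient to get the proportion of variation explained."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return r ** 2


# Hypothetical correlation values, chosen only for illustration
for r in (0.3, 0.5, -0.7, 0.7, 0.9):
    r2 = r_squared_from_r(r)
    print(f"r = {r:+.1f}  ->  r^2 = {r2:.2f}  ({r2:.0%} of variation explained)")
```

Squaring discards the sign, so $r = -0.7$ and $r = 0.7$ describe the same proportion of explained variation, and because $r$ lies between $-1$ and $1$, the squared value always lands between 0 and 1.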
Why This Formula Matters
$r^2$ is the standard one-number report card for a regression's predictive usefulness, and squaring exposes how much weaker a 'decent' correlation really is ($r = 0.7$ explains only 49% of the variation). Mixing it up with $r$ or with causation is what leads people to overstate how much a model actually tells them. Recognizing it by asking "Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign?", rather than by familiar numbers, is what lets a student tell it apart from correlation, slope, and residual variation in a mixed problem set.
Frequently Asked Questions
What is the Coefficient of Determination formula?
The proportion of the total variation in the response variable $y$ that is explained by the linear relationship with the explanatory variable $x$. It equals the square of the correlation coefficient $r$, hence the notation $r^2$.
How do you use the Coefficient of Determination formula?
Total variation in $y$ has two parts: what the regression line explains and what's left over (residual variation). Whatever value $r^2$ takes, the regression line accounts for that fraction of why $y$ values differ from each other, and the remaining $1 - r^2$ is unexplained. Think of $r^2$ as a report card for how well $x$ predicts $y$.
What do the symbols mean in the Coefficient of Determination formula?
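$r^2$ ranges from 0 to 1. The total sum of squares measures the total variation of the $y$ values about their mean. The residual sum of squares measures the variation left over after fitting the regression line.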
Why is the Coefficient of Determination formula important in Math?
$r^2$ is the standard one-number report card for a regression's predictive usefulness, and squaring exposes how much weaker a 'decent' correlation really is ($r = 0.7$ explains only 49% of the variation). Mixing it up with $r$ or with causation is what leads people to overstate how much a model actually tells them. Recognizing it by asking "Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign?", rather than by familiar numbers, is what lets a student tell it apart from correlation, slope, and residual variation in a mixed problem set.
What do students get wrong about Coefficient of Determination?
The procedure for the coefficient of determination is the easy part; the trap is reporting $r$ when the question asks for $r^2$. Asking "Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign?" first is what keeps a correct-looking calculation from being attached to the wrong concept.
What should I learn before the Coefficient of Determination formula?
Before studying the Coefficient of Determination formula, you should understand correlation, linear regression (the least-squares regression line, LSRL), and residuals.