Coefficient of Determination: Classroom Guide, Examples & Practice

Q: What is the fastest recognition cue for Coefficient of Determination?

Look for **percent of variation explained**, **proportion explained**, **square of correlation**, **goodness of fit**, but treat those words as clues, not proof. A word problem can contain a familiar keyword and still ask for a different idea. After noticing the cue, ask the recognition question: Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign? That question protects you from using a memorized procedure in the wrong place.

Section 1

Quick Answer

The coefficient of determination $r^2$ is the proportion of total variation in $y$ explained by the linear relationship with $x$, equal to the square of the correlation $r$. Use it to report how well the regression line accounts for the spread in $y$. The cue is the phrase 'percent of variation explained': the answer is a number from 0 to 1, never negative, never a slope. Before calculating, ask: Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign?

Section 2

Why This Matters

$r^2$ is the standard one-number report card for a regression's predictive usefulness, and squaring $r$ exposes how much weaker a 'decent' correlation really is ($r=0.7$ explains only 49% of the variation). Mixing it up with $r$ or with causation is what leads people to overstate how much a model actually tells them. Recognizing it by the question "Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign?", rather than by familiar numbers, is what lets a student tell it apart from correlation $r$, slope $b$, and residual variation in a mixed problem set.

Section 3

Intuitive Explanation

Picture a bar showing the total spread of $y$, split into two pieces: the part the line explains and the leftover residual part. $r^2$ is the fraction of the bar taken up by the explained piece. This is the clean version of the idea because the visible structure matches the concept before any formula or procedure is chosen.
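The bar picture can be checked numerically. The following is a minimal sketch in plain Python, assuming a small made-up data set (the five points are hypothetical, not taken from this guide): it fits a least-squares line, splits the total sum of squares into an explained piece and a residual piece, and reports the explained fraction.

```python
# Hypothetical data: five (x, y) pairs invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

def mean(values):
    return sum(values) / len(values)

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

a, b = fit_line(xs, ys)
y_bar = mean(ys)
fitted = [a + b * x for x in xs]

# The whole bar, and its two pieces.
ss_total = sum((y - y_bar) ** 2 for y in ys)
ss_explained = sum((f - y_bar) ** 2 for f in fitted)
ss_residual = sum((y - f) ** 2 for y, f in zip(ys, fitted))

explained_fraction = ss_explained / ss_total
print(f"total={ss_total:.2f} explained={ss_explained:.2f} residual={ss_residual:.2f}")
print(f"r^2 = {explained_fraction:.2f}")
```

For a least-squares line with an intercept, the explained and residual pieces add up to the total, which is why the explained fraction lands between 0 and 1.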

The contrast case is treating $r^2$ as proof that $x$ causes $y$, or confusing it with the slope. $r^2$ only says how much variation is explained; it carries no direction and makes no causal claim. That contrast matters because many wrong answers come from recognizing a surface feature, such as a familiar number or word, instead of the actual task.

A useful way to slow down is to name the signal words and then test them. Words like **percent of variation explained**, **proportion explained**, **square of correlation**, **goodness of fit**, **between 0 and 1** are helpful clues, but they are not enough by themselves. They must point to the same structure as the mental model: $r^2$ is the proportion of variation in $y$ accounted for by the linear relationship with $x$ — the square of the correlation.

The recognition test is simple: Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign? If yes, coefficient of determination is probably the right tool; if not, compare with correlation $r$, slope $b$, or residual variation before calculating.

Core idea

$r^2$ is the proportion of variation in $y$ accounted for by the linear relationship with $x$ — the square of the correlation.

Section 4

When to Use

Use Coefficient of Determination when you need to report what proportion of the variation in $y$ a linear model explains. Strong signals include **percent of variation explained**, **proportion explained**, **square of correlation**, **goodness of fit**, **between 0 and 1**. The safest workflow is to read the final question first, identify what kind of answer it wants, and then test the structure. Do not use coefficient of determination just because familiar numbers appear; first check that the answer to "Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign?" is yes.

✨ Pro tip

Ask: Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign?

Section 5

How to Recognize It

Before using Coefficient of Determination, check the structure of the problem, not just the vocabulary. These questions force the same recognition move from several angles: the task, the signal words, the nearest confusion, and the thing that would make the concept fail.

Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign?

If yes, the problem matches coefficient of determination. If no, pause before applying the procedure, because the same numbers may belong to a different idea.

Which words signal the structure?

Look for percent of variation explained, proportion explained, square of correlation, goodness of fit. These words are useful only after the situation matches them; a keyword without structure is not proof.

What is the nearest confusion?

Correlation $r$ is the common trap here: it carries the sign and direction of association and ranges from $-1$ to $1$, whereas $r^2$ squares $r$ and so drops the sign. Compare the desired final answer before choosing a method.

What answer form should I expect?

The answer should fit this mental model: $r^2$ is the proportion of variation in $y$ accounted for by the linear relationship with $x$, which equals the square of the correlation. If the expected answer sounds more like correlation $r$, use the comparison table before solving.

What would make this NOT Coefficient of Determination?

Treating $r^2$ as proof that $x$ causes $y$, or confusing it with the slope. $r^2$ only says how much variation is explained; it carries no direction and makes no causal claim. This tells you when to switch tools instead of forcing the concept.

Section 6

Coefficient of Determination vs Common Confusions

The hard part is recognizing when the task is really about coefficient of determination instead of a nearby idea. Read the final answer the problem wants, then ask which row describes the structure before you start calculating.

| Type | Meaning | Key test | Formula | Example |
| --- | --- | --- | --- | --- |
| Coefficient of Determination | Use this when you need to report what proportion of the variation in $y$ a linear model explains. | Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign? | $r^2 = 1 - \frac{\text{SS}_{\text{residual}}}{\text{SS}_{\text{total}}} = 1 - \frac{\sum(y_i - \hat{y}_i)^2}{\sum(y_i - \bar{y})^2}$ | A regression of weight on height gives correlation $r=0.75$. Find and interpret $r^2$. |
| Correlation $r$ | Carries the sign and direction of association, ranging from $-1$ to $1$; $r^2$ squares it and drops the sign. | Use $r$ when you need direction (positive/negative) of the relationship. | $r^2=(r)^2$ | $r=-0.9$ vs $r^2=0.81$ |
| Slope $b$ | The rate at which $y$ changes per unit of $x$, carrying units; $r^2$ is a unitless fraction of variation. | Use the slope when predicting the change in $y$ per unit $x$. | $b=r\frac{s_y}{s_x}$ | 0.6 kg per cm |
| Residual variation | The unexplained leftover, equal to $1-r^2$ of the total. | Use when describing what the model fails to capture. | $1-r^2$ | 15% unexplained when $r^2=0.85$ |
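The difference between the slope and $r^2$ shows up clearly when the units of $x$ change. Here is a minimal Python sketch, assuming hypothetical height and weight values invented for illustration: it computes $r$, the slope via $b=r\frac{s_y}{s_x}$, and $r^2$, first with heights in centimetres and then with the same heights in metres.

```python
import math

def summary(xs, ys):
    """Return (r, slope b, r squared) for a simple linear fit of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    s_x = math.sqrt(sxx / (n - 1))   # sample standard deviation of x
    s_y = math.sqrt(syy / (n - 1))   # sample standard deviation of y
    b = r * s_y / s_x                # slope formula from the table
    return r, b, r ** 2

# Hypothetical heights (cm) and weights (kg), invented for illustration.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
weights_kg = [52.0, 58.0, 69.0, 71.0, 80.0]

r_cm, b_cm, r2_cm = summary(heights_cm, weights_kg)

# Same people, heights re-expressed in metres.
heights_m = [h / 100 for h in heights_cm]
r_m, b_m, r2_m = summary(heights_m, weights_kg)

print(f"slope: {b_cm:.2f} kg per cm vs {b_m:.2f} kg per m")
print(f"r^2:   {r2_cm:.4f} vs {r2_m:.4f}")
```

The slope changes by a factor of 100 because it carries units, while $r^2$ stays the same because it is a unitless fraction.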

Section 7

Formula & Notation

$$r^2 = 1 - \frac{\text{SS}_{\text{residual}}}{\text{SS}_{\text{total}}} = 1 - \frac{\sum(y_i - \hat{y}_i)^2}{\sum(y_i - \bar{y})^2}, \qquad \text{where } 0 \leq r^2 \leq 1$$

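How to read it: $r^2$ ranges from 0 to 1. $\text{SS}_{\text{total}}$ is the total sum of squares, the squared deviations of each $y_i$ from the mean $\bar{y}$. $\text{SS}_{\text{residual}}$ is the residual sum of squares, the squared deviations of each $y_i$ from its fitted value $\hat{y}_i$.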
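The formula and the "square of the correlation" description should give the same number for a least-squares line. The sketch below, written in plain Python with hypothetical data and helper names chosen for illustration, computes $r^2$ both ways so the two can be compared.

```python
import math

def mean(values):
    return sum(values) / len(values)

def correlation(xs, ys):
    """Pearson correlation r between xs and ys."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_squared_from_sums(xs, ys):
    """1 - SS_residual / SS_total for the least-squares line of y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    a = y_bar - b * x_bar
    ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_residual / ss_total

# Hypothetical data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.5, 3.1, 2.9, 5.2, 5.8, 7.1]

via_sums = r_squared_from_sums(xs, ys)
via_correlation = correlation(xs, ys) ** 2
print(f"{via_sums:.4f} {via_correlation:.4f}")
```

The two printed values agree up to floating-point rounding, which is the sense in which $r^2$ is both "one minus the unexplained share" and "the square of $r$".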

Section 8

Worked Examples

Example 1 — Interpreting a fit

Easy

Problem

A regression of weight on height gives correlation $r=0.75$. Find and interpret $r^2$.

Solution

1. We want the proportion of variation in weight explained by height, so square the correlation. Name the structure before touching arithmetic; that is what makes the right method obvious.
2. Ask the recognition question: Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign? If the answer is yes, the concept applies; the cue, not a keyword, decides the method.
3. Compute $r^2=(0.75)^2=0.5625$. The rule is chosen only after the structure matches, so the steps mean something.
4. Interpret: about 56% of the variation in weight is explained by the linear relationship with height; about 44% is unexplained. Keep units, shape, or answer form tied to the story so the work does not become symbol pushing.
5. Check the answer against the original question. It should fit the mental model, the fraction of the wiggle the line explains. If it does not, revisit the recognition step before changing the arithmetic.

Answer

$r^2\approx 0.56$

Takeaway: Square the correlation to get the fraction of $y$ 's variation the line explains.
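The arithmetic in Example 1 can be reproduced with a short Python sketch. The helper name `explained_and_unexplained` is hypothetical, chosen here for illustration; it squares a correlation and returns the explained and unexplained fractions.

```python
def explained_and_unexplained(r):
    """Return (r squared, unexplained fraction) for a correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    r_squared = r * r
    return r_squared, 1.0 - r_squared

r_squared, unexplained = explained_and_unexplained(0.75)
# prints "explained: 56.25%, unexplained: 43.75%"
print(f"explained: {r_squared:.2%}, unexplained: {unexplained:.2%}")
```

Rounded to whole percents, this matches the interpretation above: about 56% explained and about 44% unexplained.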

Example 2 — Asking for direction

Standard

Problem

You're told $r^2=0.81$ and asked whether the relationship is positive or negative. Can $r^2$ answer that?

Solution

1. Notice why this looks like the same concept. Nearby language or numbers can tempt you to treat it as a "fraction of the wiggle the line explains" question.
2. Squaring erased the sign: $r^2=0.81$ could come from $r=+0.9$ or $r=-0.9$. Spotting what actually changed is what separates this from the concept it resembles.
3. Go back to $r$ (or the slope's sign) to get direction; $r^2$ alone can't. The nearby idea may share numbers but answers a different question, so it needs a different move.
4. State the result in the language of the actual task. No, $r^2$ can't give direction. Name it for what the problem really asked, not the concept you first expected.
5. Say the contrast in one sentence. $r^2$ measures strength of explained variation only; the sign lives in $r$ or the slope.

Answer

No — $r^2$ can't give direction

Takeaway: $r^2$ measures strength of explained variation only; the sign lives in $r$ or the slope.
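To see the lost sign concretely, here is a minimal Python sketch using hypothetical data: one rising scatter and its mirror image. The two correlations have opposite signs, yet their squares match.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r between xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys_rising = [2.0, 4.0, 5.0, 4.0, 5.0]
ys_falling = [-y for y in ys_rising]  # mirror image: same scatter, opposite direction

r_up = correlation(xs, ys_rising)
r_down = correlation(xs, ys_falling)

print(f"r:   {r_up:+.3f} vs {r_down:+.3f}")
print(f"r^2: {r_up ** 2:.3f} vs {r_down ** 2:.3f}")
```

Given only the shared $r^2$, there is no way to tell which of the two data sets produced it, so direction has to come from $r$ or the slope.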

Example 3 — Spot the trap: The fraction of the wiggle the line explains

Application

Problem

A student reports $r$ when the question asks for $r^2$. What should they check before accepting that answer?

Solution

1. Pause before the first move. The first move is a decision, not a calculation: does the situation really match "the fraction of the wiggle the line explains"?
2. Run the recognition test: Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign? This is the single check that the trap skips.
3. Square the correlation; $r=0.7$ gives $r^2=0.49$, not 0.7. Stating the safer rule turns the mistake into a checkable step instead of a vague "be careful."
4. Compare with the nearest confusion, correlation $r$. It carries the sign and direction of association and ranges from $-1$ to $1$; $r^2$ squares $r$ and so drops the sign.
5. State the corrected decision and reuse it. Using the concept only when the structure matches leaves a process the student can repeat on a new problem.

Answer

Square the correlation; $r=0.7$ gives $r^2=0.49$, not 0.7.

Takeaway: The recognition step prevents the common trap of reporting $r$ when the question asks for $r^2$.

Section 9

Common Mistakes

Common slip-up

Reporting $r$ when the question asks for $r^2$

The right idea

Square the correlation; $r=0.7$ gives $r^2=0.49$, not 0.7.

Common slip-up

Reading $r^2$ as causation

The right idea

It measures explained variation, never that $x$ causes $y$.

Common slip-up

Letting $r^2$ go negative or above 1

The right idea

For a least-squares line, it's a proportion between 0 and 1, so any value outside that range is an error.
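Two of these slips can be caught mechanically. The following Python sketch is one possible sanity check, not a standard routine; the function name `check_reported_r_squared` and its tolerance are assumptions made for illustration. It flags a reported value that falls outside 0 to 1 or that is not the square of the given correlation. It cannot detect a causal misreading, which is a matter of interpretation.

```python
def check_reported_r_squared(r, reported, tol=1e-9):
    """Return a list of problems with a reported r^2, given the correlation r."""
    problems = []
    if not 0.0 <= reported <= 1.0:
        problems.append("r^2 must lie between 0 and 1")
    if abs(reported - r * r) > tol:
        problems.append("reported value is not the square of r")
    return problems

print(check_reported_r_squared(0.7, 0.7))     # reported r instead of r^2
print(check_reported_r_squared(0.7, 0.49))    # correct report
print(check_reported_r_squared(-0.9, -0.81))  # kept the sign by mistake
```

The first call returns one problem, the second returns an empty list, and the third returns both problems.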

Section 10

Mini Practice

Try these on your own.

1. What clue tells you this is a Coefficient of Determination situation: A regression of weight on height gives correlation $r=0.75$. Find and interpret $r^2$.

   Hint: Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign?

2. A regression of weight on height gives correlation $r=0.75$. Find and interpret $r^2$.

   Hint: Compute $r^2=(0.75)^2=0.5625$.

3. Why is this a contrast case instead of Coefficient of Determination: You're told $r^2=0.81$ and asked whether the relationship is positive or negative. Can $r^2$ answer that?

   Hint: Squaring erased the sign: $r^2=0.81$ could come from $r=+0.9$ or $r=-0.9$.

4. Fix this thinking: Reporting $r$ when the question asks for $r^2$.

   Hint: Name the recognition cue before choosing a rule.

5. Which is the better fit here: Coefficient of Determination or Correlation $r$? Explain the deciding difference.

   Hint: For Coefficient of Determination, ask: Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign?

6. Write one sentence that would remind a classmate how to recognize Coefficient of Determination.

   Hint: Use the mental model "The fraction of the wiggle the line explains." and one signal word.

Section 11

Frequently Asked Questions

How do I know when to use Coefficient of Determination?

Use Coefficient of Determination when you need to report what proportion of the variation in $y$ a linear model explains. Do not start from the numbers alone; first name the structure of the situation. The fastest check is: Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign? If the answer is yes and the wording matches cues like percent of variation explained, proportion explained, square of correlation, then coefficient of determination is probably the right tool.

What is Coefficient of Determination most often confused with?

Coefficient of Determination is often confused with Correlation $r$. Correlation $r$ carries the sign and direction of association and ranges from $-1$ to $1$; $r^2$ squares it and so drops the sign. The difference is not just vocabulary; it changes the action you take. For coefficient of determination, the key test is "Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign?" For correlation $r$, the better cue is: use $r$ when you need the direction (positive or negative) of the relationship.

What mistake should I avoid with Coefficient of Determination?

Avoid reporting $r$ when the question asks for $r^2$. That mistake usually happens when the student jumps to a rule before checking the situation. The safer version is: square the correlation; $r=0.7$ gives $r^2=0.49$, not 0.7. A good habit is to say the mental model out loud first: "The fraction of the wiggle the line explains." Then choose the calculation or representation.

How can I tell this apart from Slope $b$?

Slope $b$ is the better fit when the task is about the rate at which $y$ changes per unit of $x$, a quantity that carries units; $r^2$ is a unitless fraction of variation. Coefficient of Determination is the better fit when you need to report what proportion of the variation in $y$ a linear model explains. If both ideas seem possible, compare what the problem wants as the final answer. The desired output often reveals whether you should use coefficient of determination or switch to the nearby concept.

Why does Coefficient of Determination matter?

$r^2$ is the standard one-number report card for a regression's predictive usefulness, and squaring $r$ exposes how much weaker a 'decent' correlation really is ($r=0.7$ explains only 49% of the variation). Mixing it up with $r$ or with causation is what leads people to overstate how much a model actually tells them. The practical value is recognition: once you can spot coefficient of determination, you can choose a method before calculating. That makes later topics easier because you are not memorizing isolated tricks; you are recognizing the same structure when it appears in a new representation.

Section 12

Learning Path

Before: Correlation, Least Squares Regression Line, Residuals

You are here: Coefficient of Determination

Next: Inference for Regression

Before this, students should be comfortable with Correlation and Least Squares Regression Line. This page focuses on the recognition cue: Am I reporting the fraction of $y$'s variation explained by the linear model (a 0-to-1 number), not the slope or the correlation's sign? That cue is the bridge between earlier skills and later problem solving: students first learn to identify the structure, then they learn which calculation, diagram, graph, or proof move belongs to it. After this, Inference for Regression becomes easier to recognize.

Section 13