Residuals

Statistics
definition

Also known as: residual, prediction error

Grade 9-12

The difference between an observed value and its predicted value from a regression model: \text{residual} = y - \hat{y} (observed minus predicted). Residuals are how you check whether your model is appropriate.

This concept is covered in depth in our data analysis and residuals tutorial, with worked examples, practice problems, and common mistakes.

Definition

The difference between an observed value and its predicted value from a regression model: \text{residual} = y - \hat{y} (observed minus predicted).

💡 Intuition

A residual is how much the model got wrong for a specific data point. Positive residual means the actual value was higher than predicted; negative means it was lower. If you plot all residuals, the pattern (or lack thereof) tells you whether the model is appropriate.

🎯 Core Idea

A residual plot (residuals vs. predicted values, or vs. x) is the main diagnostic tool for regression. Random scatter around zero suggests the linear model is appropriate. A curved pattern suggests the relationship is not linear. A fan shape suggests non-constant variance.
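As a rough sketch of the fan-shape case, the snippet below fits a least-squares line to made-up data whose scatter grows with x, then compares the typical residual size for small x against large x. The data, the helper name `fit_line`, and the "split in half" comparison are all assumptions for illustration; in practice you would usually just look at the plot.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y_hat = a + b*x (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Made-up data: y is roughly 2x, but the scatter gets wider as x grows.
xs = list(range(1, 11))
ys = [2 * x + (0.5 * x if x % 2 == 1 else -0.5 * x) for x in xs]

a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Average residual size for the lower half of x versus the upper half.
half = len(xs) // 2
spread_low = sum(abs(e) for e in residuals[:half]) / half
spread_high = sum(abs(e) for e in residuals[half:]) / (len(xs) - half)

print(spread_low < spread_high)  # True: residuals fan out as x increases
```

A residual plot of this data would look like a funnel opening to the right, which is the non-constant variance warning sign.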

Example

A regression model predicts that a student who studies 5 hours will score \hat{y} = 76. The actual score is y = 82, so \text{residual} = 82 - 76 = +6. The residual is positive, so the model underpredicted by 6 points.
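The same arithmetic as a minimal Python sketch. The function name `residual` is a hypothetical helper, not part of any library; the numbers are the ones from the example.

```python
def residual(observed, predicted):
    """Residual = observed minus predicted (y - y_hat)."""
    return observed - predicted

e = residual(observed=82, predicted=76)
print(e)  # 6: positive, so the model underpredicted
```

Swapping the arguments would flip the sign, which is why the order (observed first) matters.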

Formula

e_i = y_i - \hat{y}_i

Notation

e_i is the residual for the i-th observation. The residuals from a least-squares regression line (LSRL) that includes an intercept always sum to zero: \sum e_i = 0.

🌟 Why It Matters

Residuals are how you check whether your model is appropriate. The regression equation alone doesn't tell you if the model fits well; the residual plot does.

Formal View

e_i = y_i - \hat{y}_i, where \hat{y}_i = a + bx_i. For the LSRL, \sum_{i=1}^{n} e_i = 0 and \sum_{i=1}^{n} x_i e_i = 0.
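Both identities can be checked numerically. The sketch below fits a least-squares line by hand to a small made-up data set (the hours and scores are invented for illustration, and `fit_lsrl` is a hypothetical helper), then confirms that the two sums are zero up to floating-point rounding.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Made-up data: hours studied and test scores.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 60, 63, 71, 82, 85]

a, b = fit_lsrl(hours, scores)
residuals = [y - (a + b * x) for x, y in zip(hours, scores)]

sum_e = sum(residuals)                                  # sum of e_i
sum_xe = sum(x * e for x, e in zip(hours, residuals))   # sum of x_i * e_i

# Compare against a small tolerance rather than exactly 0, because of rounding.
print(abs(sum_e) < 1e-9, abs(sum_xe) < 1e-9)  # True True
```

These identities hold for any data set fitted this way, so a residual sum far from zero usually signals an arithmetic slip or a line that is not the LSRL.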

🚧 Common Stuck Point

Students compute residuals correctly but don't know how to read residual plots. The key: look for patterns. No pattern = good. Any systematic pattern = problem.

⚠️ Common Mistakes

  • Computing residuals as \hat{y} - y instead of y - \hat{y}. The convention is observed minus predicted.
  • Ignoring the residual plot and only looking at r^2. A high r^2 can still accompany a poorly fitting model if the relationship is curved.
  • Expecting residuals to all be close to zero. Some large residuals are normal; look for patterns, not individual values.
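The second mistake can be illustrated with a small sketch. It fits a straight line to made-up data that is exactly quadratic (y = x^2), which is an assumption chosen to make the effect obvious. The r^2 is high, yet the signs of the residuals trace a clear U shape. The helper name `fit_lsrl` is hypothetical.

```python
def fit_lsrl(xs, ys):
    """Return (a, b, r_squared) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx
    a = y_bar - b * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return a, b, r_squared

# Made-up curved data: y = x^2 for x = 1..10.
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

a, b, r_squared = fit_lsrl(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
signs = ["+" if e > 0 else "-" for e in residuals]

print(round(r_squared, 2))  # about 0.95, which looks like a strong fit
print("".join(signs))       # ++------++  (positive, then negative, then positive)
```

The run of positive, then negative, then positive residuals is the curved pattern that r^2 alone hides.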

Frequently Asked Questions

What are residuals in math?

The difference between an observed value and its predicted value from a regression model: \text{residual} = y - \hat{y} (observed minus predicted).

Why are residuals important?

Residuals are how you check whether your model is appropriate. The regression equation alone doesn't tell you if the model fits well; the residual plot does.

What do students usually get wrong about residuals?

Students compute residuals correctly but don't know how to read residual plots. The key: look for patterns. No pattern = good. Any systematic pattern = problem.

What should I learn before residuals?

Before studying residuals, you should understand linear regression and the least-squares regression line (LSRL).

How Residuals Connects to Other Ideas

To understand residuals, you should first be comfortable with linear regression and the LSRL. Once you have a solid grasp of residuals, you can move on to r^2 and regression inference.
