Least Squares Regression Line Examples in Math

Start with the recap, study the fully worked examples, then use the practice problems to check your understanding of Least Squares Regression Line.

This page combines explanation, solved examples, and follow-up practice so you can move from recognition to confident problem-solving in Math.

Concept Recap

The unique straight line $\hat{y} = a + bx$ that minimizes the sum of squared vertical distances (residuals) between the observed data points and the line.

You have a scatter plot with points scattered around a general trend. The least-squares regression line (LSRL) is the line that gets as close as possible to all the points simultaneously; it's the 'best' straight line through the cloud. 'Best' means it minimizes the total squared prediction error.
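
In symbols, the recap can be summarized as follows. This is a standard statement of the objective together with the slope and intercept formulas used in the examples on this page, not a full derivation; the name $\text{SSE}$ (sum of squared errors) is a label introduced here for convenience.

```latex
\[
  \text{SSE}(a, b) = \sum_{i=1}^{n} \bigl(y_i - (a + b x_i)\bigr)^2,
  \qquad
  b = r\,\frac{s_y}{s_x},
  \qquad
  a = \bar{y} - b\,\bar{x}.
\]
```

The values of $a$ and $b$ on the right are the ones that make $\text{SSE}(a, b)$ as small as possible.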


How to Use These Examples

  • Read the first worked example with the solution open so the structure is clear.
  • Try the practice problems before revealing each solution.
  • Use the related concepts and background knowledge badges if you feel stuck.

What to Focus On

Core idea: The least-squares regression line is the unique line $\hat{y} = a + bx$ minimizing the total squared vertical distance to the data.

Common stuck point: The procedure for the least-squares regression line is the easy part; the trap is minimizing perpendicular or horizontal distances instead of vertical ones. Asking "Am I fitting a single straight line to two-variable numeric data by minimizing squared vertical distances?" first keeps a correct-looking calculation from being attached to the wrong concept.

Sense of Study hint: Ask: Am I fitting a single straight line to two-variable numeric data by minimizing squared vertical distances?
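
To see why the direction of the distances matters, here is a minimal Python sketch that fits the same five points (the data from Example 1 on this page) twice: once minimizing squared vertical distances and once minimizing squared horizontal distances. The helper name `slope_and_intercept` is hypothetical, chosen only for this illustration.

```python
# Illustrative data: the five points from Example 1.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def slope_and_intercept(us, vs):
    """Least-squares fit of v on u: minimizes squared distances in the v direction."""
    n = len(us)
    u_bar = sum(us) / n
    v_bar = sum(vs) / n
    s_uv = sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))
    s_uu = sum((u - u_bar) ** 2 for u in us)
    slope = s_uv / s_uu
    return slope, v_bar - slope * u_bar

# Vertical distances: regress y on x (this is the LSRL).
b_vertical, a_vertical = slope_and_intercept(xs, ys)

# Horizontal distances: regress x on y, giving x = c + d*y,
# then rewrite that line in the form y = (x - c) / d for comparison.
d, c = slope_and_intercept(ys, xs)
b_horizontal = 1 / d
a_horizontal = -c / d

print(f"vertical fit:   y = {a_vertical:.2f} + {b_vertical:.2f}x")
print(f"horizontal fit: y = {a_horizontal:.2f} + {b_horizontal:.2f}x")
```

For this data the two fits have different slopes, so a calculation that minimizes the wrong distances can look correct and still not be the LSRL.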

Worked Examples

Example 1

medium
Find the least-squares regression line for the points $(x, y)$: $(1,2), (2,4), (3,5), (4,4), (5,5)$. Use $b = r \frac{s_y}{s_x}$ and $a = \bar{y} - b\bar{x}$.

Answer

$\hat{y} = 2.2 + 0.60x$

First step

1
$\bar{x} = 3$, $\bar{y} = 4$; $s_x = \sqrt{2.5} \approx 1.58$; $s_y = \sqrt{1.5} \approx 1.22$
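
The full calculation for Example 1 can be checked with a short Python sketch. It uses only the standard library and follows the formulas stated in the problem, $b = r \frac{s_y}{s_x}$ and $a = \bar{y} - b\bar{x}$; the function name `lsrl` is a hypothetical label for this illustration, and the sample (n - 1) standard deviation is assumed, matching $s_x = \sqrt{2.5}$ above.

```python
import statistics

# Data from Example 1.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def lsrl(xs, ys):
    """Return (a, b) for y-hat = a + b*x, using b = r * s_y / s_x."""
    n = len(xs)
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    s_x = statistics.stdev(xs)  # sample standard deviation (divides by n - 1)
    s_y = statistics.stdev(ys)
    # Sample correlation coefficient r.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = s_xy / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x          # slope
    a = y_bar - b * x_bar      # intercept: the line passes through (x_bar, y_bar)
    return a, b

a, b = lsrl(xs, ys)

# Residuals (observed minus predicted) and their squared sum.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
sse = sum(e ** 2 for e in residuals)

print(f"y-hat = {a:.2f} + {b:.2f}x")
```

Up to rounding, the result should agree with the answer $\hat{y} = 2.2 + 0.60x$ given above, and the residuals should sum to zero.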


Example 2

hard
The LSRL for predicting weight ($y$, kg) from height ($x$, cm) is $\hat{y} = -100 + 0.8x$. Interpret the slope and intercept, predict weight for a height of 175 cm, and explain why extrapolating to a height of 50 cm is problematic.

Example 3

medium
Given $\bar{x} = 4$, $\bar{y} = 20$, $r = 0.8$, $s_x = 2$, $s_y = 5$, find the LSRL.

Example 4

medium
Why is predicting $y$ at an $x$-value far outside the observed range dangerous? Give one example.

Example 5

hard
If $y$ is shifted by adding $5$ to each observation, how does the LSRL change?

Practice Problems

Try these problems on your own first, then open the solution to compare your method.

Example 1

easy
Given $\hat{y} = 5 + 3x$: (a) predict $y$ when $x = 4$, (b) interpret the slope, (c) does the line pass through the origin?

Example 2

hard
The LSRL has the property of minimizing $\sum e_i^2 = \sum (y_i - \hat{y}_i)^2$. Explain why minimizing squared residuals (rather than absolute residuals) is preferred, and name two consequences of this choice.

Example 3

easy
A regression line is $\hat{y} = 3 + 2x$. Predict $y$ when $x = 4$.

Example 4

easy
In $\hat{y} = 5 + 3x$, what is the slope?

Example 5

easy
In $\hat{y} = 7 - 2x$, what is the $y$-intercept?

Example 6

easy
The LSRL minimizes the sum of squared what?

Example 7

easy
Compute the slope: $r = 0.8$, $s_y = 4$, $s_x = 2$. Use $b = r \frac{s_y}{s_x}$.

Example 8

easy
A line passes through the point of averages $(\bar{x}, \bar{y}) = (5, 12)$ with slope $b = 2$. Find the intercept using $a = \bar{y} - b\bar{x}$.

Example 9

easy
In $\hat{y} = 10 + 4x$, where $y$ is cost in dollars and $x$ is hours, interpret the slope.

Example 10

easy
Does the regression of $y$ on $x$ generally give the same line as the regression of $x$ on $y$?

Example 11

medium
Given $r = 0.6$, $s_x = 5$, $s_y = 10$, $\bar{x} = 20$, $\bar{y} = 50$, find the LSRL.

Example 12

medium
Using $\hat{y} = 26 + 1.2x$, predict $y$ at $x = 30$.

Example 13

medium
Two data points lie exactly on $\hat{y} = 2 + 3x$: $(1, ?)$ and $(4, ?)$. Find both predicted values.

Example 14

medium
A model $\hat{y} = 100 + 5x$ predicts plant height (cm) from days $x$. Why is predicting height at $x = 10{,}000$ days unwise?

Example 15

medium
Find the slope of the LSRL from the summary statistics $r = -0.5$, $s_y = 8$, $s_x = 4$.

Example 16

medium
In $\hat{y} = 200 - 0.5x$, $y$ is weight (lb) and $x$ is age in days for a dieting program. Interpret the intercept and say whether it is meaningful.

Example 17

medium
A slope is computed as $b = r \frac{s_y}{s_x}$ with $b = 3$ and $\frac{s_y}{s_x} = 5$. Find $r$.

Example 18

medium
The LSRL is $\hat{y} = 4 + 2x$. A new point $(3, 12)$ is observed. Is the observed value above or below the line?

Example 19

medium
A line passes through $(\bar{x}, \bar{y}) = (8, 30)$ with slope $b = 2.5$. Find its equation.

Example 20

challenge
A dataset has $\bar{x} = 10$, $\bar{y} = 50$, $s_x = 2$, $s_y = 6$, $r = 0.9$. Find the LSRL and predict $y$ at $x = 12$.

Example 21

challenge
Show that if $r = 1$, the regression of $y$ on $x$ and of $x$ on $y$ describe the same line. (Use slopes $b_{yx} = r\frac{s_y}{s_x}$ and $b_{xy} = r\frac{s_x}{s_y}$.)

Example 22

challenge
A regression on temperature ($x$, in $^\circ$C) gives $\hat{y} = 2 + 0.5x$. If temperature is re-expressed in tenths of a degree ($x' = 10x$), what is the new slope?

Example 23

easy
Given $\hat{y} = 8 + 1.5x$, predict $y$ when $x = 6$.

Example 24

easy
Given $r = 0.5$, $s_y = 6$, $s_x = 3$, compute the slope of the LSRL.

Example 25

easy
For the LSRL $\hat{y} = 100 - 2x$ predicting weight (lb) from days $x$ in a diet program, interpret the slope.

Example 26

easy
Residual = observed $-$ predicted. If $y_{\text{obs}} = 15$ and $\hat{y} = 12$, find the residual.

Example 27

medium
A regression $\hat{y} = 50 - 0.2x$ predicts time-on-test (min) from age $x$ (yr). Predict for $x = 25$ and compute the residual if the observed time is $42$.

Example 28

medium
Given the LSRL has slope $b = 0.6$ and passes through $(\bar{x}, \bar{y}) = (12, 25)$, find the equation.

Example 29

medium
In a regression with $r = 0.9$, what does $r^2$ describe?

Example 30

medium
Compute the slope: $r = -0.6$, $s_y = 12$, $s_x = 4$.

Example 31

medium
A residual plot shows a clear curved pattern. What does this indicate?

Example 32

medium
For $\hat{y} = 30 + 4x$, predict the change in $y$ if $x$ increases by $5$.

Example 33

medium
Given five data points $(1,3), (2,5), (3,7), (4,8), (5,10)$, compute $\bar{x}$ and $\bar{y}$.

Example 34

medium
For the LSRL passing through $(\bar{x}, \bar{y}) = (3, 6.6)$ with slope $1.7$, write the equation.

Example 35

hard
For $\hat{y} = 20 + 5x$ in dollars per item ($y$) vs. quantity ($x$), interpret the slope and intercept, and judge whether the intercept is meaningful at $x = 0$.

Example 36

hard
A regression equation rescales $x$ to $x' = x/10$. If the original LSRL is $\hat{y} = 5 + 2x$, what is the new equation in $x'$?

Example 37

hard
For the LSRL $\hat{y} = 4 + 2x$, the sum of residuals over all data points is always what?

Example 38

hard
A regression has slope $b = 3$. If $y$ is rescaled to $y' = 2y$, what is the new slope?

Example 39

hard
For $r = 0.6$, what percent of variation in $y$ is NOT explained by linear regression on $x$?

Example 40

challenge
Given that the LSRL of $y$ on $x$ has slope $b_{yx}$ and the LSRL of $x$ on $y$ has slope $b_{xy}$, show $b_{yx} \cdot b_{xy} = r^2$.

Example 41

challenge
A regression line $\hat{y} = 3 + 2x$ has $r^2 = 0.81$ and $s_x = 4$. Find $s_y$.

Background Knowledge

These ideas may be useful before you work through the harder examples.

correlation, scatter plot, mean, standard deviation