Line of Best Fit Examples in Statistics

Start with the recap, study the fully worked examples, then use the practice problems to check your understanding of Line of Best Fit.

This page combines explanation, solved examples, and follow-up practice so you can move from recognition to confident problem-solving in Statistics.

Concept Recap

The line of best fit (trend line) is the straight line that best represents the overall trend in a scatter plot by minimizing the sum of squared vertical distances between the line and all data points. Its equation enables predictions for new x-values.
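
In symbols, the recap above can be written as follows. This is a sketch in the notation used later on this page ($b$ for the slope and $a$ for the intercept, as in $\hat{y} = bx + a$). The intercept formula is the standard least-squares result, stated here as a reminder rather than derived.

$$
\text{minimize } \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad \hat{y}_i = b x_i + a
$$

$$
b = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sum (x_i-\bar{x})^2}, \qquad a = \bar{y} - b\bar{x}
$$

The second formula for $a$ is another way of saying that the least-squares line always passes through the mean point $(\bar{x}, \bar{y})$.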

If you stretched a rubber band through a scatter plot to be as close to all points as possible, that's the line of best fit. It captures the overall trend.
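
To make "as close to all points as possible" concrete, here is a minimal Python sketch that fits a least-squares line and compares its sum of squared residuals with that of another candidate line. The data values and the function names `fit_line` and `sum_squared_residuals` are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line for the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # forces the line through the mean point
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, not taken from any problem on this page.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 9]

slope, intercept = fit_line(xs, ys)  # slope is about 2.2, intercept about -0.5
ssr_fit = sum_squared_residuals(xs, ys, slope, intercept)

# Another plausible-looking line, y = 2x, for comparison.
ssr_other = sum_squared_residuals(xs, ys, 2.0, 0.0)

print(f"fitted line: slope={slope:.2f}, intercept={intercept:.2f}")
print(f"SSR of fitted line: {ssr_fit:.2f}, SSR of y = 2x: {ssr_other:.2f}")
```

Any other straight line drawn through these points has a sum of squared residuals at least as large as the fitted line's, which is what "best fit" means here.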


How to Use These Examples

  • Read the first worked example with the solution open so the structure is clear.
  • Try the practice problems before revealing each solution.
  • Use the related concepts and background knowledge badges if you feel stuck.

What to Focus On

Core idea: A line of best fit describes how two variables measured on the same cases are related, summarizing the overall pattern with a single straight line.

Common stuck point: Students often know a procedure related to line of best fit but skip the recognition step: Am I studying a relationship between variables, and have I separated association from causation? That leads to a calculation or graph that looks reasonable but answers a different question.

Sense of Study hint: Ask: Am I studying a relationship between variables, and have I separated association from causation?

Worked Examples

Example 1

medium
Mean point is $(\bar{x},\bar{y}) = (4, 11)$ and slope is $-2$. Write the line of best fit.

Answer

$\hat{y} = -2x + 19$

First step

1. Use $\hat{y} = mx + b$ and the mean point.
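
One way to carry that step through, relying on the fact that a least-squares line passes through the mean point: substitute $(\bar{x}, \bar{y}) = (4, 11)$ and $m = -2$ into $\hat{y} = mx + b$ and solve for $b$.

$$
11 = (-2)(4) + b \quad\Longrightarrow\quad b = 11 + 8 = 19 \quad\Longrightarrow\quad \hat{y} = -2x + 19
$$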

Example 2

medium
Trend line for study hours $x$ vs. test score $\hat{y}$ is $\hat{y} = 8x + 50$. Interpret the slope in context.

Example 3

medium
A scatter plot of weight (lb) vs. age (yr) for puppies gives $\hat{y} = 5x + 2$. Predict the weight at age $6$ years and explain whether you should trust the prediction.
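
A small Python sketch of the two parts of this question: computing the prediction and checking whether the input lies inside the range of the data. The problem does not state the ages in the scatter plot, so the range below is a loudly hypothetical assumption used only to show the check.

```python
def predict_weight(age_years):
    """Trend line from the problem: weight (lb) = 5 * age (yr) + 2."""
    return 5 * age_years + 2


# ASSUMPTION: the problem gives no age range. Because the data are for
# puppies, this sketch supposes ages between 0 and 1 year (hypothetical).
observed_age_range = (0.0, 1.0)

age = 6
prediction = predict_weight(age)
low, high = observed_age_range
within_range = low <= age <= high

print(f"predicted weight at age {age}: {prediction} lb")
print(f"inside the assumed data range: {within_range}")
```

If the age falls outside the range the line was fitted on, the prediction is an extrapolation, and the linear trend may no longer hold.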

Example 4

hard
Data: $(1,2),(2,3),(3,5),(4,4),(5,6)$. Use $\bar{x}=3,\bar{y}=4$ and slope formula $b = \frac{\sum(x_i-\bar{x})(y_i-\bar{y})}{\sum(x_i-\bar{x})^2}$ to find the line.
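
The slope formula can be checked numerically. The sketch below applies it to the five data points from this example. Taking the intercept as $a = \bar{y} - b\bar{x}$ is the standard least-squares choice and is an assumption here, since the problem statement gives only the slope formula.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 5, 4, 6]

x_bar = sum(xs) / len(xs)  # 3.0, matches the given mean
y_bar = sum(ys) / len(ys)  # 4.0, matches the given mean

# Numerator: sum of (x_i - x_bar)(y_i - y_bar) = 4 + 1 + 0 + 0 + 4 = 9
numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
# Denominator: sum of (x_i - x_bar)^2 = 4 + 1 + 0 + 1 + 4 = 10
denominator = sum((x - x_bar) ** 2 for x in xs)

b = numerator / denominator  # slope
a = y_bar - b * x_bar        # intercept, so the line passes through (3, 4)

print(f"y-hat = {b:.1f}x + {a:.1f}")  # prints "y-hat = 0.9x + 1.3"
```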

Example 5

easy
A scatter plot shows the relationship between hours studied (x) and test score (y). The data points generally trend upward. A line of best fit has equation $y = 5x + 40$. (a) Interpret the slope. (b) Predict the score for a student who studies 8 hours.

Example 6

medium
Given five data points: (1,3), (2,5), (3,6), (4,8), (5,11). Estimate the line of best fit by finding the slope using the first and last points, then adjust to pass through the centroid $(\bar{x}, \bar{y})$.
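
The estimation procedure described here can be sketched in Python as follows. It is a quick approximation, not the least-squares calculation, and the variable names are illustrative.

```python
points = [(1, 3), (2, 5), (3, 6), (4, 8), (5, 11)]

# Step 1: slope from the first and last points.
(x_first, y_first), (x_last, y_last) = points[0], points[-1]
slope = (y_last - y_first) / (x_last - x_first)  # (11 - 3) / (5 - 1) = 2.0

# Step 2: centroid (mean point) of all five points.
x_bar = sum(x for x, _ in points) / len(points)  # 3.0
y_bar = sum(y for _, y in points) / len(points)  # 6.6

# Step 3: choose the intercept so the line passes through the centroid.
intercept = y_bar - slope * x_bar

print(f"estimated line: y = {slope:.1f}x + {intercept:.1f}")  # y = 2.0x + 0.6
```

For comparison, the exact least-squares slope for these five points is 1.9, so the two-point estimate of 2 is close but not identical.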

Practice Problems

Try these problems on your own first, then open the solution to compare your method.
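
If you want to check your arithmetic after attempting a problem, a few small helpers like the ones below may be useful. This is a minimal sketch: the function names are hypothetical, and the sample values are made up rather than taken from any problem on this page.

```python
def predict(slope, intercept, x):
    """Predicted value y-hat = slope * x + intercept."""
    return slope * x + intercept


def line_through_points(p, q):
    """Return (slope, intercept) of the line through two points p and q."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def line_through_point_with_slope(point, slope):
    """Return (slope, intercept) of the line with this slope through point."""
    x0, y0 = point
    return slope, y0 - slope * x0


def slope_from_summary(r, s_x, s_y):
    """Least-squares slope from the correlation and standard deviations."""
    return r * s_y / s_x


# Made-up sample values, for illustration only.
print(predict(2, 1, 4))                            # 2 * 4 + 1 = 9
print(line_through_points((1, 3), (4, 9)))         # slope 2.0, intercept 1.0
print(line_through_point_with_slope((2, 7), 3))    # slope 3, intercept 1
print(slope_from_summary(0.6, 2, 5))               # 0.6 * 5 / 2, about 1.5
```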

Example 1

easy
A line of best fit has equation $\hat{y} = 2x + 5$. Predict $\hat{y}$ when $x = 3$.

Example 2

easy
A line of best fit is $\hat{y} = -x + 10$. Find the predicted value at $x = 4$.

Example 3

easy
The line of best fit $\hat{y} = 3x + 1$ has what slope?

Example 4

easy
The line of best fit $\hat{y} = 4x - 7$ has what y-intercept?

Example 5

easy
A trend line is $\hat{y} = 0.5x + 2$. Predict $\hat{y}$ at $x = 10$.

Example 6

easy
As x increases, the line of best fit $\hat{y} = 2x + 3$ predicts y to do what?

Example 7

easy
The line of best fit minimizes what quantity?

Example 8

easy
Trend line $\hat{y} = -2x + 20$. Predict $\hat{y}$ at $x = 5$.

Example 9

medium
Two points on a line of best fit are $(0, 4)$ and $(2, 10)$. Find its equation.

Example 10

medium
A line of best fit passes through the mean point $(\bar{x}, \bar{y}) = (5, 12)$ with slope $2$. Find its equation.

Example 11

medium
Trend line $\hat{y} = 1.5x + 4$. By how much does $\hat{y}$ change when x increases by 6?

Example 12

medium
A line of best fit predicts $\hat{y}=24$ at $x=8$ and $\hat{y}=30$ at $x=12$. Find the slope.

Example 13

medium
The line of best fit is $\hat{y}=2x+5$. At $x=4$ the observed value is 15. Is the prediction an over- or under-estimate?

Example 14

medium
A trend line $\hat{y}=-3x+40$ models temperature drop. Predict $\hat{y}$ at $x=7$ and state the trend direction.

Example 15

medium
Why is fitting a line of best fit inappropriate for a clearly U-shaped (curved) scatter plot?

Example 16

medium
Line of best fit $\hat{y}=5x+3$ is built from data with x ranging 1 to 10. Why is predicting at $x=50$ risky?

Example 17

medium
A line of best fit $\hat{y}=2x+1$ is fitted, but one extreme outlier at $(10, 100)$ pulls the line up. What problem does this illustrate?

Example 18

challenge
A line of best fit passes through $(\bar{x},\bar{y})=(4,10)$ and has slope $b = r\frac{s_y}{s_x}$ with $r=0.8$, $s_y=6$, $s_x=4$. Find the full equation $\hat{y}=bx+a$.

Example 19

challenge
Two lines are proposed for data $(1,2),(2,2),(3,5)$: line A $\hat{y}=x$ and line B $\hat{y}=1.5x-0.5$. Which has the smaller sum of squared residuals?

Example 20

challenge
A line of best fit $\hat{y}=mx+b$ gives $\hat{y}=14$ at $x=2$ and $\hat{y}=26$ at $x=5$. Predict $\hat{y}$ at $x=8$.

Example 21

easy
A trend line is $\hat{y} = 0.8x + 4$. Predict $\hat{y}$ when $x = 5$.

Example 22

easy
Identify the slope of the line of best fit $\hat{y} = -1.5x + 12$.

Example 23

easy
Trend line $\hat{y} = 6 - 0.5x$. Predict $\hat{y}$ at $x = 8$.

Example 24

easy
What is the y-intercept of $\hat{y} = 7x - 15$?

Example 25

easy
Trend line $\hat{y} = 5x$. Predict $\hat{y}$ at $x = 3$.

Example 26

medium
A trend line passes through $(1, 5)$ and $(6, 20)$. Find its equation.

Example 27

medium
Trend line $\hat{y} = 0.4x + 8$. By how much does $\hat{y}$ change when $x$ increases by $25$?

Example 28

medium
A trend line gives $\hat{y} = 18$ at $x = 4$ and $\hat{y} = 6$ at $x = 10$. Find the slope.

Example 29

medium
A trend line crosses the y-axis at $14$ and has slope $-3$. Write the equation.

Example 30

medium
Trend line $\hat{y} = 1.2x + 5$ models price vs. demand. Which value of $x$ gives predicted $\hat{y} = 23$?

Example 31

medium
Trend line $\hat{y} = 2x - 5$. What is $\hat{y}$ at $x = 0$?

Example 32

hard
A regression line uses summary stats $\bar{x}=10$, $\bar{y}=40$, $r=0.5$, $s_y=12$, $s_x=4$. Find the line of best fit.

Example 33

hard
Why does the least-squares line minimize $\sum (y_i - \hat{y}_i)^2$ rather than $\sum (y_i - \hat{y}_i)$?

Example 34

hard
Adding a single outlier far from the rest can change the line of best fit a lot. What property of least-squares causes this sensitivity?

Example 35

hard
A line of best fit has slope $0$ and intercept $\bar{y}$. What does this say about the linear relationship between $x$ and $y$?

Example 36

hard
Trend line $\hat{y} = 1.4x + 6$ was fit on data with $x$ between $2$ and $20$. Should you use it to predict at $x = 200$? Why or why not?

Example 37

challenge
For the four points $(1,1),(2,3),(3,2),(4,4)$, find the line of best fit.

Example 38

medium
A line of best fit for temperature (°C, $x$) vs ice-cream sales (units, $y$) is $y = 12x - 50$. (a) Predict sales when it is 30°C. (b) Below what temperature does the model predict zero or negative sales? (c) Comment on the limitation.

Example 39

hard
Two students draw different lines of best fit through the same scatter plot. Student A's line: $y = 3x + 2$ (sum of squared residuals = 40). Student B's line: $y = 2.5x + 4$ (sum of squared residuals = 28). Which line is better and why?

Background Knowledge

These ideas may be useful before you work through the harder examples.

Scatter plot (Statistics)