Line of Best Fit Examples in Statistics

Start with the recap, study the fully worked examples, then use the practice problems to check your understanding of Line of Best Fit.

This page combines explanation, solved examples, and follow-up practice so you can move from recognition to confident problem-solving in Statistics.

Concept Recap

The line of best fit is the straight line that best represents the trend in a scatter plot, keeping the overall distance between the line and the data points as small as possible.

If you stretched a rubber band through a scatter plot to be as close to all points as possible, that's the line of best fit. It captures the overall trend.

How to Use These Examples

  • Read the first worked example with the solution open so the structure is clear.
  • Try the practice problems before revealing each solution.
  • Use the related concepts and background knowledge badges if you feel stuck.

What to Focus On

Core idea: The line of best fit (least-squares line) minimizes the sum of squared vertical distances (residuals) from each data point to the line. Among all straight lines, it has the smallest total squared prediction error on the observed data.

Common stuck point: Students draw the line of best fit by eye, often forcing it through too many points rather than balancing points above and below the line.
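The core idea above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `least_squares_line` and the sample points are assumptions made for this example, and the formulas used are the standard closed-form least-squares slope and intercept.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are equal; slope is undefined")
    slope = s_xy / s_xx
    # The least-squares line always passes through the centroid (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 7]
slope, intercept = least_squares_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
print(slope, intercept)
print(sum(residuals))  # very close to 0: residuals above and below cancel
```

The residuals of a least-squares line sum to zero (up to rounding), which is the balance of points above and below the line that a line drawn by eye should aim for.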

Worked Examples

Example 1

easy
A scatter plot shows the relationship between hours studied (x) and test score (y). The data points generally trend upward. A line of best fit has equation y = 5x + 40. (a) Interpret the slope. (b) Predict the score for a student who studies 8 hours.

Solution

  1. (a) The slope is 5, meaning for each additional hour studied, the predicted test score increases by 5 points.
  2. (b) Substitute x = 8: y = 5(8) + 40 = 40 + 40 = 80.
  3. The predicted score is 80. This is an interpolation if 8 hours is within the range of the data, or an extrapolation if it is outside the data range.

Answer

(a) Each additional hour of study is associated with a 5-point increase in test score. (b) Predicted score for 8 hours: 80.
The line of best fit summarises the linear relationship between two variables. The slope represents the rate of change, and the y-intercept is the predicted value when x = 0. Predictions are most reliable within the range of observed data (interpolation) and less reliable outside it (extrapolation).
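The prediction in Example 1 can be checked with a short Python sketch. The slope and intercept come from the example's equation y = 5x + 40; the function names and the observed range of 1 to 10 hours are assumptions, since the example does not state the range of the data.

```python
SLOPE = 5.0       # points per additional hour studied, from y = 5x + 40
INTERCEPT = 40.0  # predicted score when hours studied is 0


def predict_score(hours):
    """Predicted test score from the line of best fit y = 5x + 40."""
    return SLOPE * hours + INTERCEPT


def prediction_kind(hours, min_hours, max_hours):
    """Label a prediction by whether x lies inside the observed data range."""
    if min_hours <= hours <= max_hours:
        return "interpolation"
    return "extrapolation"


# Assumed range of observed study times (not given in the example).
observed_min, observed_max = 1.0, 10.0

print(predict_score(8))                                 # 80.0
print(prediction_kind(8, observed_min, observed_max))   # interpolation
print(prediction_kind(15, observed_min, observed_max))  # extrapolation
```

Under the assumed range, the 8-hour prediction is an interpolation, while a 15-hour prediction would be an extrapolation and should be treated with more caution.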

Example 2

medium
Given five data points: (1,3), (2,5), (3,6), (4,8), (5,11). Estimate the line of best fit by finding the slope from the first and last points, then choose the intercept so that the line passes through the centroid (\bar{x}, \bar{y}).
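One way to carry out the method described in Example 2 is sketched below in Python. It follows the stated steps (slope from the first and last points, then an intercept chosen so the line passes through the centroid); the variable names are assumptions for illustration, and this is an estimate rather than the exact least-squares fit.

```python
# Data points from Example 2.
points = [(1, 3), (2, 5), (3, 6), (4, 8), (5, 11)]

# Step 1: slope from the first and last points.
(x_first, y_first), (x_last, y_last) = points[0], points[-1]
slope = (y_last - y_first) / (x_last - x_first)

# Step 2: centroid (mean of x values, mean of y values).
n = len(points)
x_bar = sum(x for x, _ in points) / n
y_bar = sum(y for _, y in points) / n

# Step 3: choose the intercept so the line passes through the centroid.
intercept = y_bar - slope * x_bar

print(f"slope = {slope:.1f}")
print(f"centroid = ({x_bar:.1f}, {y_bar:.1f})")
print(f"estimated line: y = {slope:.1f}x + {intercept:.1f}")
```

The slope is (11 - 3)/(5 - 1) = 2 and the centroid is (3, 6.6), so the estimate is approximately y = 2x + 0.6. For comparison, the least-squares formulas give y = 1.9x + 0.9 for the same points, so the estimate is close but not identical.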

Practice Problems

Try these problems on your own first, then open the solution to compare your method.

Example 1

medium
A line of best fit for temperature (°C, x) vs ice-cream sales (units, y) is y = 12x - 50. (a) Predict sales when it is 30°C. (b) Below what temperature does the model predict zero or negative sales? (c) Comment on the limitation.

Example 2

hard
Two students draw different lines of best fit through the same scatter plot. Student A's line: y = 3x + 2 (sum of squared residuals = 40). Student B's line: y = 2.5x + 4 (sum of squared residuals = 28). Which line is better and why?
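The quantity compared in this problem, the sum of squared residuals, can be computed with a small helper. The Python sketch below uses hypothetical data points and two hypothetical candidate lines, because the problem's scatter plot is not given; the function name `sum_squared_residuals` is also an assumption.

```python
def sum_squared_residuals(points, slope, intercept):
    """Sum of squared vertical distances from each point to y = slope*x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)


# Hypothetical data and candidate lines, for illustration only.
points = [(1, 2), (2, 4), (3, 5), (4, 7)]
ssr_first = sum_squared_residuals(points, slope=2.0, intercept=0.0)   # y = 2x
ssr_second = sum_squared_residuals(points, slope=1.6, intercept=0.5)  # y = 1.6x + 0.5

print(ssr_first, ssr_second)
# Under the least-squares criterion, the smaller total indicates the better fit.
better = "first" if ssr_first < ssr_second else "second"
print(f"The {better} line fits these points better.")
```

The same comparison applies to any two candidate lines drawn through the same data: compute each line's sum of squared residuals and prefer the smaller one.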

Background Knowledge

These ideas may be useful before you work through the harder examples.

  • scatter plot
  • slope intercept