Least Squares Regression Line Math Example 4

Follow the full solution, then compare it with the other examples linked below.

Example 4

Difficulty: hard
The LSRL has the property of minimizing \sum e_i^2 = \sum (y_i - \hat{y}_i)^2. Explain why minimizing squared residuals (rather than absolute residuals) is preferred, and name two consequences of this choice.

Solution

  1. Why squared: (1) it is mathematically convenient: the objective is differentiable and has a unique closed-form solution; (2) it penalizes larger errors more heavily than small ones, making the line more sensitive to influential points.
  2. Consequence 1: the LSRL is sensitive to outliers; a single point far from the line (large residual) has a disproportionate influence on the slope and intercept.
  3. Consequence 2: the LSRL always passes through (\bar{x}, \bar{y}), a result that follows mathematically from minimizing squared residuals (see the derivation after this list).
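
A sketch of the standard derivation behind step 3 (nothing here is specific to this example): setting the partial derivative of the sum of squared residuals with respect to the intercept to zero forces the residuals to sum to zero, which places (\bar{x}, \bar{y}) on the line.

```latex
% Why the LSRL passes through (\bar{x}, \bar{y}):
% minimize S(a, b) = \sum_i (y_i - a - b x_i)^2 over a and b.
\[
\frac{\partial S}{\partial a} = -2 \sum_i (y_i - a - b x_i) = 0
\;\Longrightarrow\;
\sum_i y_i = n a + b \sum_i x_i
\;\Longrightarrow\;
\bar{y} = a + b \bar{x},
\]
% so (\bar{x}, \bar{y}) satisfies \hat{y} = a + b x exactly.
```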

Answer

Squared residuals: mathematical tractability + outlier sensitivity. The LSRL always passes through (\bar{x},\bar{y}).
The choice of squared vs. absolute residuals has important implications. Squared residuals give a unique, computationally convenient solution but create sensitivity to outliers. Absolute residuals give a more robust line (less outlier-sensitive) but lack a closed-form solution.
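
To see both consequences concretely, here is a minimal sketch (made-up data; assumes NumPy and SciPy are available). It fits the closed-form least squares line and a least absolute deviations (LAD) line, found numerically since LAD has no closed form, to the same data with one outlier; the squared-error slope is pulled sharply toward the outlier while the LAD slope stays near the trend of the other points.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative data: points near y = 2x, with one large outlier at the end
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 30.0])

# Closed-form least squares: b = S_xy / S_xx, a = ybar - b * xbar
xbar, ybar = x.mean(), y.mean()
b_ls = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
a_ls = ybar - b_ls * xbar

# LAD has no closed form, so minimize the sum of |residuals| numerically
res = minimize(lambda p: np.sum(np.abs(y - p[0] - p[1] * x)),
               x0=[a_ls, b_ls], method="Nelder-Mead")
a_lad, b_lad = res.x

print(f"least squares: yhat = {a_ls:.2f} + {b_ls:.2f}x")  # slope pulled up by outlier
print(f"LAD:           yhat = {a_lad:.2f} + {b_lad:.2f}x")  # slope stays near 2

# Sanity check of the mean-point property: the LSRL passes through (xbar, ybar)
assert np.isclose(a_ls + b_ls * xbar, ybar)
```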

About Least Squares Regression Line

The unique straight line \hat{y} = a + bx that minimizes the sum of squared vertical distances (residuals) between the observed data points and the line.
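
For reference, minimizing the squared residuals yields the standard closed-form coefficients:

```latex
% Closed-form least squares slope and intercept for \hat{y} = a + bx
\[
b = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}.
\]
```

The second formula, rearranged to \bar{y} = a + b\bar{x}, is exactly the mean-point property from the solution above.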

Learn more about Least Squares Regression Line →

More Least Squares Regression Line Examples