Robustness Math Example 2

Follow the full solution, then compare it with the other examples linked below.

Example 2

medium
Show that the median is more robust than the mean as a measure of centre when outliers are present. Use the data set $\{2, 3, 4, 5, 100\}$.

Solution

  1. Mean: $\frac{2+3+4+5+100}{5} = \frac{114}{5} = 22.8$. The outlier 100 pulls the mean far above the other four values, which all lie between 2 and 5.
  2. Median: sort the data (already sorted). The middle value is $4$. The median depends only on the middle position, so however large the outlier is, the median stays at 4.
  3. Remove the outlier: Mean $= \frac{2+3+4+5}{4} = 3.5$; Median $= \frac{3+4}{2} = 3.5$. Without the outlier, the two measures agree. Removing it moves the mean from 22.8 to 3.5, but the median only from 4 to 3.5.
  4. Conclusion: the median is robust to the outlier; the mean is not.
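
The four steps above can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the variable names `data`, `trimmed`, and `summary` are illustrative choices, not part of the original example.

```python
from statistics import mean, median

# Data set from the example; 100 is the outlier.
data = [2, 3, 4, 5, 100]

# Same data with the outlier removed (step 3).
trimmed = [x for x in data if x != 100]

# Mean and median, with and without the outlier.
summary = {
    "with outlier": (mean(data), median(data)),
    "without outlier": (mean(trimmed), median(trimmed)),
}

for label, (m, med) in summary.items():
    print(f"{label}: mean = {m}, median = {med}")
```

Comparing the two rows shows the mean shifting by about 19 units while the median shifts by only 0.5.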

Answer

$\text{Median} = 4 \text{ (robust)};\quad \text{Mean} = 22.8 \text{ (sensitive to outlier)}$
A robust statistic (or model) does not change drastically when a small portion of the data changes or is corrupted. The median is robust; the mean is not — a key practical distinction in statistics.
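
To illustrate what "does not change drastically" means here, the sketch below corrupts a single value of the data set with increasingly extreme numbers and recomputes both statistics. The helper `centre_with_outlier` and the trial values 6, 100, and 10,000 are assumptions chosen for illustration; only the value 100 appears in the example itself.

```python
from statistics import mean, median

# The four "typical" values from the example.
base = [2, 3, 4, 5]

def centre_with_outlier(outlier):
    """Return (mean, median) of the base values plus one extra value."""
    sample = base + [outlier]
    return mean(sample), median(sample)

# Try a mild, a large, and an extreme fifth value (hypothetical choices).
results = {v: centre_with_outlier(v) for v in (6, 100, 10_000)}

for v, (m, med) in results.items():
    print(f"fifth value = {v}: mean = {m}, median = {med}")
```

In every trial the median stays at 4, while the mean grows with the size of the corrupted value.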

About Robustness

The property of a result, algorithm, or model remaining valid or approximately correct even when its assumptions are slightly violated.


More Robustness Examples