Outlier Detection Examples in Statistics

Start with the recap, study the fully worked examples, then use the practice problems to check your understanding of Outlier Detection.

This page combines explanation, solved examples, and follow-up practice so you can move from recognition to confident problem-solving in Statistics.

Concept Recap

Outlier detection is the process of identifying data points that are unusually far from the rest of the dataset, using techniques like the IQR rule, z-scores, or visual inspection of box plots and scatter plots. These anomalous values may indicate measurement errors, data entry mistakes, or genuinely extreme observations.

Outliers are data points that don't fit the pattern: a 7-foot student in a class of average heights, or a $10 million house in a neighborhood of $300k homes. They may be errors or genuinely unusual values.


How to Use These Examples

  • Read the first worked example with the solution open so the structure is clear.
  • Try the practice problems before revealing each solution.
  • Use the related concepts and background knowledge badges if you feel stuck.

What to Focus On

Core idea: Outliers are data points that lie far from the bulk of the data. They should be investigated, because they may indicate data errors, special cases, or important extremes.

Common stuck point: Students automatically delete outliers without investigating them. Outliers are sometimes the most informative data points and should not be removed without justification.

Sense of Study hint: To detect outliers, first try the IQR method: compute Q_1 and Q_3, then IQR = Q_3 - Q_1. Any point below Q_1 - 1.5 \times IQR or above Q_3 + 1.5 \times IQR is flagged as an outlier. Alternatively, calculate the z-score for each point, z = (x - \bar{x}) / s; values with |z| > 3 are commonly flagged as outliers, although some problems use a stricter threshold such as |z| > 2. Always investigate flagged points before removing them.
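The two rules in the hint can be sketched in Python as follows. This is a minimal illustration, not a standard library API: the function names (`iqr_fences`, `is_iqr_outlier`, `z_score`, `is_z_outlier`) are hypothetical helpers, and the quartiles, mean, and standard deviation are assumed to have been computed already.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences for the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_iqr_outlier(x, q1, q3, k=1.5):
    """True if x lies strictly outside the fences."""
    lower, upper = iqr_fences(q1, q3, k)
    return x < lower or x > upper


def z_score(x, mean, sd):
    """Number of standard deviations x lies from the mean."""
    return (x - mean) / sd


def is_z_outlier(x, mean, sd, threshold=3.0):
    """True if |z| exceeds the chosen threshold (3 is a common default)."""
    return abs(z_score(x, mean, sd)) > threshold
```

Both checks only flag a point. Whether to keep or remove it is still a judgment that depends on why the value is extreme.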

Common Mistakes to Watch For

Before you work through the examples, skim the mistake guide so you know which shortcuts and sign errors to avoid.

Worked Examples

Example 1

easy
The data set is: 10, 12, 11, 13, 12, 14, 11, 50. Identify the outlier and explain how you know.

Solution

  1. Most values cluster between 10 and 14. The value 50 is far removed from this cluster.
  2. Check with quartiles. Sorted data: 10, 11, 11, 12, 12, 13, 14, 50. Taking the median of each half gives Q_1 = 11 and Q_3 = 13.5, so IQR = 13.5 - 11 = 2.5.
  3. Upper fence: Q_3 + 1.5 \times IQR = 13.5 + 3.75 = 17.25. Since 50 > 17.25, it is a confirmed outlier by the 1.5 \times IQR rule. The lower fence is Q_1 - 1.5 \times IQR = 11 - 3.75 = 7.25, and no value falls below it.

Answer

The value 50 is an outlier. It exceeds the upper fence of 17.25 (using the 1.5 \times IQR rule).
Outliers are data points that are significantly different from other observations. The 1.5 \times IQR rule provides an objective method for identifying them: any value below Q_1 - 1.5 \times IQR or above Q_3 + 1.5 \times IQR is classified as an outlier.
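The arithmetic in this example can be checked with a short Python sketch. It assumes the median-of-halves convention for quartiles used in the solution; the helper name `quartiles_by_halves` is hypothetical. Library routines such as `numpy.percentile` or `statistics.quantiles` interpolate differently by default and can return a slightly different Q_3 for this data set.

```python
from statistics import median


def quartiles_by_halves(values):
    """Q1 and Q3 as the medians of the lower and upper halves of the sorted data.

    For an odd number of values, the middle value is left out of both halves.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    return median(data[:half]), median(data[-half:])


data = [10, 12, 11, 13, 12, 14, 11, 50]
q1, q3 = quartiles_by_halves(data)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr)               # 11.0 13.5 2.5
print(lower_fence, upper_fence)  # 7.25 17.25
print(outliers)                  # [50]
```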

Example 2

medium
Test scores: 72, 75, 78, 80, 82, 85, 88, 90, 92, 95. A new student's score of 25 is added. How does this outlier affect the mean and median?
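One way to explore this question numerically is to compute both statistics before and after adding the new score. The sketch below uses only Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

scores = [72, 75, 78, 80, 82, 85, 88, 90, 92, 95]
with_outlier = scores + [25]  # the new student's score

mean_before, median_before = mean(scores), median(scores)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")  # mean:   83.70 -> 78.36
print(f"median: {median_before} -> {median_after}")      # median: 83.5 -> 82
```

The mean drops by more than 5 points while the median drops by only 1.5, which illustrates that the median is much less sensitive to a single extreme value.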

Practice Problems

Try these problems on your own first, then open the solution to compare your method.

Example 1

medium
A scientist records reaction times (ms): 245, 260, 255, 270, 250, 980, 265, 258. Use the 1.5 \times IQR rule to determine if 980 is an outlier. Should it be removed from the analysis?

Example 2

hard
A data set has mean \bar{x} = 100 and standard deviation s = 15. Using the z-score method, determine whether the values 60, 145, and 155 are outliers (using the threshold |z| > 2).

Background Knowledge

These ideas may be useful before you work through the harder examples.

  • Interquartile range
  • Z-score