Outliers (Deep) Formula

The Formula

\text{Outlier if } x < Q_1 - 1.5 \times \text{IQR} \text{ or } x > Q_3 + 1.5 \times \text{IQR}

When to use: When one value sits far from the rest of the data and you need to decide whether it is a mistake or something interesting.

Quick Example

Incomes: $50K, $55K, $60K, $58K, $5M. The $5M is an outlier.

Notation

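Q_1 and Q_3 are the first and third quartiles, and \text{IQR} = Q_3 - Q_1 is the interquartile range. Values beyond 1.5 \times \text{IQR} from the quartiles are called outliers; values beyond 3 \times \text{IQR} are called extreme outliers.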
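As a rough illustration of the two thresholds, the sketch below labels a value given quartiles that are already known. The function name `classify` and the test value 20 are hypothetical, chosen only for illustration; the quartiles 13.5 and 15.5 are taken from Worked Example 1 on this page.

```python
def classify(x, q1, q3):
    """Label x as 'none', 'outlier', or 'extreme' using the IQR fences."""
    iqr = q3 - q1
    # Beyond 3 x IQR from the quartiles: extreme outlier.
    if x < q1 - 3 * iqr or x > q3 + 3 * iqr:
        return "extreme"
    # Beyond 1.5 x IQR from the quartiles: outlier.
    if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr:
        return "outlier"
    return "none"

# Q1 = 13.5 and Q3 = 15.5 come from Worked Example 1; 20 is a made-up value.
labels = [classify(x, 13.5, 15.5) for x in (15, 20, 85)]
print(labels)  # ['none', 'outlier', 'extreme']
```

With these quartiles the 1.5 \times \text{IQR} fence sits at 18.5 and the 3 \times \text{IQR} fence at 21.5, so 85 is an extreme outlier, not just an outlier.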

What This Formula Means

An outlier is a data value that lies unusually far from most other values, potentially indicating measurement error, a rare event, or an important exception.

The weird one that doesn't fit. Is it a mistake, or something interesting?

Formal View

x is an outlier if x < Q_1 - 1.5 \cdot \text{IQR} or x > Q_3 + 1.5 \cdot \text{IQR} where \text{IQR} = Q_3 - Q_1

Worked Examples

Example 1

medium
Data: \{12, 15, 14, 13, 16, 14, 15, 85\}. Use the 1.5 \times \text{IQR} rule to determine if 85 is an outlier, and discuss whether it should be removed.

Solution

  1. Sort data: \{12, 13, 14, 14, 15, 15, 16, 85\}; n=8
  2. Q_1 is the median of the lower half \{12, 13, 14, 14\}: Q_1 = \frac{13+14}{2} = 13.5. Q_3 is the median of the upper half \{15, 15, 16, 85\}: Q_3 = \frac{15+16}{2} = 15.5
  3. \text{IQR} = 15.5 - 13.5 = 2; Upper fence = 15.5 + 1.5(2) = 18.5
  4. 85 > 18.5, so 85 is flagged as an outlier
  5. Decision: investigate before removing. 85 could be a data entry error (e.g., 15 mis-typed as 85) or a genuine extreme value (e.g., a special event)

Answer

85 is an outlier (exceeds fence of 18.5). Investigate cause before removing.
The 1.5×IQR rule identifies potential outliers but does not determine whether to remove them. Outliers might be data errors (should remove), legitimate rare events (keep), or indicators of a different subgroup (analyze separately).
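The steps above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the helper names `quartiles` and `iqr_outliers` are hypothetical, and the quartiles are computed as medians of the lower and upper halves to match the worked solution. Other quartile conventions, including the defaults in some libraries, can give slightly different fences.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    upper = s[half + len(s) % 2:]  # skip the middle value when n is odd
    return median(lower), median(upper)

def iqr_outliers(values, k=1.5):
    """Return the (lower, upper) fences and the values outside them."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    flagged = [x for x in values if x < low or x > high]
    return (low, high), flagged

data = [12, 15, 14, 13, 16, 14, 15, 85]
fences, flagged = iqr_outliers(data)
print(fences, flagged)  # (10.5, 18.5) [85]
```

The function only flags values. Whether to keep, correct, or remove a flagged value is still a judgment made after investigating it.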

Example 2

hard
Calculate the effect of an outlier (value 200) on the mean and median for \{10, 12, 11, 13, 12, 200\}, comparing to the data without the outlier \{10, 12, 11, 13, 12\}.
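One way to check the comparison is to compute both statistics directly. The sketch below uses Python's standard `statistics` module; the variable names are illustrative only.

```python
from statistics import mean, median

with_outlier = [10, 12, 11, 13, 12, 200]
without_outlier = [10, 12, 11, 13, 12]

# Compare the mean and median of each data set side by side.
for label, data in (("with 200", with_outlier), ("without 200", without_outlier)):
    print(f"{label}: mean = {mean(data)}, median = {median(data)}")
```

Adding the single value 200 moves the mean from 11.6 to 43, while the median stays at 12. The mean is sensitive to the outlier and the median is not.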

Common Mistakes

  • Automatically deleting outliers without investigating why they exist — they may reveal important information
  • Using only the range to detect outliers instead of the 1.5 \times \text{IQR} rule or z-scores
  • Assuming outliers are always errors — an unusually high income in a data set may be legitimate

Why This Formula Matters

Outliers can dramatically skew the mean, inflate the standard deviation, and distort regression lines — deciding whether to investigate, keep, or remove them is one of the most important judgments in data analysis.
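To get a feel for the size of the effect on the standard deviation, the sketch below reuses the data from Worked Example 1 and compares the sample standard deviation with and without the value 85. It illustrates the effect only and is not a recommendation to drop the value.

```python
from statistics import mean, stdev

data = [12, 15, 14, 13, 16, 14, 15, 85]
trimmed = [x for x in data if x != 85]  # same data with the outlier left out

# Sample mean and sample standard deviation for each version of the data.
print("with 85:   ", round(mean(data), 2), round(stdev(data), 2))
print("without 85:", round(mean(trimmed), 2), round(stdev(trimmed), 2))
```

With 85 included the mean is 23 and the standard deviation is roughly 25. Without it the mean is about 14.1 and the standard deviation is roughly 1.3.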

Frequently Asked Questions

What is the Outliers (Deep) formula?

An outlier is a data value that lies unusually far from most other values, potentially indicating measurement error, a rare event, or an important exception.

How do you use the Outliers (Deep) formula?

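Compute Q_1, Q_3, and \text{IQR} = Q_3 - Q_1, then flag any value below Q_1 - 1.5 \times \text{IQR} or above Q_3 + 1.5 \times \text{IQR}. For each flagged value, ask whether it is a mistake or something interesting.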

What do the symbols mean in the Outliers (Deep) formula?

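Q_1 and Q_3 are the first and third quartiles, and \text{IQR} = Q_3 - Q_1 is the interquartile range. Values beyond 1.5 \times \text{IQR} from the quartiles are called outliers; values beyond 3 \times \text{IQR} are called extreme outliers.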

Why is the Outliers (Deep) formula important in Math?

Outliers can dramatically skew the mean, inflate the standard deviation, and distort regression lines — deciding whether to investigate, keep, or remove them is one of the most important judgments in data analysis.

What do students get wrong about Outliers (Deep)?

Don't automatically remove outliers—first ask WHY they're there.

What should I learn before the Outliers (Deep) formula?

Before studying the Outliers (Deep) formula, you should understand: variability, interquartile range.