Outliers (Deep) Formula

The Formula

\text{Outlier if } x < Q_1 - 1.5 \times \text{IQR} \text{ or } x > Q_3 + 1.5 \times \text{IQR}

When to use: When one value sits far from the rest of the data and you need to decide whether it is a mistake or something interesting.

Quick Example

Incomes: \$50K, \$55K, \$60K, \$58K, \$5M. The \$5M income is an outlier.

Notation

Values more than 1.5 \times \text{IQR} below Q_1 or above Q_3 are called outliers; values more than 3 \times \text{IQR} beyond the quartiles are called extreme outliers.
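The two thresholds can be sketched as a small Python function. This is only an illustration: the function name `classify` and the quartile values Q_1 = 20 and Q_3 = 30 are made up for the demonstration and do not come from any data set on this page.

```python
def classify(x, q1, q3):
    """Label x using the 1.5 x IQR and 3 x IQR thresholds (hypothetical helper)."""
    iqr = q3 - q1
    # Check the wider 3 x IQR fences first so extreme values get the stronger label.
    if x < q1 - 3 * iqr or x > q3 + 3 * iqr:
        return "extreme outlier"
    if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr:
        return "outlier"
    return "not an outlier"


# Illustrative quartiles (assumed): Q1 = 20, Q3 = 30, so IQR = 10.
# The 1.5 x IQR fences are 5 and 45; the 3 x IQR fences are -10 and 60.
for value in (25, 50, 70):
    print(value, classify(value, q1=20, q3=30))
```

With these assumed quartiles, 25 falls inside the fences, 50 lies past the upper 1.5 \times \text{IQR} fence but not the 3 \times \text{IQR} fence, and 70 lies past both.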

What This Formula Means

An outlier is a data value that lies unusually far from most other values, potentially indicating measurement error, a rare event, or an important exception.

The weird one that doesn't fit. Is it a mistake, or something interesting?

Formal View

x is an outlier if x < Q_1 - 1.5 \cdot \text{IQR} or x > Q_3 + 1.5 \cdot \text{IQR} where \text{IQR} = Q_3 - Q_1

Worked Examples

Example 1

medium
Data: \{12, 15, 14, 13, 16, 14, 15, 85\}. Use the 1.5 \times IQR rule to determine if 85 is an outlier, and discuss whether it should be removed.

Solution

  1. Sort data: \{12, 13, 14, 14, 15, 15, 16, 85\}; n=8
  2. Q_1 = \frac{13+14}{2} = 13.5; Q_3 = \frac{15+16}{2} = 15.5
  3. \text{IQR} = 15.5 - 13.5 = 2; Upper fence = 15.5 + 1.5(2) = 18.5
  4. 85 > 18.5, so 85 is flagged as an outlier
  5. Decision: investigate before removing. 85 could be a data entry error (e.g., 15 mis-typed as 85) or a genuine extreme value (e.g., a special event)

Answer

85 is an outlier (exceeds fence of 18.5). Investigate cause before removing.
The 1.5×IQR rule identifies potential outliers but does not determine whether to remove them. Outliers might be data errors (should remove), legitimate rare events (keep), or indicators of a different subgroup (analyze separately).
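The steps in Example 1 can be reproduced with a short Python sketch. It assumes the quartile convention used in the solution above, where Q_1 and Q_3 are the medians of the lower and upper halves of the sorted data. Other conventions (including the defaults in some software) can give slightly different quartiles and fences. The function names are hypothetical.

```python
from statistics import median


def quartiles_median_of_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    For an odd number of values the middle value is left out of both halves.
    This is one of several quartile conventions.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values to compute quartiles")
    half = len(data) // 2
    return median(data[:half]), median(data[-half:])


def iqr_outliers(values, k=1.5):
    """Return the lower fence, upper fence, and the values outside the fences."""
    q1, q3 = quartiles_median_of_halves(values)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    flagged = [x for x in values if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, flagged


data = [12, 15, 14, 13, 16, 14, 15, 85]
lower_fence, upper_fence, flagged = iqr_outliers(data)
print(lower_fence, upper_fence, flagged)  # 10.5 18.5 [85]
```

The code only flags 85; whether to keep, correct, or remove it remains a judgment about the data.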

Example 2

hard
Calculate the effect of an outlier (value 200) on the mean and median for \{10, 12, 11, 13, 12, 200\}, comparing to the data without the outlier \{10, 12, 11, 13, 12\}.
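The comparison asked for in Example 2 can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the variable names are chosen here for illustration.

```python
from statistics import mean, median

with_outlier = [10, 12, 11, 13, 12, 200]
without_outlier = [10, 12, 11, 13, 12]

mean_with = mean(with_outlier)          # 258 / 6 = 43
mean_without = mean(without_outlier)    # 58 / 5 = 11.6
median_with = median(with_outlier)      # middle pair of the sorted data is (12, 12)
median_without = median(without_outlier)  # middle value of the sorted data is 12

print("mean:  ", mean_with, "vs", mean_without)
print("median:", median_with, "vs", median_without)
```

The single value 200 moves the mean from 11.6 to 43, while the median stays at 12 in both cases. This is why the median is described as resistant to outliers and the mean is not.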

Common Mistakes

  • Automatically deleting outliers without investigating why they exist — they may reveal important information
  • Using only the range to detect outliers instead of the 1.5 \times \text{IQR} rule or z-scores
  • Assuming outliers are always errors — an unusually high income in a data set may be legitimate

Why This Formula Matters

Outliers can dramatically shift the mean and the standard deviation (SD), so deciding what to do with them is crucial.
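As a rough illustration of that effect, the sketch below reuses the data from Example 1 and compares the mean and SD with and without the value 85. It assumes the sample standard deviation (`statistics.stdev`, which divides by n - 1); using the population SD would change the numbers but not the conclusion.

```python
from statistics import mean, stdev

with_outlier = [12, 15, 14, 13, 16, 14, 15, 85]
without_outlier = [x for x in with_outlier if x != 85]

for label, data in (("with 85", with_outlier), ("without 85", without_outlier)):
    # stdev is the sample standard deviation (n - 1 in the denominator).
    print(f"{label}: mean = {mean(data):.2f}, SD = {stdev(data):.2f}")
```

One value out of eight raises the mean from about 14 to 23 and inflates the SD many times over.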

Frequently Asked Questions

What is the Outliers (Deep) formula?

An outlier is a data value that lies unusually far from most other values, potentially indicating measurement error, a rare event, or an important exception.

How do you use the Outliers (Deep) formula?

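Compute Q_1, Q_3, and \text{IQR} = Q_3 - Q_1, then flag any value below Q_1 - 1.5 \times \text{IQR} or above Q_3 + 1.5 \times \text{IQR}. Use it when one value does not fit with the rest and you need to decide whether it is a mistake or something interesting.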

What do the symbols mean in the Outliers (Deep) formula?

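Q_1 and Q_3 are the first and third quartiles, and \text{IQR} = Q_3 - Q_1 is the interquartile range. Values more than 1.5 \times \text{IQR} beyond the quartiles are called outliers; values more than 3 \times \text{IQR} beyond the quartiles are called extreme outliers.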

Why is the Outliers (Deep) formula important in Math?

Outliers can dramatically shift the mean and the standard deviation (SD), so deciding what to do with them is crucial.

What do students get wrong about Outliers (Deep)?

Don't automatically remove outliers—first ask WHY they're there.

What should I learn before the Outliers (Deep) formula?

Before studying the Outliers (Deep) formula, you should understand: variability, interquartile range.