Underfitting (Intuition) Examples in Math

Start with the recap, study the fully worked examples, then use the practice problems to check your understanding of Underfitting (Intuition).

This page combines explanation, solved examples, and follow-up practice so you can move from recognition to confident problem-solving in Math.

Concept Recap

Underfitting occurs when a model is too simple to capture the true pattern in the data, performing poorly on both training data and new data.

The model misses important structureβ€”it's not learning enough.

Read the full concept explanation β†’

How to Use These Examples

  • Read the first worked example with the solution open so the structure is clear.
  • Try the practice problems before revealing each solution.
  • Use the related concepts and background knowledge badges if you feel stuck.

What to Focus On

Core idea: Underfitting is when a model is so basic it misses the real structure and does poorly everywhere.

Common stuck point: The procedure for underfitting (intuition) is the easy part; the trap is assuming poor accuracy always means overfitting. Asking "Does the model do badly even on the data it was trained on?" first is what keeps a correct-looking calculation from being attached to the wrong concept.

Sense of Study hint: Ask: Does the model do badly even on the data it was trained on?

Worked Examples

Example 1

easy
A scatter plot shows a clear U-shaped relationship between xx and yy. A linear model is fitted and gets R2=0.05R^2 = 0.05. Explain what underfitting is and how to fix this model.

Answer

Underfitting: linear model can't capture U-shape. Fix: use quadratic model to match the true curved relationship.

First step

1
Underfitting: the model is too simple to capture the U-shaped pattern β€” a line cannot represent a curve

Full solution

  1. 2
    R2=0.05R^2 = 0.05: only 5% of variation explained β€” the model is nearly useless
  2. 3
    Diagnosis: both training error and test error are high (model fails everywhere)
  3. 4
    Fix: use a quadratic model y^=ax2+bx+c\hat{y} = ax^2 + bx + c which can capture U-shapes; check that R2R^2 improves significantly
Underfitting (high bias) occurs when the model's functional form is too restrictive for the true relationship. Unlike overfitting (which needs simplification), underfitting requires a more flexible model class to capture the real pattern in the data.

Example 2

medium
Compare two regression models on the same data: Model 1 (linear): training R2=0.40R^2=0.40, test R2=0.38R^2=0.38. Model 2 (cubic): training R2=0.95R^2=0.95, test R2=0.35R^2=0.35. Diagnose each model.

Example 3

medium
A scatter plot shows a clear parabolic relationship. You fit y=a+bxy = a + bx and get R2=0.10R^2 = 0.10. What model would you try next, and why?

Example 4

medium
Compare two models: Model A: train R2=0.30R^2 = 0.30, test R2=0.28R^2 = 0.28. Model B: train R2=0.95R^2 = 0.95, test R2=0.20R^2 = 0.20. Diagnose each.

Example 5

medium
A learning-curve plot shows train and test error very close and both high, flat across sample size. What does this suggest?

Example 6

hard
In a regression task, residuals plotted against xx show a clear pattern (curve). What does that pattern reveal about the current model?

Example 7

hard
You compare three models: linear (Rtest2=0.20R^2_{test} = 0.20), quadratic (0.650.65), cubic (0.660.66). Which model is best, and why?

Example 8

challenge
An analyst applies kk-means with k=2k = 2 to a dataset that obviously has 55 natural clusters, then concludes 'there are only two types of customers.' Critique this conclusion in terms of underfitting.

Practice Problems

Try these problems on your own first, then open the solution to compare your method.

Example 1

easy
A student uses y^=50\hat{y} = 50 (the mean) to predict all exam scores. Training R2=0R^2 = 0. Explain why this represents maximum underfitting.

Example 2

hard
A linear model for time-series stock prices has high residuals for all time periods. Identify this as underfitting and explain why stock prices might inherently have an upper bound on R2R^2 regardless of model complexity.

Example 3

easy
Fitting a straight line to data that clearly follows a curve is an example of ___.

Example 4

easy
Underfitting means the model is too ___ to capture the true pattern.

Example 5

easy
An underfit model performs ___ on both training and new data.

Example 6

easy
Train error 30%, test error 32%. Overfit or underfit?

Example 7

easy
To fix underfitting, should you make the model more or less complex?

Example 8

easy
Using too few predictor variables when important ones exist can cause ___.

Example 9

easy
A simple model fails to find a pattern. Does that prove no pattern exists?

Example 10

easy
Which is underfit: a model with high bias and low variance, or low bias and high variance?

Example 11

medium
A residual plot of a linear fit shows a clear curved pattern. Is the model overfit or underfit, and why?

Example 12

medium
Model A (line) train error 40%, test 41%. Model B (curve) train 5%, test 7%. Which underfits and which to choose?

Example 13

medium
As complexity increases from very low, both train and test error fall together at first. What were they suffering from at low complexity?

Example 14

medium
A linear model gives R2=0.30R^2=0.30 on data that visibly follows a parabola. What does the low R2R^2 plus visible curve suggest?

Example 15

medium
Why might adding a squared term (x2x^2) fix an underfit linear model on curved data?

Example 16

medium
An underfit and an overfit model both have test error 20%, but train errors 19% and 1%. Which is which?

Example 17

medium
A model uses only 'day of week' to predict sales but ignores price and season. Likely underfit or overfit?

Example 18

medium
A model predicting house prices uses only 'number of windows' and errs badly on both training and test data. Underfit or overfit, and what is the likely fix?

Example 19

medium
A straight-line model on clearly exponential growth has high error everywhere. What change to the model would most likely reduce underfitting?

Example 20

challenge
Total error =bias2+variance=\text{bias}^2+\text{variance}. A model has bias 6, variance 1 (so error 37); adding capacity changes it to bias 2, variance 5. Did total error improve, and was the original underfit?

Example 21

challenge
A constant model y^=yˉ=5\hat{y}=\bar{y}=5 is fit to data {2,5,8}\{2,5,8\}. Compute SSR and explain why this extreme model underfits.

Example 22

challenge
Errors by complexity β€” train: {1:50,2:20,3:8,4:7}\{1:50,2:20,3:8,4:7\}, test: {1:51,2:22,3:10,4:9}\{1:51,2:22,3:10,4:9\}. At which complexity does underfitting stop being the main problem?

Example 23

easy
A model gets R2=0.08R^2 = 0.08 on training data and R2=0.06R^2 = 0.06 on test data. Underfit or overfit?

Example 24

easy
True or false: an underfit model usually has low training error.

Example 25

easy
Predicting the mean y^=yˉ\hat y = \bar y for every input gives training R2=0R^2 = 0. What kind of fit is this?

Example 26

easy
Train accuracy 55%55\%, test accuracy 54%54\%. Best diagnosis?

Example 27

medium
In the bias-variance decomposition, underfitting corresponds primarily to high ___.

Example 28

medium
You apply heavy regularization (Ξ»=106\lambda = 10^6) to a linear model. Most coefficients shrink to nearly zero. What happens to bias and variance?

Example 29

medium
A linear classifier achieves 50%50\% accuracy on a 2D dataset that is clearly arranged in concentric circles. Underfit or overfit?

Example 30

medium
List two practical remedies for underfitting.

Example 31

medium
True or false: an underfit model has low variance across different training sets.

Example 32

medium
A decision tree limited to depth 1 (a 'stump') is fit to a complex dataset. Predicted accuracy β‰ˆ55%\approx 55\%. Cause?

Example 33

hard
A neural network is trained for only 2 epochs. Both train and validation loss remain high. Best next action?

Example 34

hard
A medical model uses only patient age to predict disease risk and achieves R2=0.12R^2 = 0.12. Adding 10 more relevant predictors raises R2R^2 to 0.550.55. What was the original problem?

Example 35

hard
A logistic regression with two predictors yields chance-level 50%50\% accuracy. You suspect the boundary is non-linear. Suggest one practical change.

Example 36

hard
True or false: an underfit model's predictions have high variance across re-samples of the training data.

Example 37

hard
In an ideal bias-variance plot vs model complexity, where on the curve are underfit models located?

Example 38

hard
Why might an underfit model still be preferred for some applications, despite its low accuracy?

Related Concepts

Background Knowledge

These ideas may be useful before you work through the harder examples.

model fit intuition