Prediction Math Example 2
Follow the full solution, then compare it with the other examples linked below.
Example 2
hardA model predicts house prices. In-sample , but out-of-sample . Explain what this means and identify the problem with the model.
Solution
- 1 In-sample : model explains 92% of price variation for the training data โ appears excellent
- 2 Out-of-sample : model explains only 45% of variation for new data โ poor generalization
- 3 Problem: overfitting โ the model learned the specific training data's quirks/noise rather than the true underlying pattern
- 4 An overfit model performs well on training data but poorly on new predictions โ useless for actual prediction purposes
Answer
Overfitting: model memorized training data (Rยฒ=0.92) but fails on new data (Rยฒ=0.45).
Prediction quality must be evaluated on out-of-sample (held-out) data. Excellent in-sample performance with poor out-of-sample performance is the hallmark of overfitting. The true measure of a predictive model is how well it predicts new, unseen observations.
About Prediction
A prediction is a model-based estimate of an unknown or future value, accompanied by a measure of confidence or uncertainty.
Learn more about Prediction โMore Prediction Examples
Example 1 medium
A linear regression model gives [formula] where [formula] = hours studied and [formula] = test score
Example 3 easyUsing the model [formula], predict [formula] when [formula] and [formula]. Then find [formula] when
Example 4 hardWhy is extrapolation (predicting outside the observed range) dangerous? Give an example where extrap