Inference for Regression Formula

Inference for regression uses hypothesis tests and confidence intervals to draw conclusions about the true population slope $\beta_1$ of the linear relationship $y = \beta_0 + \beta_1 x + \varepsilon$, based on sample data.

The Formula

$$t = \frac{b - \beta_{1,0}}{\text{SE}_b} \quad\text{where}\quad \text{SE}_b = \frac{s}{\sqrt{\sum(x_i - \bar{x})^2}}$$

When to use: You computed a sample regression line with slope $b = 2.3$. But is the true population slope actually different from zero? Maybe there's really no linear relationship and you just got a slope by chance. The regression t-test asks: 'Is my sample slope far enough from zero that it's unlikely to have occurred by random variation alone?'

Quick Example

Sample slope $b = 2.3$, $\text{SE}_b = 0.8$, $n = 25$.

$$t = \frac{b - 0}{\text{SE}_b} = \frac{2.3}{0.8} = 2.875 \quad (df = 23)$$

The p-value $\approx 0.008 < 0.05$, so reject $H_0: \beta_1 = 0$. There is evidence of a linear relationship.
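The quick example can be reproduced with a short sketch. It uses only the Python standard library and approximates the two-sided p-value by integrating the Student t density with Simpson's rule; the helper names `t_pdf` and `two_sided_p_value` are illustrative assumptions, not part of any library. In practice a statistics package or a t-table would normally supply the p-value.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_stat, df, steps=10_000):
    """Two-sided p-value, 1 - P(-|t| < T < |t|), using Simpson's rule on [0, |t|]."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central_half

b, se_b, n = 2.3, 0.8, 25   # values from the quick example
df = n - 2                  # regression inference uses n - 2
t_stat = (b - 0) / se_b     # hypothesized slope is 0
p_value = two_sided_p_value(t_stat, df)

print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.4f}")
```

The printed p-value should fall below 0.05, matching the decision to reject $H_0$.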

Notation

$b$ = sample slope, $\beta_1$ = population slope, $\text{SE}_b$ = standard error of the slope, $s$ = standard deviation of the residuals, $df = n - 2$.
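To see how these symbols fit together, here is a minimal sketch that computes $b$, $s$, $\text{SE}_b$, and $df$ from raw data. The function name `slope_inference` and the five data points are made up for illustration; the sketch assumes $s$ is computed with $n - 2$ in the denominator, consistent with $df = n - 2$.

```python
import math

def slope_inference(x, y):
    """Return the sample slope b, residual SD s, SE of the slope, and df."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)                 # sum of (x_i - x_bar)^2
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                                            # least-squares slope
    a = y_bar - b * x_bar                                    # least-squares intercept
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    df = n - 2
    s = math.sqrt(sum(e ** 2 for e in residuals) / df)       # SD of residuals
    se_b = s / math.sqrt(sxx)                                # SE_b = s / sqrt(Sxx)
    return {"b": b, "s": s, "se_b": se_b, "df": df}

# Hypothetical data, chosen only to exercise the function
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
result = slope_inference(x, y)
print(result)
```

Dividing `result["b"]` by `result["se_b"]` gives the t statistic for testing a population slope of zero.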

What This Formula Means

Inference for regression means using hypothesis tests and confidence intervals to draw conclusions about the true population slope $\beta_1$ of the linear relationship $y = \beta_0 + \beta_1 x + \varepsilon$, based on sample data.

Formal View

$t = \frac{b - \beta_{1,0}}{\text{SE}_b}$ with $df = n - 2$, where $\text{SE}_b = \frac{s}{\sqrt{\sum(x_i - \bar{x})^2}}$; CI: $b \pm t^* \cdot \text{SE}_b$
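The test and the interval are two readings of the same inequality. A short derivation, assuming a two-sided test at level $\alpha$ and the matching critical value $t^*$ with $df = n - 2$:

$$
|t| > t^* \iff \left| b - \beta_{1,0} \right| > t^* \cdot \text{SE}_b \iff \beta_{1,0} \notin \left[\, b - t^* \cdot \text{SE}_b,\; b + t^* \cdot \text{SE}_b \,\right]
$$

So rejecting $H_0: \beta_1 = \beta_{1,0}$ at level $\alpha$ is equivalent to the hypothesized slope falling outside the $(1 - \alpha)$ confidence interval.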

Worked Examples

Example 1

medium
A regression output shows: slope $b = 2.5$, $\text{SE}_b = 0.8$, $n = 30$. Test $H_0: \beta_1 = 0$ vs $H_a: \beta_1 \neq 0$ at $\alpha = 0.05$ using a t-test.

Answer

$t = 3.125 > 2.048$. Reject $H_0$. The slope is statistically significant at $\alpha = 0.05$.

First step

Test statistic: $t = \frac{b - \beta_{1,0}}{\text{SE}_b} = \frac{2.5 - 0}{0.8} = 3.125$
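The remaining steps follow the critical-value approach. A minimal sketch, assuming the two-sided critical value 2.048 for $df = 28$ quoted in the answer (it is hard-coded here rather than looked up):

```python
b, se_b, n = 2.5, 0.8, 30   # values from Example 1
null_slope = 0.0            # H0: beta_1 = 0
t_critical = 2.048          # two-sided, df = 28, alpha = 0.05 (taken from the answer)

df = n - 2
t_stat = (b - null_slope) / se_b
reject_h0 = abs(t_stat) > t_critical   # two-sided decision rule

print(f"df = {df}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```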

Example 2

hard
Construct a 95% confidence interval for the slope $\beta_1$ given: $b = 1.8$, $\text{SE}_b = 0.5$, $n = 25$, and $t^*_{0.025,23} = 2.069$.
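One way to carry out the computation is to apply $b \pm t^* \cdot \text{SE}_b$ directly. The helper `slope_confidence_interval` below is a hypothetical name used for illustration; the critical value is the one given in the problem.

```python
def slope_confidence_interval(b, se_b, t_star):
    """Return (lower, upper) for b ± t* · SE_b."""
    margin = t_star * se_b
    return b - margin, b + margin

# Values from Example 2; t* = 2.069 corresponds to df = 25 - 2 = 23
lower, upper = slope_confidence_interval(b=1.8, se_b=0.5, t_star=2.069)
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f})")
```

The interval lies entirely above zero, so a two-sided test of $H_0: \beta_1 = 0$ at $\alpha = 0.05$ would reject on the same data.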

Example 3

medium
A study gives $b = 3.2$, $\text{SE}_b = 1.0$, $n = 12$. With $t^*_{0.025,10} = 2.228$, construct the 95% CI for the slope.
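A sketch of the arithmetic, using $df = 12 - 2 = 10$ to match the given critical value:

$$
b \pm t^* \cdot \text{SE}_b = 3.2 \pm 2.228 \times 1.0 = 3.2 \pm 2.228 \quad\Longrightarrow\quad (0.972,\ 5.428)
$$

Zero is not inside this interval, which is consistent with evidence of a positive population slope at the 5% level.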

Common Mistakes

  • Treating a nonzero sample slope as proof of a population relationship - test $b$ against its standard error before concluding $\beta_1 \neq 0$.
  • Using the wrong degrees of freedom - regression inference uses $df = n - 2$, not $n - 1$.
  • Forgetting the conditions (linearity, independence, equal spread, normal residuals) - the t-test is only valid when the regression assumptions hold.

Why This Formula Matters

A nonzero sample slope can appear from pure noise even when no real relationship exists, so describing the line isn't enough: you need a test that separates a genuine trend from random scatter. This is the step that lets you claim evidence of a linear relationship in the population, which a single fitted line can never do on its own. Recognizing it by asking "Am I testing whether the underlying population slope is nonzero (rather than just computing or describing the sample slope)?", rather than by familiar numbers, is what lets a student tell it apart from the least-squares regression line (LSRL), the correlation test, and the two-sample t-test in a mixed problem set.
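The claim that noise alone produces nonzero sample slopes can be checked with a small simulation. This is a sketch under stated assumptions: the sample size, number of trials, seed, and standard normal noise are arbitrary choices, and the critical value 2.069 is the two-sided 5% value for $df = 23$.

```python
import math
import random

def slope_t_statistic(x, y):
    """Return the sample slope b and t = b / SE_b for H0: beta_1 = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    return b, b / se_b

random.seed(0)
n, trials = 25, 2000
t_critical = 2.069            # two-sided, df = 23, alpha = 0.05
x = list(range(n))

slopes, rejections = [], 0
for _ in range(trials):
    y = [random.gauss(0, 1) for _ in x]   # y ignores x, so the true slope is 0
    b, t_stat = slope_t_statistic(x, y)
    slopes.append(b)
    rejections += abs(t_stat) > t_critical

rejection_rate = rejections / trials
print(f"nonzero slopes: {sum(b != 0 for b in slopes)} of {trials}")
print(f"share flagged as significant: {rejection_rate:.3f}")
```

Every simulated sample has a nonzero slope, yet the t-test should flag only about 5% of them, which is the false-positive rate the chosen $\alpha$ allows.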

Frequently Asked Questions

What is the Inference for Regression formula?

It is the use of hypothesis tests and confidence intervals to draw conclusions about the true population slope $\beta_1$ of the linear relationship $y = \beta_0 + \beta_1 x + \varepsilon$, based on sample data.

How do you use the Inference for Regression formula?

You computed a sample regression line with slope $b = 2.3$. But is the true population slope actually different from zero? Maybe there's really no linear relationship and you just got a slope by chance. The regression t-test asks: 'Is my sample slope far enough from zero that it's unlikely to have occurred by random variation alone?'

What do the symbols mean in the Inference for Regression formula?

$b$ = sample slope, $\beta_1$ = population slope, $\text{SE}_b$ = standard error of the slope, $s$ = standard deviation of the residuals, $df = n - 2$.

Why is the Inference for Regression formula important in Math?

A nonzero sample slope can appear from pure noise even when no real relationship exists, so describing the line isn't enough: you need a test that separates a genuine trend from random scatter. This is the step that lets you claim evidence of a linear relationship in the population, which a single fitted line can never do on its own. Recognizing it by asking "Am I testing whether the underlying population slope is nonzero (rather than just computing or describing the sample slope)?", rather than by familiar numbers, is what lets a student tell it apart from the least-squares regression line (LSRL), the correlation test, and the two-sample t-test in a mixed problem set.

What do students get wrong about Inference for Regression?

The procedure for inference for regression is the easy part; the trap is treating a nonzero sample slope as proof of a population relationship. Asking "Am I testing whether the underlying population slope is nonzero (rather than just computing or describing the sample slope)?" first is what keeps a correct-looking calculation from being attached to the wrong concept.

What should I learn before the Inference for Regression formula?

Before studying the Inference for Regression formula, you should understand: linear regression (LSRL), residuals, $r^2$, hypothesis testing, and confidence intervals.