Chi-Square Test Formula

The chi-square test is a hypothesis test that compares observed frequencies to expected frequencies using the chi-square statistic to assess independence or goodness of fit.

The Formula

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

When to use: You expect a die to land on each face about $\frac{1}{6}$ of the time. You roll it 600 times and compare what you observed to what you expected. If the differences are small, the die is probably fair. If they're large, something is off. The chi-square statistic measures 'how far off are the observed counts from what we expected?'

Quick Example

Roll a die 60 times. Expected: 10 per face. Observed: 8, 12, 11, 7, 13, 9.

$$\chi^2 = \frac{(8-10)^2}{10} + \frac{(12-10)^2}{10} + \cdots + \frac{(9-10)^2}{10} = 2.8$$

Compare to the $\chi^2$ critical value with $df = 5$ (11.07 at $\alpha = 0.05$). Not significant, so the die appears fair.
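The Quick Example can be reproduced with a few lines of Python. This is a minimal sketch: the function name `chi_square_statistic` is a hypothetical helper, and the critical value 11.07 is the standard table value for $df = 5$, assuming a significance level of $\alpha = 0.05$.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [8, 12, 11, 7, 13, 9]  # counts from 60 rolls
expected = [60 / 6] * 6           # fair die: 10 per face

statistic = chi_square_statistic(observed, expected)
critical_value = 11.07            # table value, df = 5, alpha = 0.05 (assumed level)
reject_h0 = statistic > critical_value

print(f"chi-square = {statistic:.2f}, reject H0: {reject_h0}")
# prints: chi-square = 2.80, reject H0: False
```

Because the statistic falls well below the critical value, the sketch reaches the same conclusion as the example: no evidence against a fair die.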

Notation

$\chi^2$ is the test statistic. Degrees of freedom: goodness-of-fit $df = k - 1$, where $k$ is the number of categories; independence/homogeneity $df = (r-1)(c-1)$, where $r$ and $c$ are the numbers of rows and columns in the table.
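The two degrees-of-freedom rules can be written as small helpers. This is an illustrative sketch; the function names `gof_df` and `independence_df` are hypothetical and not part of any library.

```python
def gof_df(k):
    """Degrees of freedom for a goodness-of-fit test with k categories."""
    if k < 2:
        raise ValueError("need at least 2 categories")
    return k - 1


def independence_df(rows, cols):
    """Degrees of freedom for an r x c independence/homogeneity test."""
    if rows < 2 or cols < 2:
        raise ValueError("need at least 2 rows and 2 columns")
    return (rows - 1) * (cols - 1)


print(gof_df(6))              # six-sided die -> 5
print(independence_df(2, 2))  # 2x2 table -> 1
```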

What This Formula Means

A hypothesis test that compares observed frequencies to expected frequencies using the chi-square statistic to assess independence or goodness of fit.

You expect a die to land on each face about $\frac{1}{6}$ of the time. You roll it 600 times and compare what you observed to what you expected. If the differences are small, the die is probably fair. If they're large, something is off. The chi-square statistic measures 'how far off are the observed counts from what we expected?'

Formal View

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

where $O_i$ is the observed count and $E_i$ is the expected count for category $i$; $df = k - 1$ (GOF) or $(r-1)(c-1)$ (independence).

Worked Examples

Example 1

medium
A die is rolled 60 times. Observed: 1→8, 2→12, 3→9, 4→11, 5→13, 6→7. Conduct a chi-square goodness-of-fit test at $\alpha = 0.05$.

Answer

$\chi^2 = 2.8 < 11.07$. Fail to reject $H_0$. No evidence the die is unfair.

First step

1. Expected under $H_0$ (fair die): $E = 60/6 = 10$ for each outcome.
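With $E = 10$ for every face, the statistic for this example can be expanded term by term. The following is a worked sketch of that arithmetic using the observed counts stated in the problem:

```latex
\chi^2 = \frac{(8-10)^2}{10} + \frac{(12-10)^2}{10} + \frac{(9-10)^2}{10}
       + \frac{(11-10)^2}{10} + \frac{(13-10)^2}{10} + \frac{(7-10)^2}{10}
       = \frac{4 + 4 + 1 + 1 + 9 + 9}{10} = 2.8
```

With $df = 6 - 1 = 5$, this value is compared to the critical value 11.07 at $\alpha = 0.05$.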


Example 2

hard
A 2×2 table: Men: 30 prefer A, 20 prefer B. Women: 15 prefer A, 35 prefer B. Test independence of gender and preference at $\alpha = 0.05$.
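One way to approach this example is to compute each expected count as (row total × column total) / grand total and then apply the chi-square formula cell by cell. The sketch below assumes no continuity correction is applied; the function names are hypothetical, and 3.841 is the standard table critical value for $df = 1$ at $\alpha = 0.05$.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (statistic, df, expected) for an r x c table of raw counts."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


# Rows: men, women. Columns: prefer A, prefer B.
table = [[30, 20], [15, 35]]
statistic, df, expected = chi_square_independence(table)
critical_value = 3.841  # table value, df = 1, alpha = 0.05

print(f"expected = {expected}")
print(f"chi-square = {statistic:.2f}, df = {df}, reject H0: {statistic > critical_value}")
```

Under these assumptions every expected count is 22.5 or 27.5, and the statistic is about 9.09, which exceeds 3.841, so the sketch rejects independence of gender and preference.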

Example 3

medium
Observed counts: $40, 30, 30$. Expected: $33.3, 33.3, 33.3$. Compute $\chi^2$.
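A worked sketch of the arithmetic, using the rounded expected count 33.3 exactly as given in the problem:

```latex
\chi^2 = \frac{(40-33.3)^2}{33.3} + 2 \cdot \frac{(30-33.3)^2}{33.3}
       = \frac{44.89 + 2(10.89)}{33.3}
       = \frac{66.67}{33.3} \approx 2.00
```

Using the unrounded expected count $100/3$ instead gives exactly $\chi^2 = 2$.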

Common Mistakes

  • Dividing by Observed instead of Expected - the formula is $\frac{(O-E)^2}{E}$, scaling each gap by the expected count.
  • Using percentages or means as the data - chi-square requires raw category counts, not proportions or averages.
  • Using the wrong degrees of freedom - goodness-of-fit uses $df = k - 1$; a two-way independence test uses $df = (r-1)(c-1)$.
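The first two mistakes can be seen numerically. The sketch below reuses the die counts from the Quick Example; the tenfold-larger sample is a hypothetical data set constructed to have identical proportions, and all variable names are illustrative.

```python
observed_60 = [8, 12, 11, 7, 13, 9]            # 60 rolls, 10 expected per face
observed_600 = [10 * o for o in observed_60]   # hypothetical: same proportions, 600 rolls

# Correct: divide each squared gap by the EXPECTED count.
stat_60 = sum((o - 10) ** 2 / 10 for o in observed_60)
stat_600 = sum((o - 100) ** 2 / 100 for o in observed_600)

# Mistake: dividing by the OBSERVED count gives a different number.
wrong_stat_60 = sum((o - 10) ** 2 / o for o in observed_60)

print(f"60 rolls:  chi-square = {stat_60:.2f}")
print(f"600 rolls: chi-square = {stat_600:.2f}")
print(f"dividing by observed (wrong): {wrong_stat_60:.2f}")
```

The proportions are the same in both samples, yet the statistic is ten times larger for 600 rolls (28 versus 2.8). This is why the test needs raw counts: percentages alone discard the sample size that the statistic depends on.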

Why This Formula Matters

It's the standard test for categorical data, where a mean-based t-test doesn't apply because there's nothing to average. Knowing that large squared deviations relative to expected counts drive a large $\chi^2$ is what connects 'the counts look off' to a p-value-backed conclusion about fairness or association. The identifying question is: "Are the data counts of cases falling into categories, being compared to expected counts (rather than means or continuous values)?" Recognizing the test by that question, rather than by familiar numbers, is what lets a student tell it apart from a two-sample t-test, a two-proportion z-test, and correlation / LSRL in a mixed problem set.

Frequently Asked Questions

What is the Chi-Square Test formula?

A hypothesis test that compares observed frequencies to expected frequencies using the chi-square statistic to assess independence or goodness of fit.

How do you use the Chi-Square Test formula?

You expect a die to land on each face about $\frac{1}{6}$ of the time. You roll it 600 times and compare what you observed to what you expected. If the differences are small, the die is probably fair. If they're large, something is off. The chi-square statistic measures 'how far off are the observed counts from what we expected?'

What do the symbols mean in the Chi-Square Test formula?

$\chi^2$ is the test statistic. Degrees of freedom: goodness-of-fit $df = k - 1$, where $k$ is the number of categories; independence/homogeneity $df = (r-1)(c-1)$, where $r$ and $c$ are the numbers of rows and columns in the table.

Why is the Chi-Square Test formula important in Math?

It's the standard test for categorical data, where a mean-based t-test doesn't apply because there's nothing to average. Knowing that large squared deviations relative to expected counts drive a large $\chi^2$ is what connects 'the counts look off' to a p-value-backed conclusion about fairness or association. The identifying question is: "Are the data counts of cases falling into categories, being compared to expected counts (rather than means or continuous values)?" Recognizing the test by that question, rather than by familiar numbers, is what lets a student tell it apart from a two-sample t-test, a two-proportion z-test, and correlation / LSRL in a mixed problem set.

What do students get wrong about Chi-Square Test?

The procedure for the chi-square test is the easy part; the trap is dividing by Observed instead of Expected. Asking "Are the data counts of cases falling into categories, being compared to expected counts (rather than means or continuous values)?" first is what keeps a correct-looking calculation from being attached to the wrong concept.

What should I learn before the Chi-Square Test formula?

Before studying the Chi-Square Test formula, you should understand hypothesis testing, p-values, and probability.