Chi-Square Test Formula
The Formula
When to use: Use a chi-square test when your data are counts in categories and you want to compare the observed counts with the counts expected under a hypothesis, such as a claimed distribution or the independence of two categorical variables.
Quick Example
Notation
What This Formula Means
A family of hypothesis tests that use the chi-square statistic to compare observed frequencies to expected frequencies. The three main types are: goodness-of-fit (does data match a claimed distribution?), test of independence (are two categorical variables related?), and test of homogeneity (do different populations have the same distribution?).
You expect a die to land on each face about \frac{1}{6} of the time. You roll it 600 times and compare what you observed to what you expected. If the differences are small, the die is probably fair. If they're large, something is off. The chi-square statistic measures 'how far off are the observed counts from what we expected?'
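The "how far off" idea can be written as a single sum over the categories. The form below matches the statistic used in the worked example on this page; the index i and the subscripts on O and E are notation added here for clarity, not something the page itself defines.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
```

Here k is the number of categories, O_i is the observed count in category i, and E_i is the count expected under the null hypothesis. For the 600-roll die, each E_i would be 600/6 = 100. Dividing each squared difference by E_i puts the deviations on a common scale, so a gap of 10 counts matters more when only 20 were expected than when 200 were.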
Formal View
Worked Examples
Example 1
medium
Solution
- Step 1: Expected count under H_0 (fair die, 60 rolls): E = 60/6 = 10 for each outcome
- Step 2: With observed counts 8, 12, 9, 11, 13, 7: \chi^2 = \sum \frac{(O-E)^2}{E} = \frac{(8-10)^2}{10} + \frac{(12-10)^2}{10} + \frac{(9-10)^2}{10} + \frac{(11-10)^2}{10} + \frac{(13-10)^2}{10} + \frac{(7-10)^2}{10}
- Step 3: = \frac{4+4+1+1+9+9}{10} = \frac{28}{10} = 2.8
- Step 4: df = 6-1 = 5; critical value \chi^2_{0.05,5} = 11.07; since 2.8 < 11.07, fail to reject H_0, so the data give no significant evidence that the die is unfair
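The arithmetic in the solution steps above can be checked with a few lines of Python. This is a minimal sketch: the observed counts and the critical value 11.07 come from the example itself, while the variable names are illustrative choices.

```python
# Observed counts for faces 1-6, taken from the worked example (60 rolls).
observed = [8, 12, 9, 11, 13, 7]

# Under H_0 (fair die) every face has the same expected count.
n_rolls = sum(observed)
expected = [n_rolls / len(observed)] * len(observed)

# Chi-square statistic: sum of (O - E)^2 / E over all categories.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom for a goodness-of-fit test: categories minus one.
df = len(observed) - 1

# Critical value for alpha = 0.05 and df = 5, as quoted in the example.
CRITICAL_VALUE = 11.07
reject_h0 = chi_sq > CRITICAL_VALUE

print(f"chi-square = {chi_sq:.2f}, df = {df}, reject H_0: {reject_h0}")
# prints: chi-square = 2.80, df = 5, reject H_0: False
```

The critical value is hard-coded so the sketch needs no statistics library; in practice it would usually be looked up in a table or computed by software.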
Answer
Example 2
hard
Common Mistakes
- Using chi-square on numerical (continuous) data instead of categorical (count) data.
- Forgetting to check that all expected counts are at least 5—small expected counts make the chi-square approximation unreliable.
- Confusing independence and homogeneity tests—they use the same formula but ask different questions and arise from different study designs.
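The expected-count check in the second mistake above is easy to automate before running a test. The helper below is a hypothetical sketch (the function name and the 24-roll scenario are assumptions for illustration); it only flags categories whose expected count falls below the threshold of 5 mentioned above.

```python
def small_expected_cells(expected, minimum=5):
    """Return (index, expected count) pairs whose expected count is below `minimum`."""
    return [(i, e) for i, e in enumerate(expected) if e < minimum]


# Hypothetical design: 24 rolls of a fair die give an expected count of 4 per face.
expected_small = [24 / 6] * 6
# The worked example (60 rolls) gives an expected count of 10 per face.
expected_ok = [60 / 6] * 6

print(small_expected_cells(expected_small))  # every face is flagged
print(small_expected_cells(expected_ok))     # nothing is flagged
```

A non-empty result suggests collecting more data or merging sparse categories before trusting the chi-square approximation.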
Why This Formula Matters
Essential for analyzing categorical data: survey responses, genetics ratios, market research categories, and any situation where you compare proportions across groups.
Frequently Asked Questions
What is the Chi-Square Test formula?
The chi-square statistic is \chi^2 = \sum \frac{(O-E)^2}{E}, where O is an observed count and E is the corresponding expected count. It underlies a family of hypothesis tests that compare observed frequencies to expected frequencies. The three main types are: goodness-of-fit (does data match a claimed distribution?), test of independence (are two categorical variables related?), and test of homogeneity (do different populations have the same distribution?).
How do you use the Chi-Square Test formula?
Compute the expected count E for each category under H_0, sum \frac{(O-E)^2}{E} over all categories to get \chi^2, find the degrees of freedom, and compare \chi^2 to the critical value. For example, you expect a die to land on each face about \frac{1}{6} of the time, so in 600 rolls you expect about 100 of each face. If the observed counts differ only slightly from that, \chi^2 is small and the die is probably fair. If they differ a lot, \chi^2 is large and something is off.
What do the symbols mean in the Chi-Square Test formula?
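\chi^2 is the test statistic, O is an observed count, and E is the corresponding expected count. Degrees of freedom: goodness-of-fit df = k - 1, where k is the number of categories; independence/homogeneity df = (r-1)(c-1), where r and c are the numbers of rows and columns in the table of counts.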
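The two degrees-of-freedom rules can be captured in a pair of small helpers. This is a sketch with hypothetical function names; the input checks are an added assumption that each test needs at least two categories, or two rows and two columns.

```python
def df_goodness_of_fit(k):
    """Degrees of freedom for a goodness-of-fit test with k categories."""
    if k < 2:
        raise ValueError("need at least two categories")
    return k - 1


def df_contingency(r, c):
    """Degrees of freedom for an independence or homogeneity test on an r x c table."""
    if r < 2 or c < 2:
        raise ValueError("need at least two rows and two columns")
    return (r - 1) * (c - 1)


print(df_goodness_of_fit(6))  # six die faces -> 5
print(df_contingency(2, 3))   # hypothetical 2 x 3 table -> 2
```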
Why is the Chi-Square Test formula important in Math?
Essential for analyzing categorical data: survey responses, genetics ratios, market research categories, and any situation where you compare proportions across groups.
What do students get wrong about Chi-Square Test?
Students struggle to distinguish the three types: a goodness-of-fit test compares one variable against a claimed distribution; a test of independence examines the relationship between two variables in one sample; a test of homogeneity compares the same variable across multiple populations.
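For a table of counts, the independence and homogeneity tests share the same arithmetic and differ only in how the data were collected. The sketch below assumes the standard expected count for each cell, row total times column total divided by the grand total, which this page does not state explicitly. The function name and the 2 x 2 counts are hypothetical.

```python
def independence_chi_square(table):
    """Return (chi-square statistic, df) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_sq += (observed - expected) ** 2 / expected

    # df = (r - 1)(c - 1) for an r x c table.
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df


# Hypothetical counts: rows are two groups, columns are two response categories.
table = [[20, 30],
         [30, 20]]
chi_sq, df = independence_chi_square(table)
print(chi_sq, df)  # prints: 4.0 1
```

In this made-up table every expected count is 25, so each of the four cells contributes (5^2)/25 = 1 to the statistic.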
What should I learn before the Chi-Square Test formula?
Before studying the Chi-Square Test formula, you should understand hypothesis testing, p-values, and probability.