Chi-Square Test

Statistics
process

Also known as: χ² test, chi-squared test

Grade 9-12

Definition

A hypothesis test that compares observed frequencies to expected frequencies using the chi-square statistic to assess independence or goodness of fit.

💡 Intuition

You expect a die to land on each face about \frac{1}{6} of the time. You roll it 600 times and compare what you observed to what you expected. If the differences are small, the die is probably fair. If they're large, something is off. The chi-square statistic measures 'how far off are the observed counts from what we expected?'

🎯 Core Idea

Chi-square tests work with categorical data summarized as counts, not numerical measurements. The larger the \chi^2 value, the further the observed counts are from what H_0 predicts, and the stronger the evidence against H_0.

Example

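Roll a die 60 times. Expected: 10 per face. Observed: 8, 12, 11, 7, 13, 9. \chi^2 = \frac{(8-10)^2}{10} + \frac{(12-10)^2}{10} + \cdots + \frac{(9-10)^2}{10} = \frac{28}{10} = 2.8. With df = 6 - 1 = 5, the critical value at the 0.05 significance level is 11.07. Since 2.8 < 11.07, the result is not significant: we fail to reject H_0, and the data are consistent with a fair die.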
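The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name chi_square_statistic is made up for illustration, and the 5% significance level (with its table critical value of 11.070 for df = 5) is an assumption, since other levels use other cutoffs.

```python
# Die example from the text: 60 rolls, 10 expected per face.
observed = [8, 12, 11, 7, 13, 9]
expected = [10] * 6


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1  # goodness of fit: k - 1

# Assumption: 5% significance level. 11.070 is the standard table
# critical value for df = 5 at that level.
CRITICAL_VALUE_DF5_ALPHA_05 = 11.070

reject_h0 = chi2 > CRITICAL_VALUE_DF5_ALPHA_05
print(f"chi-square = {chi2:.1f}, df = {df}, reject H0: {reject_h0}")
```

The statistic comes out well below the critical value, so H_0 (a fair die) is not rejected.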

Formula

\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}

Notation

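\chi^2 is the test statistic. Degrees of freedom: goodness-of-fit df = k - 1, where k is the number of categories; independence/homogeneity df = (r-1)(c-1), where r and c are the numbers of rows and columns in the two-way table.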

🌟 Why It Matters

Essential for analyzing categorical data: survey responses, genetics ratios, market research categories, and any situation where you compare proportions across groups.

💭 Hint When Stuck

Compute each cell's contribution: \frac{(\text{observed} - \text{expected})^2}{\text{expected}}, then sum them all. Compare the total to a chi-square table with the correct degrees of freedom. A total larger than the critical value is evidence against H_0: the claimed distribution does not fit (goodness of fit) or the variables are not independent (independence).

Formal View

\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, where O_i is the observed count and E_i is the expected count in category i; df = k - 1 (goodness of fit) or (r-1)(c-1) (independence or homogeneity).
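For an independence test, the expected counts are not given in advance; they are usually computed from the table margins as (row total × column total) / grand total. The sketch below illustrates this with a hypothetical 2×3 table of counts. The data are invented for illustration and are not from this page.

```python
# Hypothetical 2x3 table of counts (rows: two groups, columns: three answers).
table = [
    [20, 30, 50],
    [30, 30, 40],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum (O - E)^2 / E over every cell of the table.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)

# Independence/homogeneity: df = (r - 1)(c - 1).
df = (len(table) - 1) * (len(table[0]) - 1)

print(f"chi-square = {chi2:.3f}, df = {df}")
```

Here every expected count is at least 5, and the resulting statistic would be compared to a chi-square table with df = 2.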

🚧 Common Stuck Point

Students struggle to distinguish the three types:

  • A goodness-of-fit test compares one variable against a claimed distribution.
  • A test of independence checks whether two variables measured on one sample are related.
  • A test of homogeneity compares the distribution of the same variable across multiple populations.

⚠️ Common Mistakes

  • Using chi-square on numerical (continuous) data instead of categorical (count) data.
  • Forgetting to check that all expected counts are at least 5—small expected counts make the chi-square approximation unreliable.
  • Confusing independence and homogeneity tests—they use the same formula but ask different questions and arise from different study designs.
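The expected-count check in the second mistake above is easy to automate before running the test. A minimal sketch, assuming a goodness-of-fit setup with a hypothetical sample size and claimed proportions; the helper name small_expected_counts is made up for illustration.

```python
def small_expected_counts(expected, minimum=5):
    """Return (index, value) pairs for expected counts below the minimum."""
    return [(i, e) for i, e in enumerate(expected) if e < minimum]


# Hypothetical setup: 40 observations and a claimed distribution.
n = 40
claimed_proportions = [0.5, 0.3, 0.15, 0.05]
expected = [n * p for p in claimed_proportions]

problems = small_expected_counts(expected)
if problems:
    print("Expected counts below 5 at category indexes:",
          [i for i, _ in problems])
else:
    print("All expected counts are at least 5.")
```

In this hypothetical case the last category has an expected count of about 2, so the chi-square approximation would be unreliable without a larger sample or merged categories.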

Frequently Asked Questions

What is the Chi-Square Test in Math?

A hypothesis test that compares observed frequencies to expected frequencies using the chi-square statistic to assess independence or goodness of fit.

What is the Chi-Square Test formula?

\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}

When do you use the Chi-Square Test?

Use it when the data are counts in categories and you want to compare observed frequencies to expected ones: to test one variable against a claimed distribution (goodness of fit), to test whether two variables in one sample are related (independence), or to compare the same variable across multiple populations (homogeneity). All expected counts should be at least 5.

How Chi-Square Test Connects to Other Ideas

To understand the chi-square test, you should first be comfortable with hypothesis testing, p-values, and probability.