What Is a Chi-Square Test?
Karl Pearson published the chi-square test in 1900 in Philosophical Magazine. He needed a test for a question nobody had a clean answer to: do these observed counts match what you expected?
The math is straightforward: for each category, take observed minus expected, square it, divide by expected, and sum across all categories. The larger that sum is relative to the test's degrees of freedom, the harder it is to blame the mismatch on sampling noise.
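As a minimal sketch of that calculation, the Python function below sums (observed minus expected) squared over expected. The function name and the three-category counts are hypothetical, chosen only for illustration.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (illustrative, not real data).
observed = [18, 22, 20]
expected = [20, 20, 20]

# Contributions per category: 4/20 + 4/20 + 0/20
statistic = chi_square_statistic(observed, expected)
print(statistic)
```

A category whose observed count equals its expected count contributes nothing, so only the mismatched categories move the statistic.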
Chi-square tests show up everywhere from genetics to marketing because so much of the data people actually collect is categorical: survey responses, treatment outcomes, product preferences, demographic breakdowns. Continuous measurements get means and t-tests, but the moment your data falls into buckets, chi-square is the test that answers whether the distribution across those buckets is what you expected or whether something more interesting is going on.
Types of Chi-Square Tests
Goodness of Fit
Tests whether one categorical variable matches a theoretical distribution. The classic example is rolling a die 600 times and checking whether each face shows up roughly 100 times — if some faces appear far more or less often than expected, the goodness of fit test catches it. You can also test against any custom expected distribution, not just uniform.
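χ² = Σ((Oᵢ - Eᵢ)² / Eᵢ), df = k - 1, where Oᵢ and Eᵢ are the observed and expected counts in category i and k is the number of categories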
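The die example can be sketched in a few lines of Python. The roll counts below are made up for illustration, and the helper name goodness_of_fit is hypothetical. Instead of computing a p-value, the sketch compares the statistic with 11.070, the standard table critical value for df = 5 at the 0.05 level.

```python
def goodness_of_fit(observed, probabilities):
    """Return (chi-square statistic, degrees of freedom) for one variable."""
    n = sum(observed)
    # Expected counts come from the hypothesized probabilities,
    # which need not be uniform.
    expected = [n * p for p in probabilities]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return statistic, df


# Hypothetical results of 600 rolls, faces 1 through 6 (not real data).
rolls = [95, 108, 92, 110, 97, 98]
fair_die = [1 / 6] * 6

statistic, df = goodness_of_fit(rolls, fair_die)

# Table critical value for df = 5 at alpha = 0.05.
CRITICAL_05_DF5 = 11.070
reject_fairness = statistic > CRITICAL_05_DF5
print(statistic, df, reject_fairness)
```

With these counts the statistic is about 2.66, well under the critical value, so the rolls give no evidence against a fair die. To test a custom distribution, pass different probabilities that sum to 1. Libraries such as SciPy provide this test as scipy.stats.chisquare, which also returns a p-value.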
Test of Independence
Tests whether two categorical variables in a contingency table are related or independent. You observe a table of counts — say treatment group versus recovery status — and the test tells you whether the pattern of counts suggests the variables are linked or whether random chance could explain everything you see.
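χ² = ΣΣ((Oᵢⱼ - Eᵢⱼ)² / Eᵢⱼ), df = (r-1)(c-1), where Eᵢⱼ = (row i total × column j total) / N, N is the total count, and r and c are the numbers of rows and columns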
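A minimal Python sketch of the independence test follows, assuming a hypothetical 2×2 table of treatment group versus recovery status. Expected counts are derived from the row and column totals under the assumption of independence. The p-value shortcut used here is valid only for df = 1, where the chi-square tail probability equals erfc(√(χ²/2)).

```python
from math import erfc, sqrt


def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for a table."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


# Hypothetical counts (not real data).
# Rows: treatment, control. Columns: recovered, not recovered.
table = [[30, 20], [18, 32]]

statistic, df, expected = chi_square_independence(table)

# Tail probability for df = 1 only; larger tables need a general
# chi-square survival function, which this sketch does not implement.
p_value = erfc(sqrt(statistic / 2)) if df == 1 else None
print(statistic, df, p_value)
```

This sketch applies no continuity correction. SciPy's scipy.stats.chi2_contingency applies Yates' correction to 2×2 tables by default, so its result for the same table will differ unless correction=False is passed.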
Assumptions and Limitations
- Expected frequency ≥ 5: As a rule of thumb, every expected (not observed) cell count should be at least 5 for the chi-square approximation to hold. When expected counts drop below that threshold, Fisher's exact test (published 1934) gives exact probabilities without relying on the approximation.
- Independent observations: Each observation must fall into exactly one cell and be independent of every other observation — repeated measures on the same subjects violate this assumption.
- Categorical data only: Chi-square works on counts of observations in categories, not on continuous measurements. If your data is measured on an interval or ratio scale, you need a different test unless you first bin the values into categories.
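The expected-count check and the fallback to Fisher's exact test can be sketched as follows for a 2×2 table. The counts are hypothetical, and the two-sided p-value uses one common convention: sum the probabilities of every table with the same margins that is no more likely than the observed one. The function names are illustrative, not taken from any library.

```python
from math import comb


def smallest_expected_count(table):
    """Smallest expected cell count under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return min(r * c / n for r in row_totals for c in col_totals)


def fisher_exact_2x2(table):
    """Two-sided Fisher exact p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical small-sample table (not real data).
table = [[3, 1], [1, 3]]

use_exact_test = smallest_expected_count(table) < 5
p_value = fisher_exact_2x2(table) if use_exact_test else None
print(use_exact_test, p_value)
```

Every expected count in this table is 2, so the rule of thumb fails and the exact test is used. SciPy offers the same test as scipy.stats.fisher_exact.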
Frequently Asked Questions
What is a chi-square test?
It compares how often things actually happen to how often you expected them to happen. Karl Pearson published it in 1900 specifically for this purpose — whenever your data is counts in categories rather than continuous measurements, chi-square is the standard first step.
When should I use goodness of fit vs test of independence?
Goodness of fit is for one variable — does this die look fair, do these survey responses match the expected proportions. Test of independence is for two variables — are treatment group and outcome related, or could the pattern in this contingency table be explained by chance alone.
What happens when expected counts are below 5?
The chi-square approximation breaks down with small expected counts because the distribution no longer matches the theoretical chi-square curve closely enough. Fisher's exact test handles small samples without relying on this approximation — it calculates exact probabilities directly.