When to Use Fisher’s Exact Test vs. Chi-Square for 2×2 Tables
Understanding which statistical test to use for a 2×2 contingency table often determines whether your conclusions are valid. This article explains the assumptions, strengths, limitations, and practical guidance for choosing between Fisher’s exact test and the chi-square test (specifically Pearson’s chi-square) when analyzing 2×2 tables. It includes examples, decision rules, and notes on computation and interpretation.
What each test assesses
- Fisher’s exact test computes the exact probability of observing a table as extreme as (or more extreme than) the observed table under the null hypothesis of independence, conditioning on the fixed marginal totals.
- The chi-square test approximates the sampling distribution of the test statistic by a chi-square distribution; it evaluates whether observed cell counts deviate from expected counts under independence.
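To make "exact probability, conditioning on the marginal totals" concrete, here is a minimal sketch of Fisher’s exact test written with only the Python standard library. The function names and the small demonstration table are illustrative assumptions, not part of any package, and the two-sided rule used (sum every table that is no more probable than the observed one) is one common convention rather than the only possible definition.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of the table [[a, b], [c, d]]
    given its row and column totals, under independence."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_two_sided(a, b, c, d):
    """Two-sided exact p-value: sum the probabilities of every table with
    the same margins that is no more probable than the observed one."""
    row1, col1, n = a + b, a + c, a + b + c + d
    p_obs = table_probability(a, b, c, d)
    total = 0.0
    # Once the margins are fixed, the top-left cell k determines the whole table.
    for k in range(max(0, row1 + col1 - n), min(row1, col1) + 1):
        p = table_probability(k, row1 - k, col1 - k, n - row1 - col1 + k)
        if p <= p_obs * (1 + 1e-7):  # small tolerance for floating-point ties
            total += p
    return min(total, 1.0)


# Hypothetical table, for illustration only.
print(fisher_two_sided(3, 1, 1, 3))
```

No distributional approximation appears anywhere in the calculation, which is why the result does not depend on the sample being large.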
Key assumptions
Fisher’s exact test
- No large-sample approximation; provides exact p-values.
- Assumes fixed marginal sums (the test conditions on row and column totals).
- Applicable regardless of sample size or small expected counts.
Chi-square test (Pearson)
- Relies on large-sample approximation: the distribution of the test statistic approximates chi-square.
- Expected cell counts should generally be sufficiently large (common rules: all expected counts ≥ 5, or at least 80% of cells ≥ 5 and none < 1).
- Observations should be independent.
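For contrast, the large-sample approach can be sketched in a few lines. This is an illustrative standard-library version, not a library API: the function name and the demonstration table are assumptions, it applies no continuity correction, and it assumes every row and column total is positive. For a 2×2 table the statistic has 1 degree of freedom, in which case the chi-square tail probability equals erfc(sqrt(x / 2)).

```python
from math import erfc, sqrt


def pearson_chi_square(table):
    """Return (statistic, p_value) for a 2x2 table, without continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return statistic, erfc(sqrt(statistic / 2))


# Hypothetical table with all expected counts equal to 25.
stat, p = pearson_chi_square([[20, 30], [30, 20]])
print(f"chi-square = {stat:.3f}, p = {p:.4f}")
```

The p-value here is only as good as the chi-square approximation, which is what the expected-count rules above are meant to protect.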
When to prefer Fisher’s exact test
- Small sample sizes: Especially when one or more expected cell counts are small (commonly taken to mean below 5).
- Rare events: If one outcome is rare, Fisher’s exact avoids the approximation errors of chi-square.
- Exact inference required: Clinical trials or regulatory settings sometimes require exact p-values.
- Unbalanced margins: Situations with very unequal row/column totals where approximation may be poor.
Rule of thumb: Use Fisher’s exact test when expected cell counts are small or sample sizes are small.
When the chi-square test is appropriate
- Large samples: With moderate to large sample sizes where expected counts meet recommended thresholds.
- Computational simplicity: Chi-square is computationally simpler and widely available.
- Approximate inference acceptable: Exploratory analyses or large surveys where tiny approximation error is negligible.
Rule of thumb: Use the chi-square test when all expected cell counts are sufficiently large (commonly ≥ 5).
Practical decision rule (quick checklist)
- Calculate expected counts for all four cells.
- If any expected count < 1, do not use chi-square.
- If more than 20% of cells have expected counts < 5, prefer Fisher’s exact. In a 2×2 table each cell is 25% of the table, so this means any expected count < 5.
- Otherwise, chi-square is acceptable.
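The checklist above can be sketched as a small helper. The function names and return strings are illustrative assumptions; the thresholds are the ones listed in the checklist, and the code assumes every row and column total is positive.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return [[row * col / n for col in (a + c, b + d)] for row in (a + b, c + d)]


def recommend_test(table):
    """Apply the quick checklist to a 2x2 table of observed counts."""
    cells = [e for row in expected_counts(table) for e in row]
    if min(cells) < 1:
        return "fisher"  # chi-square should not be used at all
    # One cell is 25% of a 2x2 table, so the 20% rule reduces to
    # "any expected count below 5".
    if any(e < 5 for e in cells):
        return "fisher"
    return "chi-square"


print(recommend_test([[2, 8], [10, 30]]))    # smallest expected count is 2.4
print(recommend_test([[20, 30], [25, 25]]))  # all expected counts exceed 5
```

The two branches that return "fisher" are kept separate only to mirror the checklist; for a 2×2 table they lead to the same recommendation.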
Example comparisons
Observed 2×2 table:
| | Outcome A | Outcome B | Row total |
|---|---|---|---|
| Group 1 | 2 | 8 | 10 |
| Group 2 | 10 | 30 | 40 |
| Column total | 12 | 38 | 50 |
- Expected counts (row total × column total / grand total):
  - Group 1, Outcome A: 10 × 12 / 50 = 2.4
  - Group 1, Outcome B: 10 × 38 / 50 = 7.6
  - Group 2, Outcome A: 40 × 12 / 50 = 9.6
  - Group 2, Outcome B: 40 × 38 / 50 = 30.4
Because one expected count is below 5 (2.4), Fisher’s exact test is recommended.
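To see what the two tests actually report for this table, the sketch below computes both with the Python standard library. It is an illustration under stated assumptions: the two-sided exact p-value sums all tables no more probable than the observed one (one common convention), and the chi-square p-value is uncorrected.

```python
from math import comb, erfc, sqrt

observed = [[2, 8], [10, 30]]  # the example table
(a, b), (c, d) = observed
row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

# Exact conditional distribution of the Group 1 / Outcome A cell,
# given the row and column totals.
support = range(max(0, row1 + col1 - n), min(row1, col1) + 1)
pmf = {k: comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1) for k in support}

# Two-sided Fisher p-value: all tables no more probable than the observed one.
fisher_p = min(1.0, sum(p for p in pmf.values() if p <= pmf[a] * (1 + 1e-7)))

# Pearson chi-square (no continuity correction), 1 degree of freedom.
expected = [[r * col / n for col in (col1, n - col1)] for r in (row1, row2)]
chi2 = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
chi2_p = erfc(sqrt(chi2 / 2))

print(f"Fisher two-sided p = {fisher_p:.4f}")
print(f"Chi-square = {chi2:.4f}, p = {chi2_p:.4f}")
```

In this table the observed count of 2 is the single most probable value given the margins, so the two-sided exact p-value is 1, while the uncorrected chi-square p-value is about 0.74. Neither test suggests an association here; the point is that the two p-values can differ noticeably when an expected count is small.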
One-tailed vs two-tailed tests
- Fisher’s exact test can be performed as one-tailed or two-tailed. The two-tailed version requires a careful definition of “as extreme” because the conditional distribution of tables is generally asymmetric, so the two tails cannot simply be mirrored; most statistical software reports a two-sided p-value, often the sum of the probabilities of all tables with the same margins that are no more probable than the observed table.
- Chi-square test is inherently two-sided (tests for any departure from independence). For directional hypotheses, consider whether a one-tailed exact test is appropriate and justify it a priori.
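A one-tailed exact p-value is simply one tail of the conditional (hypergeometric) distribution of the top-left cell. The following standard-library sketch illustrates this; the function name and the "less"/"greater" labels are assumptions chosen for the example, where "less" means the top-left count is smaller than independence would suggest.

```python
from math import comb


def fisher_one_sided(a, b, c, d, alternative):
    """One-tailed exact p-value for the table [[a, b], [c, d]].

    alternative="less":    P(top-left cell <= a | margins)
    alternative="greater": P(top-left cell >= a | margins)
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)

    def pmf(k):
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    if alternative == "less":
        tail = range(lo, a + 1)
    elif alternative == "greater":
        tail = range(a, hi + 1)
    else:
        raise ValueError("alternative must be 'less' or 'greater'")
    return sum(pmf(k) for k in tail)


# Applied to the example table from this article.
print(fisher_one_sided(2, 8, 10, 30, "less"))
print(fisher_one_sided(2, 8, 10, 30, "greater"))
```

Both tails include the observed table, so the two one-sided p-values add up to more than 1; the direction must be fixed before looking at the data.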
Continuity correction (Yates’ correction)
- For 2×2 tables, some recommend Yates’ continuity correction applied to the chi-square statistic to reduce approximation error for small samples. This correction reduces type I error but can be overly conservative.
- Fisher’s exact test avoids the need for such corrections.
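The effect of the continuity correction can be seen in a short sketch. It uses only the standard library and a hypothetical table; the function name is an assumption, and the correction is implemented by shrinking each |observed − expected| by 0.5 but never past zero, which is one common way of applying Yates’ correction.

```python
from math import erfc, sqrt


def chi_square_2x2(table, yates=False):
    """Return (statistic, p_value) for a 2x2 table, optionally Yates-corrected."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals, col_totals = (a + b, c + d), (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff -= min(0.5, diff)  # shrink toward zero, never below it
            statistic += diff ** 2 / expected
    # 1 degree of freedom: tail probability is erfc(sqrt(x / 2)).
    return statistic, erfc(sqrt(statistic / 2))


table = [[8, 2], [3, 9]]  # hypothetical counts, for illustration only
print(chi_square_2x2(table))              # uncorrected
print(chi_square_2x2(table, yates=True))  # corrected: smaller statistic, larger p
```

The corrected statistic is never larger than the uncorrected one, which is where both the reduced type I error and the extra conservatism come from.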
Computational notes and software
- Fisher’s exact test is available in R (fisher.test), Python (scipy.stats.fisher_exact), Stata (the exact option of tabulate), and most statistical packages.
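- Chi-square in R: chisq.test. Heed its warning that the approximation may be incorrect when expected counts are small; simulate.p.value=TRUE gives a Monte Carlo p-value instead, and note that correct=TRUE (Yates’ correction) is the default for 2×2 tables.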
- For large samples, chi-square is faster; for small samples Fisher is fast enough for typical 2×2 tables.
Power and sample size considerations
- Because the conditional distribution is discrete, Fisher’s exact test is often conservative: its actual type I error rate can fall well below the nominal level, which can reduce power relative to the uncorrected chi-square test. For planned studies, perform power/sample-size calculations appropriate to the chosen test.
- When possible, plan sample size to avoid very small expected counts so you can use asymptotic methods reliably.
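For small designs, the rejection rate of Fisher’s exact test can be computed exactly by enumerating every possible outcome of two independent binomial groups. The sketch below is a rough illustration, not a substitute for a proper power analysis: the group sizes, event probabilities, and function names are hypothetical, and the two-sided p-value uses the "no more probable than observed" convention.

```python
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided exact p-value for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def pmf(k):
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_obs = pmf(a)
    support = range(max(0, row1 + col1 - n), min(row1, col1) + 1)
    return min(1.0, sum(p for p in map(pmf, support) if p <= p_obs * (1 + 1e-7)))


def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def rejection_rate(n1, n2, p1, p2, alpha=0.05):
    """Probability that the test rejects, given true event rates p1 and p2."""
    rate = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            if fisher_two_sided(x1, n1 - x1, x2, n2 - x2) <= alpha:
                rate += binom_pmf(x1, n1, p1) * binom_pmf(x2, n2, p2)
    return rate


# Hypothetical design: 10 subjects per group.
print("actual type I error:", rejection_rate(10, 10, 0.3, 0.3))
print("power:", rejection_rate(10, 10, 0.2, 0.7))
```

With equal true rates the first number is the test’s actual type I error rate, which cannot exceed the nominal 0.05; how far it falls below that level is a direct measure of the conservatism described above.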
Summary guidance
- Small expected counts or small total sample → use Fisher’s exact test.
- Large samples with adequate expected counts → chi-square is acceptable and efficient.
- Consider one-tailed exact tests only with strong directional hypotheses declared in advance.