# Fisher's Exact Test for Independence

## To test whether two categorical variables are independent

Yao Yao on June 9, 2015

Fisher’s Exact Test is named after its inventor, Sir R. A. Fisher. It belongs to a class of exact tests, so called because the significance of the deviation from the null hypothesis (i.e., the p-value) can be calculated exactly. Many other statistical tests instead rely on an approximation that becomes exact only in the limit as the sample size grows to infinity.

Fisher’s Exact Test has exactly the same hypotheses and purpose as the Chi-Square Test: both test whether two categorical variables are independent.

• $H_0$: variable $A$ and variable $B$ are independent.
• $H_a$: variable $A$ and variable $B$ are not independent.

• The Chi-Square Test places a requirement on the sample size: it must not be too small.
• Fisher’s Exact Test remains applicable when the sample size is small.

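For a 2×2 contingency table $\begin{bmatrix}a & b \newline c & d \end{bmatrix}$ with $n = a+b+c+d$, the probability of observing exactly this table, given fixed row and column totals, is:

$p = \frac{ {a+b \choose a} {c+d \choose c} }{ {n \choose a+c} } = \frac{(a+b)! (c+d)! (a+c)!(b+d)!}{a!b!c!d!n!}$
• Because $a+c = n - (b+d)$, the denominator can be written as either ${n \choose a+c}$ or ${n \choose b+d}$; the two are equal.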
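
As a rough illustration of the formula above, here is a minimal Python sketch that evaluates it with the standard library's `math.comb`. The function name `table_probability` is made up for this example, and the table $\begin{bmatrix}1 & 9 \newline 11 & 3 \end{bmatrix}$ is the one used in the worked example in this post.

```python
from math import comb


def table_probability(a, b, c, d):
    """Probability of the 2x2 table [[a, b], [c, d]] with all margins fixed.

    Hypothetical helper: a direct transcription of the binomial-coefficient
    form of the formula, not an optimized implementation.
    """
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


# Table [[1, 9], [11, 3]]: 10 * 364 / 2704156
print(table_probability(1, 9, 11, 3))
```

Using integer binomial coefficients and dividing only once at the end avoids the large intermediate factorials of the second form of the formula.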

To generate a significance level, we need consider only the cases where the marginal totals are the same as in the observed table, and among those, only the cases where the arrangement is as extreme as the observed arrangement, or more so. (Barnard’s test relaxes this constraint on one set of the marginal totals.)

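• The observed table: $\begin{bmatrix}1 & 9 \newline 11 & 3 \end{bmatrix}$, which gives $p_1 = 0.001346076$
• The more extreme table in the same direction: $\begin{bmatrix}0 & 10 \newline 12 & 2 \end{bmatrix}$, which gives $p_2 = 0.000033652$
• The equally extreme table in the opposite direction: $\begin{bmatrix}9 & 1 \newline 3 & 11 \end{bmatrix}$, which gives $p_3 = p_1 = 0.001346076$
• The more extreme table in the opposite direction: $\begin{bmatrix}10 & 0 \newline 2 & 12 \end{bmatrix}$, which gives $p_4 = p_2 = 0.000033652$
• If we consider only the two tables in the same direction, we are doing a one-tailed test. The resulting $\operatorname{p-value} = p_1 + p_2 = 0.001379728$ is also called the one-tailed p-value or 1-sided p-value.
• If we consider both directions, we are doing a two-tailed test. The resulting $\operatorname{p-value} = p_1 + p_2 + p_3 + p_4 = 0.002759456$ is also called the two-tailed p-value or 2-sided p-value.
• In general we should do a two-tailed test, unless there is an objective reason not to (for example, the opposite-direction cases cannot occur).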
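
The enumeration above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `fisher_exact_2x2` is hypothetical, the one-tailed value sums the tables whose top-left cell is at most the observed one (the direction used in this example), and the two-tailed value uses one common convention, summing every table with the same margins whose probability does not exceed that of the observed table. For a table with symmetric margins like this one, that convention matches the "both directions" sum described above.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Return (one_tailed, two_tailed) p-values for the table [[a, b], [c, d]].

    Hypothetical helper for illustration; assumptions are noted inline.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability of the table whose top-left cell is x, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo = max(0, col1 - row2)  # smallest feasible top-left cell
    hi = min(row1, col1)      # largest feasible top-left cell
    p_obs = prob(a)

    # Assumption: "same direction" means top-left cell <= observed value.
    one_tailed = sum(prob(x) for x in range(lo, a + 1))

    # Assumption: "as extreme or more so" means probability <= observed,
    # with a small relative tolerance for floating-point ties.
    two_tailed = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7)
    )
    return one_tailed, two_tailed


one_tailed, two_tailed = fisher_exact_2x2(1, 9, 11, 3)
print(one_tailed, two_tailed)
```

Up to rounding, the two printed values should agree with the one-tailed and two-tailed p-values computed by hand above.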