# Asymptotics: The Law of Large Numbers and The Central Limit Theorem


Yao Yao on September 10, 2014

## 0. Asymptotics

Asymptotics ([æsɪmpˈtɒtɪks]) is simply about the properties of statistics as $\text{number of trials} \rightarrow + \infty$.

## 1. The Law of Large Numbers

### 1.1 Definition

There are many variations on the LLN; we are using a particularly lazy version here.

The law of large numbers states that if $X_1, \ldots, X_n$ are iid from a population with mean $\mu$ and variance $\sigma^2$, then $\overline{X}$, the sample average of the $n$ observations, converges in probability to $\mu$, i.e.

\begin{align} \overline{X} &= \frac{1}{n}(X_1+\cdots+X_n) \newline \overline{X} &\overset{P}{\to} \mu \text{ as } n \to \infty \end{align}

Or more generally, the average of the results obtained from a large number of trials (i.e. $n$ trials, one observation per trial) should be close to the expected value, and tends to get closer as more trials are performed.

### 1.2 Simulation

```r
n <- 10000
means <- cumsum(rnorm(n)) / (1:n) ## cumsum = cumulative sum, e.g. cumsum(c(1,2,3)) = c(1,3,6)
plot(1:n, means, type = "l", lwd = 2, frame = FALSE, ylab = "cumulative means", xlab = "sample size")
abline(h = 0)
```

### 1.3 Consistency and Bias of an estimator

• An estimator is consistent if it converges to what you want to estimate, i.e. $\hat{X} \to X$
• Consistency is neither necessary nor sufficient for one estimator to be better than another
• The LLN basically states that the sample mean is consistent
• The sample variance and the sample standard deviation are consistent as well
• An estimator is unbiased if the expected value of the estimator is what it's trying to estimate, i.e. $E[\hat{X}] = X$
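As a quick sanity check of consistency (a sketch; the sample sizes are chosen arbitrarily), we can watch the sample variance converge to the true $\sigma^2 = 1$ as $n$ grows:

```r
# Sketch: the sample variance is consistent for sigma^2 (= 1 for standard normals)
set.seed(42)
for (n in c(10, 1000, 100000)) {
  x <- rnorm(n)  # n iid draws from N(0, 1)
  cat(sprintf("n = %6d, sample variance = %.4f\n", n, var(x)))
}
```

The printed estimates should wander closer and closer to 1 as $n$ increases.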

## 2. The Central Limit Theorem

### 2.1 Definition

The CLT says that for large $n$ the sample mean is approximately normally distributed:

$\overline{X} \overset{\text{approx}}{\sim} \mbox{N}(\mu, \frac{\sigma^2}{n}) \text{ as } n \to \infty$

In other words,

\begin{align} \frac{\overline X - \mu}{\sigma / \sqrt{n}} &= \frac{\mbox{Estimate} - \mbox{Mean of estimate}}{\mbox{Std. Err. of estimate}} \newline & \to \mbox{N}(0,1) \text{ as } n \to \infty \end{align}
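A quick simulation sketch of this (the sample size, number of replications, and the choice of a uniform population are all arbitrary): standardize means of $\mbox{Unif}(0,1)$ draws and check that they behave like $\mbox{N}(0,1)$:

```r
# Sketch: standardized means of Unif(0,1) draws are approximately N(0, 1).
# For Unif(0, 1): mu = 1/2, sigma^2 = 1/12.
set.seed(1)
n <- 40  # observations per sample
z <- replicate(10000, {
  x <- runif(n)
  (mean(x) - 0.5) / (sqrt(1 / 12) / sqrt(n))  # (Xbar - mu) / (sigma / sqrt(n))
})
c(mean = mean(z), sd = sd(z))  # should come out close to 0 and 1
```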

### 2.2 Confidence intervals

$[\overline{X} - \frac{2\sigma}{\sqrt n}, \overline{X} + \frac{2\sigma}{\sqrt n}]$ is called an approximate 95% confidence interval for $\mu$: by the CLT, $\overline{X}$ falls within 2 standard errors of $\mu$ about 95% of the time (2 is a convenient round-up of the exact quantile 1.96).
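We can check the claimed coverage by simulation (a sketch; $\mu$, $\sigma$, and $n$ are made up, and since 2 is slightly larger than 1.96 the coverage comes out a touch above 0.95):

```r
# Sketch: how often does Xbar +/- 2 * sigma / sqrt(n) cover the true mu?
set.seed(7)
mu <- 5; sigma <- 2; n <- 50
half <- 2 * sigma / sqrt(n)  # half-width of the interval
covered <- replicate(10000, {
  x <- rnorm(n, mean = mu, sd = sigma)
  (mean(x) - half <= mu) && (mu <= mean(x) + half)
})
mean(covered)  # should be close to 0.95
```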

### 2.3 Apply CLT to Bernoulli estimators

\begin{align} \because \sigma^2 &= p(1 - p) \newline \therefore \frac{2\sigma}{\sqrt n} &= 2 \sqrt{\frac{p(1 - p)}{n}} \newline \because p(1-p) &\leq \frac{1}{4}, \text{for}\, 0 \leq p \leq 1 \newline \therefore \frac{2\sigma}{\sqrt n} &= 2 \sqrt{\frac{p(1 - p)}{n}} \leq 2 \sqrt{\frac{1}{4n}} = \frac{1}{\sqrt{n}} \end{align}

$\therefore \overline X \pm \frac{1}{\sqrt{n}}$ is a quick CI estimate for $p$ (since $\mu = p$ in Bernoulli)
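For example (the counts here are made up for illustration), with 60 successes in 100 Bernoulli trials we can compare the quick interval against the usual Wald interval that plugs in $\hat p(1-\hat p)$:

```r
# Sketch: the quick "+/- 1/sqrt(n)" interval vs. the Wald interval for p
p_hat <- 60 / 100; n <- 100
quick <- p_hat + c(-1, 1) / sqrt(n)                           # conservative
wald  <- p_hat + c(-1, 1) * 2 * sqrt(p_hat * (1 - p_hat) / n) # plug-in variance
rbind(quick, wald)
```

The quick interval is always at least as wide, since $p(1-p) \leq 1/4$.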

#### Exercise I

What is the probability of getting 45 or fewer heads out of 100 flips of a fair coin? (Use the CLT, not the exact binomial calculation.)

• $\mu = p = 0.5$
• $\sigma^2 = p(1-p) = 0.25$, so $\frac{\sigma}{\sqrt{100}} = \frac{0.5}{10} = 0.05$
• $\overline X = \frac{45}{100} = 0.45$
```r
pnorm(0.45, mean = 0.5, sd = 0.05)
##  0.1586553
```
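For comparison (not required by the exercise), the exact binomial answer is somewhat larger, because the normal approximation without a continuity correction undershoots here:

```r
# CLT approximation vs. exact binomial probability of <= 45 heads in 100 flips
pnorm(0.45, mean = 0.5, sd = 0.05)  # CLT approximation
pbinom(45, size = 100, prob = 0.5)  # exact binomial, a bit larger
```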


#### Exercise II

Your campaign advisor told you that in a random sample of 100 likely voters, 56 intend to vote for you. Can you relax? Do you have this race in the bag?

• $\overline X = \frac{56}{100} = 0.56$
• $\frac{1}{\sqrt{100}} = 0.1$
• an approximate 95% interval for $p$ is $[0.46, 0.66]$, which dips below 0.5
• Not enough for you to relax, better go do more campaigning!
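The same arithmetic in R:

```r
# Quick 95% interval for the true support p: Xbar +/- 1/sqrt(n)
p_hat <- 56 / 100; n <- 100
p_hat + c(-1, 1) / sqrt(n)  # the interval includes values below 0.5
```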

### 2.4 Calculate Poisson interval with R

A nuclear pump failed 5 times in 94.32 days of operation. Give a 95% confidence interval for the failure rate per day (i.e. $\lambda$).

```r
poisson.test(5, T = 94.32)$conf
##  0.01721 0.12371
## attr(,"conf.level")
##  0.95
```
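As a cross-check (a sketch using the large-sample normal approximation $\hat\lambda \pm 2\sqrt{\hat\lambda / t}$ rather than the exact test; with only 5 events the two intervals differ noticeably, especially at the lower end):

```r
# Sketch: large-sample CI for a Poisson rate, lambda_hat +/- 2 * sqrt(lambda_hat / t)
x <- 5; days <- 94.32
lambda_hat <- x / days  # estimated failures per day
lambda_hat + c(-1, 1) * 2 * sqrt(lambda_hat / days)
```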