
Sampling Distributions and Limit Theorems


Chapter 8, Sampling Distributions and Limit Theorems, delves into the crucial theoretical concepts that underpin almost all statistical inference. This chapter explains what happens to sample statistics, like the mean $\bar{X}$, when the sample size $n$ becomes very large, focusing on two foundational results: the Weak Law of Large Numbers (WLLN) and the Central Limit Theorem (CLT).


8.1 Setting the Stage: Multi-dimensional Continuous Variables

To study sample statistics, we first need a framework for discussing several random variables simultaneously. Since a sample $\{X_1, X_2, \dots, X_n\}$ consists of $n$ random variables, understanding their relationships is essential.

Concepts of Joint Distributions

  1. Joint Distribution Function ($F$): This function gives the probability that all variables fall below specified values: $F(x_1, x_2, \dots, x_n) = P(X_1 \leq x_1, X_2 \leq x_2, \dots, X_n \leq x_n)$

  2. Joint Density ($f$): For continuous variables, probability is found by integrating the joint density function $f(x_1, \dots, x_n)$ over the desired region.

  3. Independence is Key: If the variables are mutually independent (a fundamental assumption for most random samples), their joint density function is simply the product of their individual marginal densities (see the sketch below): $f(x_1, x_2, \dots, x_n) = f_{X_1}(x_1) f_{X_2}(x_2) \cdots f_{X_n}(x_n)$
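
To make the factorization concrete, here is a minimal Python sketch (the Normal population, its parameters, and the use of numpy/scipy are illustrative assumptions, not from the text). It evaluates an i.i.d. joint density as a product of marginals, working in log space for numerical stability:

```python
import numpy as np
from scipy import stats

# Illustrative setup: an i.i.d. Normal(mu, sigma) sample of size n.
rng = np.random.default_rng(0)
mu, sigma, n = 5.0, 2.0, 10
sample = rng.normal(mu, sigma, size=n)

# Independence => log f(x_1, ..., x_n) = sum_i log f_{X_i}(x_i),
# i.e. the joint density is the product of the marginal densities.
joint_log_density = stats.norm(mu, sigma).logpdf(sample).sum()
print("joint log-density:", joint_log_density)
```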

Order Statistics

When we observe a sample, arranging the values from smallest to largest gives us the order statistics: $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$.

Q1

CDF of the Maximum

Question: If $X_1, \dots, X_n$ are i.i.d. samples, what is the Cumulative Distribution Function (CDF) of the maximum value, $X_{(n)}$?


Solution: The maximum $X_{(n)}$ is less than or equal to $x$ if and only if all $X_i$ are less than or equal to $x$. Due to independence: $F_{(n)}(x) = P(X_{(n)} \leq x) = \prod_{i=1}^{n} P(X_i \leq x) = (F(x))^n$
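
A quick simulation illustrates the result. This sketch (assuming numpy; the Uniform population, $n = 5$, and $x = 0.8$ are arbitrary choices) compares the empirical CDF of the maximum of $n$ Uniform(0, 1) draws, for which $F(x) = x$, against $(F(x))^n = x^n$:

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps, x = 5, 100_000, 0.8

# Each row is one sample of n Uniform(0, 1) draws; take its maximum.
maxima = rng.uniform(0, 1, size=(reps, n)).max(axis=1)

print("simulated P(X_(n) <= 0.8):", (maxima <= x).mean())
print("theoretical (F(x))^n:     ", x ** n)   # 0.8**5 = 0.32768
```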


8.2 The Weak Law of Large Numbers (WLLN)

The WLLN is the first major limit theorem, providing formal confirmation of the common intuition that as you gather more data, the sample average gets closer to the true population average.

Concept: Convergence in Probability

💡

Convergence in Probability

The WLLN describes convergence in probability ($\xrightarrow{p}$). A sequence of random variables $X_n$ converges to $X$ in probability if, for any tiny distance $\epsilon > 0$, the chance that $X_n$ is further away from $X$ than $\epsilon$ eventually goes to zero as $n$ increases: $\lim_{n \to \infty} P(|X_n - X| > \epsilon) = 0$

The WLLN Theorem

If $X_1, X_2, \dots$ are i.i.d. random variables with finite mean $\mu$ and finite variance $\sigma^2$, then the sample mean $\bar{X}_n$ converges in probability to $\mu$: for every $\epsilon > 0$, $\lim_{n \to \infty} P(|\bar{X}_n - \mu| > \epsilon) = 0$

Proof Insight: Since the $X_i$ are i.i.d., $\mathrm{Var}[\bar{X}_n] = \sigma^2/n$. Applying Chebyshev's Inequality to $\bar{X}_n$: $P(|\bar{X}_n - \mu| > \epsilon) \leq \frac{\mathrm{Var}[\bar{X}_n]}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2}$. As $n \to \infty$, the upper bound $\sigma^2/(n\epsilon^2)$ goes to zero.
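
The bound can be watched numerically. A rough check (assuming numpy; the Exponential(1) population, $\epsilon = 0.1$, and the sample sizes are arbitrary) compares the empirical tail probability with the Chebyshev bound, using $\mu = \sigma^2 = 1$:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, var, eps, reps = 1.0, 1.0, 0.1, 2_000

for n in (100, 1_000, 10_000):
    # reps independent sample means, each from n Exponential(1) draws
    xbars = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
    empirical = np.mean(np.abs(xbars - mu) > eps)
    bound = var / (n * eps**2)        # Chebyshev: often loose, but -> 0
    print(f"n={n:6d}  empirical={empirical:.4f}  bound={bound:.4f}")
```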

Application: Sample Proportion

The WLLN formally validates the intuitive link between theoretical probability and observed frequency: the sample proportion $\hat{p}$ converges in probability to the true probability $p$.
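
A short simulation (assuming numpy; the seed and sample sizes are arbitrary) shows the observed frequency of heads in fair coin flips settling toward $p = 0.5$:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.5
for n in (10, 100, 10_000, 1_000_000):
    # Sample proportion = mean of n Bernoulli(p) indicator variables
    p_hat = rng.binomial(1, p, size=n).mean()
    print(f"n={n:9d}  p_hat={p_hat:.4f}")
```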


8.3 Convergence in Distribution

Convergence in distribution ($\xrightarrow{d}$) describes how the shape of the distribution of a sequence of random variables approaches a limiting shape.

The Power of Moment Generating Functions (MGFs)

The most practical way to prove convergence in distribution is often through Moment Generating Functions (MGFs).

MGF Convergence Theorem: If the MGFs $M_n(t)$ of a sequence $X_n$ exist and converge to $M(t)$ for all $t$ in a neighborhood of $0$, and $M(t)$ is the MGF of some random variable $X$, then $X_n$ converges in distribution to $X$.

Example: Proving that the Binomial approaches the Poisson when $n \to \infty$ and $p \to 0$ (with $np = \lambda$ held fixed) is done by showing that the limit of the Binomial MGF equals the Poisson MGF, $e^{\lambda(e^t - 1)}$.
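
As a sketch of that argument, substitute $p = \lambda/n$ into the Binomial MGF and apply the standard limit $(1 + x/n)^n \to e^x$:

$$M_n(t) = \left(1 - p + p e^t\right)^n = \left(1 + \frac{\lambda(e^t - 1)}{n}\right)^n \xrightarrow{n \to \infty} e^{\lambda(e^t - 1)}$$

which is exactly the Poisson($\lambda$) MGF.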


8.4 The Central Limit Theorem (CLT)

The Central Limit Theorem is arguably the most fundamental result in statistics, explaining why the Normal distribution appears so frequently.

The CLT Theorem

💡

Central Limit Theorem

Let $X_1, X_2, \dots$ be i.i.d. random variables with finite mean $\mu$ and finite variance $\sigma^2$. If we standardize the sample mean $\bar{X}_n$ (equivalently, the sample sum $S_n = X_1 + \cdots + X_n$):

$$Y_n = \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} = \frac{S_n - n\mu}{\sigma \sqrt{n}}$$

Then, as $n \to \infty$, $Y_n$ converges in distribution to the Standard Normal distribution, $Z \sim \text{Normal}(0, 1)$:

$$\frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \xrightarrow{d} Z$$

Insight: The sampling distribution of the sample mean approaches a Normal distribution, even if the original population distribution is highly non-normal.
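
This is easy to verify by simulation. The sketch below (assuming numpy/scipy; the Exponential(1) population, for which $\mu = \sigma = 1$, and the sample sizes are arbitrary) standardizes sample means from a heavily skewed population and compares a tail probability with the standard Normal value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
mu, sigma, reps = 1.0, 1.0, 100_000

for n in (2, 10, 100):
    xbars = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
    y = np.sqrt(n) * (xbars - mu) / sigma   # Y_n from the theorem
    print(f"n={n:4d}  P(Y_n <= 1) ~ {(y <= 1).mean():.4f}")

print("Phi(1)  =", round(stats.norm.cdf(1), 4))   # limiting value ~ 0.8413
```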

Application: Normal Approximation

Q2

Normal Approximation

Question: $Y \sim \text{Gamma}(100, 4)$ is the sum of $n = 100$ independent $\text{Exponential}(4)$ random variables. Approximate $P(20 < Y \leq 30)$.


Solution:

  1. Identify Parameters: For $\text{Exponential}(4)$, $\mu = 0.25$ and $\sigma = 0.25$.
  2. Sum Parameters: $E[S_{100}] = 100(0.25) = 25$ and $SD[S_{100}] = 0.25\sqrt{100} = 2.5$.
  3. Standardize: $P(20 < S_{100} \leq 30) \approx P\left( \frac{20 - 25}{2.5} < Z \leq \frac{30 - 25}{2.5} \right) = P(-2 < Z \leq 2)$
  4. Calculate: $P(-2 < Z \leq 2) \approx 0.9544$ (a numerical check against the exact Gamma probability follows below).
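
As that check, the sketch below (assuming scipy; note that scipy parametrizes the Gamma by shape and scale, so rate 4 becomes scale $= 1/4$) compares the exact probability with the CLT approximation:

```python
from scipy import stats

# Exact: Y ~ Gamma(shape=100, rate=4), i.e. scale = 0.25 in scipy's convention
exact = (stats.gamma.cdf(30, a=100, scale=0.25)
         - stats.gamma.cdf(20, a=100, scale=0.25))

# CLT approximation: P(-2 < Z <= 2)
approx = stats.norm.cdf(2) - stats.norm.cdf(-2)

print(f"exact Gamma probability: {exact:.4f}")
print(f"CLT approximation:       {approx:.4f}")   # ~ 0.9545
```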

Continuity Correction

When approximating discrete, integer-valued RVs (like the Binomial), extend the interval by 0.5 units: $P(a < S_n \leq b) \approx P\left( \frac{a + 0.5 - n\mu}{\sigma \sqrt{n}} < Z \leq \frac{b + 0.5 - n\mu}{\sigma \sqrt{n}} \right)$
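
A brief illustration (assuming scipy; the Binomial(100, 0.5) population and the interval $45 < S_n \leq 55$ are arbitrary choices) of the corrected approximation against the exact Binomial probability:

```python
from scipy import stats

n, p = 100, 0.5
a, b = 45, 55
mean, sd = n * p, (n * p * (1 - p)) ** 0.5   # n*mu = 50, sigma*sqrt(n) = 5

# Continuity-corrected Normal approximation of P(a < S_n <= b)
approx = (stats.norm.cdf((b + 0.5 - mean) / sd)
          - stats.norm.cdf((a + 0.5 - mean) / sd))

# Exact Binomial probability P(a < S_n <= b)
exact = stats.binom.cdf(b, n, p) - stats.binom.cdf(a, n, p)

print(f"corrected approx: {approx:.4f}")
print(f"exact binomial:   {exact:.4f}")   # the two agree closely here
```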

Race Car Analogy: The Weak Law of Large Numbers tells you the race car (the sample mean) will eventually finish the race at the true mean ($\mu$). The Central Limit Theorem describes how the car travels to that finish line, showing that its distribution around the finish line (when standardized) always follows the same predictable, bell-shaped path (the Normal distribution), regardless of how the race started.


All Chapters in this Book

Lesson 1

Basic Concepts

Foundational mathematical framework for probability, including definitions, axioms, conditional probability, and Bayes' Theorem.

Lesson 2

Sampling and Repeated Trials

Models based on repeated independent trials, focusing on Bernoulli trials and sampling methods.

Lesson 3

Discrete Random Variables

Formalizing random variables, probability mass functions, and independence.

Lesson 4

Summarizing Discrete Random Variables

Deriving numerical characteristics—expected value, variance, and standard deviation—to summarize behavior of discrete random variables.

Lesson 5

Continuous Probabilities and Random Variables

Transitioning from discrete sums to continuous integrals, density functions, and key distributions like Normal and Exponential.

Lesson 6

Summarizing Continuous Random Variables

Extending expected value and variance to continuous variables, exploring Moment Generating Functions and Bivariate Normal distributions.

Lesson 7

Sampling and Descriptive Statistics

Transitioning from probability to statistics: using sample data to estimate population parameters like mean and variance.

Lesson 8

Sampling Distributions and Limit Theorems

The theoretical foundations of inference: Joint Distributions, the Weak Law of Large Numbers (WLLN), and convergence in distribution via the Central Limit Theorem (CLT).

Lesson 9

Estimation and Hypothesis Testing

The core of statistical inference: Method of Moments, Maximum Likelihood, Confidence Intervals, and Hypothesis Testing.

Lesson 10

Linear Regression

Modeling linear relationships, least squares, and regression inference.
