Sampling Distributions and Limit Theorems
Chapter 8, Sampling Distributions and Limit Theorems, delves into the crucial theoretical concepts that underpin almost all statistical inference. This chapter explains what happens to sample statistics, like the sample mean ($\bar{X}_n$), when the sample size ($n$) becomes very large, focusing on two foundational results: the Weak Law of Large Numbers (WLLN) and the Central Limit Theorem (CLT).
8.1 Setting the Stage: Multi-dimensional Continuous Variables
To study sample statistics, we first need a framework for discussing several random variables simultaneously. Since a sample consists of $n$ random variables $X_1, \dots, X_n$, understanding their relationships is essential.
Concepts of Joint Distributions
- Joint Distribution Function ($F$): This function gives the probability that all variables fall at or below specified values: $F(x_1, \dots, x_n) = P(X_1 \le x_1, \dots, X_n \le x_n)$.
- Joint Density ($f$): For continuous variables, probability is found by integrating the joint density function over the desired region.
- Independence is Key: If the variables are mutually independent (a fundamental assumption for most random samples), their joint density function is simply the product of their individual marginal densities (verified numerically in the sketch after this list): $f(x_1, \dots, x_n) = f_{X_1}(x_1) \cdot f_{X_2}(x_2) \cdots f_{X_n}(x_n)$.
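As a quick sanity check of this factorization, here is a minimal Monte Carlo sketch (using NumPy; the Exponential(1) marginals and the evaluation point are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
trials = 200_000
# Two independent Exponential(1) draws (arbitrary choice of marginals)
x = rng.exponential(1.0, trials)
y = rng.exponential(1.0, trials)

a, b = 0.7, 1.2
# Under independence, the joint CDF factors into the product of marginals
joint = np.mean((x <= a) & (y <= b))
product = np.mean(x <= a) * np.mean(y <= b)
print(f"P(X<=a, Y<=b)   = {joint:.4f}")
print(f"P(X<=a)P(Y<=b)  = {product:.4f}")  # should agree up to simulation noise
```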
Order Statistics
When we observe a sample, arranging the values from smallest to largest gives us the order statistics, denoted $X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}$.
CDF of the Maximum
Question: If $X_1, \dots, X_n$ are i.i.d. samples with common CDF $F(x)$, what is the Cumulative Distribution Function (CDF) of the maximum value, $X_{(n)} = \max(X_1, \dots, X_n)$?
Solution: The maximum is less than or equal to $x$ if and only if all $X_i$ are less than or equal to $x$. Due to independence:

$$F_{X_{(n)}}(x) = P(X_1 \le x, \dots, X_n \le x) = [F(x)]^n$$
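This result is easy to verify by simulation. A short sketch for Uniform(0, 1) samples, where $F(x) = x$ and hence $P(X_{(n)} \le x) = x^n$ (the sample size and evaluation point are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 5, 100_000
x = 0.8

# Each row is one sample of size n; take the row-wise maximum
samples = rng.uniform(0, 1, size=(trials, n))
empirical = np.mean(samples.max(axis=1) <= x)
print(f"empirical P(max <= {x}) = {empirical:.5f}")
print(f"theoretical x**n        = {x**n:.5f}")  # 0.8**5 = 0.32768
```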
8.2 The Weak Law of Large Numbers (WLLN)
The WLLN is the first major limit theorem, providing formal confirmation of the common intuition that as you gather more data, the sample average gets closer to the true population average.
Concept: Convergence in Probability
The WLLN describes convergence in probability ($\xrightarrow{P}$). A sequence of random variables $X_n$ converges to a constant $c$ in probability if, for any tiny distance $\epsilon > 0$, the chance that $X_n$ is further away from $c$ than $\epsilon$ eventually goes to zero as $n$ increases:

$$\lim_{n \to \infty} P(|X_n - c| \ge \epsilon) = 0$$
The WLLN Theorem
If $X_1, X_2, \dots$ are i.i.d. random variables with finite mean $\mu$ and finite variance $\sigma^2$, then the sample mean $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ converges in probability to $\mu$:

$$\bar{X}_n \xrightarrow{P} \mu$$
Proof Insight: Applying Chebyshev's Inequality to $\bar{X}_n$ (which has mean $\mu$ and variance $\sigma^2/n$):

$$P(|\bar{X}_n - \mu| \ge \epsilon) \le \frac{\sigma^2}{n\epsilon^2}$$

As $n \to \infty$, the upper bound goes to zero.
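The bound can be watched shrinking numerically. A minimal sketch for Uniform(0, 1) data, where $\mu = 1/2$ and $\sigma^2 = 1/12$ (the tolerance $\epsilon$ and sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma2, eps = 0.5, 1 / 12, 0.1  # Uniform(0,1): mean 1/2, variance 1/12

for n in [10, 100, 1_000]:
    # 10,000 replications of a size-n sample mean
    xbar = rng.uniform(0, 1, size=(10_000, n)).mean(axis=1)
    actual = np.mean(np.abs(xbar - mu) >= eps)
    bound = sigma2 / (n * eps**2)  # Chebyshev upper bound
    print(f"n={n:>5}: P(|Xbar-mu| >= eps) = {actual:.4f}  <=  bound {bound:.4f}")
```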
Application: Sample Proportion
The WLLN formally validates the intuitive link between theoretical probability and observed frequency: the sample proportion ($\hat{p}_n$) converges in probability to the true probability ($p$), as the simulation below illustrates.
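A minimal sketch of this convergence (the true probability $p = 0.3$ is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(3)
p = 0.3  # true probability of the event (arbitrary for illustration)
hits = rng.random(100_000) < p  # 100,000 Bernoulli(p) trials

for n in [10, 100, 1_000, 10_000, 100_000]:
    # Sample proportion computed from the first n trials
    print(f"n={n:>7}: sample proportion = {hits[:n].mean():.4f}  (true p = {p})")
```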
8.3 Convergence in Distribution
Convergence in distribution ($\xrightarrow{d}$) describes how the shape of the distribution of a random variable sequence approaches a limiting shape: $X_n \xrightarrow{d} X$ means $F_{X_n}(x) \to F_X(x)$ at every point $x$ where $F_X$ is continuous.
The Power of Moment Generating Functions (MGFs)
The most practical way to prove convergence in distribution is often through Moment Generating Functions (MGFs).
MGF Convergence Theorem: If the MGFs $M_{X_n}(t)$ of a sequence exist and converge to $M(t)$ for all $t$ in a neighborhood of zero, and $M(t)$ is the MGF of some random variable $X$, then $X_n$ converges in distribution to $X$.
Example: Proving that Binomial($n, p$) approaches Poisson($\lambda$) when $n \to \infty$ and $p \to 0$ (with $np = \lambda$ held fixed) is done by showing the limit of the Binomial MGF $(1 - p + pe^t)^n$ equals the Poisson MGF $e^{\lambda(e^t - 1)}$.
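Concretely, substituting $p = \lambda/n$ into the Binomial MGF and applying the standard limit $(1 + a/n)^n \to e^a$:

$$\left(1 - \frac{\lambda}{n} + \frac{\lambda}{n}e^t\right)^n = \left(1 + \frac{\lambda(e^t - 1)}{n}\right)^n \;\longrightarrow\; e^{\lambda(e^t - 1)} \quad \text{as } n \to \infty.$$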
8.4 The Central Limit Theorem (CLT)
The Central Limit Theorem is arguably the most fundamental result in statistics, explaining why the Normal distribution appears so frequently.
The CLT Theorem
Let $X_1, X_2, \dots$ be i.i.d. random variables with finite mean $\mu$ and finite variance $\sigma^2 > 0$. If we standardize the sample mean $\bar{X}_n$:

$$Z_n = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}$$

Then, as $n \to \infty$, $Z_n$ converges in distribution to the Standard Normal distribution, $N(0, 1)$.
Insight: The sampling distribution of the sample mean approaches a Normal distribution, even if the original population distribution is highly non-normal.
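A minimal simulation of this insight, using a deliberately skewed Exponential(1) population (so $\mu = \sigma = 1$; the sample size and replication count are arbitrary):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
n, trials = 50, 100_000
mu, sigma = 1.0, 1.0  # Exponential(1): mean 1, standard deviation 1

# Standardized sample means from a highly non-normal population
xbar = rng.exponential(mu, size=(trials, n)).mean(axis=1)
z = (xbar - mu) / (sigma / np.sqrt(n))

print(f"empirical P(Z <= 1): {np.mean(z <= 1):.4f}")
print(f"Phi(1):              {norm.cdf(1):.4f}")  # about 0.8413
```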
Application: Normal Approximation
Question: $X = X_1 + \dots + X_n$ is the sum of $n$ independent Exponential(4) random variables. Approximate $P(X \le x)$ for a given threshold $x$.
Solution:
- Identify Parameters: For Exponential(4) (rate $\lambda = 4$), $\mu = 1/\lambda = 1/4$ and $\sigma^2 = 1/\lambda^2 = 1/16$.
- Sum Parameters: $E[X] = n\mu = n/4$ and $\text{Var}(X) = n\sigma^2 = n/16$.
- Standardize: $Z = \dfrac{X - n/4}{\sqrt{n}/4}$
- Calculate: $P(X \le x) \approx \Phi\!\left(\dfrac{x - n/4}{\sqrt{n}/4}\right)$, where $\Phi$ is the standard Normal CDF (a numeric check follows below).
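Since a sum of $n$ i.i.d. Exponential(rate $\lambda$) variables is exactly Gamma($n$, rate $\lambda$), the approximation can be checked against the exact CDF. A sketch with SciPy ($n = 100$ and $x = 26$ are arbitrary stand-ins for the specific values):

```python
import numpy as np
from scipy.stats import gamma, norm

lam, n = 4.0, 100   # rate and number of terms (arbitrary stand-ins)
x = 26.0            # threshold for P(X <= x)

mu, sd = n / lam, np.sqrt(n) / lam        # E[X] = 25, SD(X) = 2.5
exact = gamma.cdf(x, a=n, scale=1 / lam)  # sum of exponentials is Gamma(n, rate lam)
approx = norm.cdf((x - mu) / sd)          # CLT normal approximation
print(f"exact  P(X <= {x}) = {exact:.4f}")
print(f"normal approximation = {approx:.4f}")  # Phi(0.4), about 0.6554
```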
Continuity Correction
When approximating discrete, integer-valued RVs (like Binomial), extend the interval by 0.5 units in each direction:

$$P(a \le X \le b) \approx \Phi\!\left(\frac{b + 0.5 - \mu}{\sigma}\right) - \Phi\!\left(\frac{a - 0.5 - \mu}{\sigma}\right)$$
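A quick comparison of the exact Binomial probability against the approximation with and without the correction (the Binomial parameters and interval are arbitrary):

```python
import numpy as np
from scipy.stats import binom, norm

n, p = 40, 0.5  # arbitrary Binomial parameters
mu, sd = n * p, np.sqrt(n * p * (1 - p))
a, b = 18, 22   # approximate P(18 <= X <= 22)

exact = binom.cdf(b, n, p) - binom.cdf(a - 1, n, p)
plain = norm.cdf((b - mu) / sd) - norm.cdf((a - mu) / sd)
corrected = norm.cdf((b + 0.5 - mu) / sd) - norm.cdf((a - 0.5 - mu) / sd)
print(f"exact:              {exact:.4f}")
print(f"without correction: {plain:.4f}")
print(f"with correction:    {corrected:.4f}")  # noticeably closer to exact
```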
Race Car Analogy: The Weak Law of Large Numbers tells you the race car (the sample mean) will eventually converge and finish the race at the true mean ($\mu$). The Central Limit Theorem describes how the car travels to that finish line, showing that its distribution around the finish line (when standardized) always follows the same predictable, bell-shaped path (the Normal distribution), regardless of how the race started.