Sampling Distributions and Limit Theorems
Chapter 8, Sampling Distributions and Limit Theorems, delves into the crucial theoretical concepts that underpin almost all statistical inference. This chapter explains what happens to sample statistics, like the sample mean ($\bar{X}_n$), when the sample size ($n$) becomes very large, focusing on two foundational results: the Weak Law of Large Numbers (WLLN) and the Central Limit Theorem (CLT).
8.1 Setting the Stage: Multi-dimensional Continuous Variables
To study sample statistics, we first need a framework for discussing several random variables simultaneously. Since a sample consists of $n$ random variables $X_1, X_2, \dots, X_n$, understanding their relationships is essential.
Concepts of Joint Distributions
- Joint Distribution Function ($F(x_1, \dots, x_n)$): This function gives the probability that all variables fall below specified values:

  $F(x_1, \dots, x_n) = P(X_1 \le x_1, X_2 \le x_2, \dots, X_n \le x_n)$

- Joint Density ($f(x_1, \dots, x_n)$): For continuous variables, probability is found by integrating the joint density function over the desired region.
- Independence is Key: If the variables are mutually independent (a fundamental assumption for most random samples), their joint density function is simply the product of their individual marginal densities:

  $f(x_1, \dots, x_n) = f_{X_1}(x_1) \, f_{X_2}(x_2) \cdots f_{X_n}(x_n)$
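To make the product rule concrete, here is a minimal simulation sketch (assuming NumPy is available; two independent Exponential(1) variables are chosen purely for illustration) checking that a joint probability factors into the product of the marginal probabilities, as the density factorization implies:

    import numpy as np

    rng = np.random.default_rng(3)
    reps = 200_000

    # Two independent Exponential(1) variables.
    x = rng.exponential(1.0, size=reps)
    y = rng.exponential(1.0, size=reps)

    a, b = 1.0, 2.0
    joint = np.mean((x <= a) & (y <= b))           # simulated P(X <= a, Y <= b)
    product = np.mean(x <= a) * np.mean(y <= b)    # P(X <= a) * P(Y <= b)
    print(f"joint = {joint:.4f}, product of marginals = {product:.4f}")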
Order Statistics
When we observe a sample, arranging the values from smallest to largest gives us the order statistics $X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}$.
CDF of the Maximum
Question: If $X_1, \dots, X_n$ are i.i.d. samples with common CDF $F(x)$, what is the Cumulative Distribution Function (CDF) of the maximum value, $X_{(n)} = \max(X_1, \dots, X_n)$?
Solution: The maximum is less than or equal to $x$ if and only if all $X_i$ are less than or equal to $x$. Due to independence:

$F_{X_{(n)}}(x) = P(X_1 \le x, \dots, X_n \le x) = [F(x)]^n$
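As a quick numerical check, here is a minimal simulation sketch (assuming NumPy; Uniform(0, 1) samples are chosen because their CDF is simply $F(x) = x$, so the maximum has CDF $x^n$):

    import numpy as np

    rng = np.random.default_rng(0)
    n, trials = 5, 100_000

    # Draw `trials` samples of size n and record the maximum of each.
    maxima = rng.uniform(0.0, 1.0, size=(trials, n)).max(axis=1)

    for x in (0.5, 0.8, 0.95):
        empirical = np.mean(maxima <= x)   # simulated P(max <= x)
        theoretical = x ** n               # [F(x)]^n with F(x) = x
        print(f"x={x}: empirical = {empirical:.4f}, theoretical = {theoretical:.4f}")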
8.2 The Weak Law of Large Numbers (WLLN)
The WLLN is the first major limit theorem, providing formal confirmation of the common intuition that as you gather more data, the sample average gets closer to the true population average.
Concept: Convergence in Probability
The WLLN describes convergence in probability (written $Y_n \xrightarrow{P} c$). A sequence of random variables $Y_n$ converges to $c$ in probability if, for any tiny distance $\varepsilon > 0$, the chance that $Y_n$ is further away from $c$ than $\varepsilon$ eventually goes to zero as $n$ increases:

$\lim_{n \to \infty} P(|Y_n - c| > \varepsilon) = 0$
The WLLN Theorem
If $X_1, X_2, \dots, X_n$ are i.i.d. random variables with finite mean $\mu$ and finite variance $\sigma^2$, then the sample mean $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$ converges in probability to $\mu$:

$\bar{X}_n \xrightarrow{P} \mu$
Proof Insight: Since $\bar{X}_n$ has mean $\mu$ and variance $\sigma^2/n$, applying Chebyshev's Inequality to $\bar{X}_n$ gives

$P(|\bar{X}_n - \mu| \ge \varepsilon) \le \frac{\sigma^2}{n \varepsilon^2}.$

As $n \to \infty$, the upper bound goes to zero.
Application: Sample Proportion
The WLLN formally validates the intuitive link between theoretical probability and observed frequency. The sample proportion ($\hat{p}_n$) is itself a sample mean of Bernoulli trials, so it converges in probability to the true probability ($p$).
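Here is a minimal simulation sketch of this (assumptions: NumPy, and a fair coin with $p = 0.5$ chosen for illustration), comparing the observed deviation probability with the Chebyshev bound $\sigma^2/(n\varepsilon^2)$ from the proof insight above:

    import numpy as np

    rng = np.random.default_rng(1)
    p, eps, reps = 0.5, 0.05, 5_000
    sigma2 = p * (1 - p)          # variance of a single Bernoulli(p) trial

    for n in (100, 1_000, 10_000):
        # Total successes in n flips, repeated `reps` times.
        counts = rng.binomial(n, p, size=reps)
        p_hat = counts / n
        deviation_prob = np.mean(np.abs(p_hat - p) > eps)
        bound = sigma2 / (n * eps**2)
        print(f"n={n}: P(|p_hat - p| > {eps}) = {deviation_prob:.4f}, "
              f"Chebyshev bound = {bound:.4f}")

Both columns shrink toward zero as $n$ grows, though the Chebyshev bound is typically far from tight.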
8.3 Convergence in Distribution
Convergence in distribution (written $X_n \xrightarrow{d} X$) describes how the shape of the distribution of a random variable sequence approaches a limiting shape.
The Power of Moment Generating Functions (MGFs)
The most practical way to prove convergence in distribution is often through Moment Generating Functions (MGFs).
MGF Convergence Theorem: If the MGFs $M_{X_n}(t)$ of a sequence exist and converge to $M_X(t)$ for all $t$ near zero, and $M_X(t)$ is the MGF of some random variable $X$, then $X_n$ converges in distribution to $X$.
Example: Proving that Binomial($n, p$) approaches Poisson($\lambda$) when $n \to \infty$ and $np \to \lambda$ is done by showing the limit of the Binomial MGF $(1 - p + pe^t)^n$ equals the Poisson MGF $e^{\lambda(e^t - 1)}$.
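A small numeric sketch (assuming NumPy; $\lambda = 3$ and $t = 0.5$ are arbitrary illustrative choices) makes this limit visible by setting $p = \lambda/n$ and letting $n$ grow:

    import numpy as np

    lam, t = 3.0, 0.5
    poisson_mgf = np.exp(lam * (np.exp(t) - 1.0))

    for n in (10, 100, 1_000, 10_000):
        p = lam / n
        binomial_mgf = (1.0 - p + p * np.exp(t)) ** n
        print(f"n={n}: Binomial MGF = {binomial_mgf:.6f} "
              f"(Poisson MGF = {poisson_mgf:.6f})")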
8.4 The Central Limit Theorem (CLT)
The Central Limit Theorem is arguably the most fundamental result in statistics, explaining why the Normal distribution appears so frequently.
The CLT Theorem
Let $X_1, \dots, X_n$ be i.i.d. random variables with finite mean $\mu$ and finite variance $\sigma^2$. If we standardize the sample mean $\bar{X}_n$:

$Z_n = \frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}}$

Then, as $n \to \infty$, $Z_n$ converges in distribution to the Standard Normal distribution, $N(0, 1)$.
Insight: The sampling distribution of the sample mean approaches a Normal distribution, even if the original population distribution is highly non-normal.
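A minimal simulation sketch (assumptions: NumPy and the standard library; Exponential(1) data chosen precisely because it is strongly skewed, with $\mu = \sigma = 1$) shows the standardized mean matching the standard normal CDF $\Phi$:

    import math
    import numpy as np

    rng = np.random.default_rng(2)
    n, reps = 50, 100_000

    # Means of heavily skewed Exponential(1) samples, standardized.
    samples = rng.exponential(1.0, size=(reps, n))
    z = (samples.mean(axis=1) - 1.0) / (1.0 / math.sqrt(n))

    for q in (-1.0, 0.0, 1.0):
        empirical = np.mean(z <= q)
        phi = 0.5 * (1.0 + math.erf(q / math.sqrt(2.0)))  # standard normal CDF
        print(f"P(Z <= {q:+.1f}): empirical = {empirical:.4f}, Phi = {phi:.4f}")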
Application: Normal Approximation
Question: $S_n = X_1 + \dots + X_n$ is the sum of $n$ independent Exponential(4) random variables. Approximate $P(S_n \le s)$.
Solution:
- Identify Parameters: For Exponential(4), $\mu = 1/4$ and $\sigma^2 = 1/16$.
- Sum Parameters: $E[S_n] = n\mu = n/4$ and $\mathrm{Var}(S_n) = n\sigma^2 = n/16$.
- Standardize: $P(S_n \le s) = P\left( \frac{S_n - n/4}{\sqrt{n}/4} \le \frac{s - n/4}{\sqrt{n}/4} \right)$
- Calculate: By the CLT, $P(S_n \le s) \approx \Phi\left( \frac{s - n/4}{\sqrt{n}/4} \right)$, read from the standard normal table.
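As a sketch of this calculation in code (assuming SciPy; the values $n = 40$ and $s = 11$ are hypothetical stand-ins, not taken from the original problem), we can also compare against the exact answer, since a sum of $n$ i.i.d. Exponential(4) variables is Gamma($n$, rate 4):

    import math
    from scipy import stats

    # Hypothetical illustrative values; substitute your own n and threshold s.
    n, rate, s = 40, 4.0, 11.0
    mu, var = n / rate, n / rate**2     # E[S_n] = n/4, Var(S_n) = n/16

    z = (s - mu) / math.sqrt(var)
    approx = stats.norm.cdf(z)                        # CLT approximation
    exact = stats.gamma.cdf(s, a=n, scale=1 / rate)   # exact Gamma(n, 1/4) CDF

    print(f"z = {z:.3f}, CLT approx = {approx:.4f}, exact = {exact:.4f}")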
Continuity Correction
When approximating a discrete, integer-valued RV (like Binomial) by a continuous Normal, extend the interval by 0.5 units in each direction:

$P(a \le X \le b) \approx P(a - 0.5 \le Y \le b + 0.5)$, where $Y$ is the approximating Normal random variable.
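A short sketch of the effect (assuming SciPy; Binomial(30, 0.4) and the event $X \le 10$ are illustrative choices) shows the corrected approximation sitting much closer to the exact value:

    import math
    from scipy import stats

    # Illustrative parameters: X ~ Binomial(30, 0.4), target P(X <= 10).
    n, p, k = 30, 0.4, 10
    mu, sd = n * p, math.sqrt(n * p * (1 - p))

    exact = stats.binom.cdf(k, n, p)
    plain = stats.norm.cdf((k - mu) / sd)             # no correction
    corrected = stats.norm.cdf((k + 0.5 - mu) / sd)   # continuity-corrected

    print(f"exact = {exact:.4f}, plain = {plain:.4f}, corrected = {corrected:.4f}")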
Race Car Analogy: The Weak Law of Large Numbers tells you the race car (the sample mean) will eventually converge and finish the race at the true mean ($\mu$). The Central Limit Theorem describes how the car travels to that finish line, showing that its distribution around the finish line (when standardized) always follows the same predictable, bell-shaped path (the Normal distribution), regardless of how the race started.