Estimation and Hypothesis Testing


Chapter 9, Estimation and Hypothesis Testing, marks the bridge from theoretical probability and the study of sample statistics (Chapters 7 and 8) to the core application of statistical inference. This chapter explores methods for leveraging observed data to make formal statements, educated guesses, or concrete judgments about unknown characteristics of the underlying population distribution.


9.1 The Core Problem: Estimating Unknown Parameters

In statistics, we often assume an observed sample ($X_1, X_2, \dots, X_n$) comes from a distribution with an unknown shape or parameter(s) ($p_1, p_2, \dots, p_d$, denoted collectively as $p$). The goal of estimation is to use the sample data to provide a best guess for these unknown parameters.

Concept: Point Estimator

💡

Point Estimator

A point estimator is a function, $g(X_1, X_2, \dots, X_n)$, that takes the sample values as input and produces a single value (the estimate) intended to approximate the true parameter(s).

Example: If $X$ has an unknown mean $\mu$, the sample mean $\bar{X}$ is an unbiased estimator of $\mu$.
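As a quick sketch of this idea, the snippet below draws a sample from an assumed Normal population (the true mean of 5.0 is an assumption for illustration) and computes the sample mean as a point estimate:

```python
import random

random.seed(42)

# Hypothetical population: Normal with mu = 5.0, sigma = 2.0 (assumed values)
mu_true = 5.0
n = 10_000
sample = [random.gauss(mu_true, 2.0) for _ in range(n)]

# Point estimator g(X1, ..., Xn) = sample mean
mu_hat = sum(sample) / n
```

With a sample this large, `mu_hat` lands close to the true mean of 5.0, which is exactly what unbiasedness (plus the law of large numbers) leads us to expect.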


9.2 Method of Moments

The Method of Moments relies on matching sample moments to population moments.

Concept

  1. Sample Moment: $m_k = \frac{1}{n} \sum_{i=1}^{n} X_i^k$
  2. Population Moment: $\mu_k = E[X^k]$ (a function of the unknown parameters).
  3. Solve: Set $m_k = \mu_k$ for $k = 1, \dots, d$ and solve for the parameters.

Example: Normal Distribution ($\mu, \sigma^2$)

  • $\hat{\mu} = \bar{X}$
  • $\hat{\sigma}^2 = \frac{1}{n} \sum X_i^2 - (\bar{X})^2$
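The two estimators above can be sketched directly from the first two sample moments. In this illustration the true parameters (mu = 3.0, sigma = 1.5) are assumed so the estimates can be checked against them:

```python
import random

random.seed(0)

# Hypothetical data from Normal(mu = 3.0, sigma = 1.5); true values are assumptions
mu_true, sigma_true = 3.0, 1.5
n = 50_000
xs = [random.gauss(mu_true, sigma_true) for _ in range(n)]

# First sample moment m1 matches E[X] = mu
m1 = sum(xs) / n
# Second sample moment m2 matches E[X^2] = mu^2 + sigma^2
m2 = sum(x * x for x in xs) / n

mu_hat = m1
sigma2_hat = m2 - m1 ** 2   # sigma^2 = E[X^2] - (E[X])^2
```

Solving the two moment equations gives exactly the closed forms in the bullets: the sample mean for $\hat{\mu}$, and the second sample moment minus the squared mean for $\hat{\sigma}^2$.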

9.3 Maximum Likelihood Estimate (MLE)

The MLE method determines the parameter value $\hat{p}$ that makes the observed data “most likely”.

Concept: Likelihood Function

💡

Likelihood Function

The likelihood function, $L(p)$, is the joint probability (or density) of the sample, viewed as a function of the parameters:

$L(p; X_1, \dots, X_n) = \prod_{i=1}^{n} f(X_i \mid p)$

The MLE $\hat{p}$ is obtained by maximising this function (or, more commonly, $\ln L$).

Example: For Bernoulli($p$), the MLE is the sample proportion $\hat{p} = \bar{X}$.
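A minimal numerical check of this fact, using an assumed Bernoulli sample with 7 successes out of 10: a grid search over the log-likelihood should find its maximum at the sample proportion 0.7.

```python
import math

# Hypothetical Bernoulli sample (values chosen for illustration): 7 successes / 10
xs = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
n, s = len(xs), sum(xs)

def log_likelihood(p):
    # ln L(p) = s * ln(p) + (n - s) * ln(1 - p)
    return s * math.log(p) + (n - s) * math.log(1 - p)

# Grid search over (0, 1); the maximiser should be the sample proportion s/n
grid = [i / 1000 for i in range(1, 1000)]
p_mle = max(grid, key=log_likelihood)
```

Since the Bernoulli log-likelihood is concave, the grid point closest to the analytical maximum $s/n = 0.7$ wins, matching $\hat{p} = \bar{X}$.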


9.4 Confidence Intervals

A confidence interval provides a range of values within which the true unknown parameter is expected to lie with a specified confidence level ($\beta$).

Construction

  1. Known $\sigma$ (Z-interval): Uses the Standard Normal distribution ($Z$): $Z = \frac{\sqrt{n}(\bar{X} - \mu)}{\sigma}$

  2. Unknown $\sigma$ (T-interval): Uses the t-distribution ($t_{n-1}$) and the sample standard deviation $S$: $T = \frac{\sqrt{n}(\bar{X} - \mu)}{S}$

Key Difference: T-intervals are typically wider than Z-intervals, reflecting the extra uncertainty of estimating σ\sigma.
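The width difference can be sketched numerically. The sample summary below ($n = 16$, $\bar{X} = 10.2$, $\sigma = S = 3.0$) is assumed for illustration; the t critical value for 15 degrees of freedom is the tabulated 2.131, since the standard library has no t-distribution quantile:

```python
import math
from statistics import NormalDist

# Hypothetical sample summary (assumed values)
n, x_bar = 16, 10.2
sigma = 3.0            # known population sigma for the Z-interval
s = 3.0                # sample standard deviation for the T-interval

# 95% Z-interval: x_bar +/- z_{0.975} * sigma / sqrt(n)
z = NormalDist().inv_cdf(0.975)            # ~1.96
half_width_z = z * sigma / math.sqrt(n)
z_interval = (x_bar - half_width_z, x_bar + half_width_z)

# 95% T-interval: x_bar +/- t_{15, 0.975} * S / sqrt(n)
t = 2.131                                  # tabulated quantile, 15 df
half_width_t = t * s / math.sqrt(n)
t_interval = (x_bar - half_width_t, x_bar + half_width_t)
```

Even with identical data, `half_width_t > half_width_z`: the heavier tails of $t_{15}$ encode the extra uncertainty from estimating $\sigma$ with $S$.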


9.5 Hypothesis Testing

Hypothesis testing is a procedure to judge the plausibility of a conjecture about a parameter.

Core Concepts

  • Null Hypothesis ($H_0$): The baseline statement (e.g., $\mu = 10$).
  • Alternate Hypothesis ($H_a$): The contradicting statement (e.g., $\mu \neq 10$).
  • Significance Level ($\alpha$): The threshold for rejection (e.g., 0.05).
  • P-value: The probability of seeing data this extreme if $H_0$ were true.

Standardized Tests

| Test Name | Purpose | Test Statistic | Null Hypothesis |
| --- | --- | --- | --- |
| Z-test | Test $\mu$ ($\sigma$ known / $n$ large) | $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ | $\mu = \mu_0$ |
| T-test | Test $\mu$ ($\sigma$ unknown) | $T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$ | $\mu = \mu_0$ |
| $\chi^2$-test | Test $\sigma^2$ | $W = \frac{(n-1)S^2}{\sigma_0^2}$ | $\sigma = \sigma_0$ |
| F-test | Compare two variances | $R = \frac{S_1^2}{S_2^2}$ | $\sigma_1 = \sigma_2$ |

Example: Z-test on Sample Mean

Q1

Hypothesis Test

Scenario: Normal population, $\sigma = 3.0$, $n = 16$, $\bar{X} = 10.2$. Test $H_0: \mu = 9.5$ vs $H_a: \mu > 9.5$ at $\alpha = 0.05$.

📝 Detailed Solution
  1. Test Statistic: $Z_{\text{obs}} = \frac{\sqrt{16}(10.2 - 9.5)}{3.0} \approx 0.933$
  2. P-value: $P(Z \geq 0.933) \approx 0.175$.
  3. Conclusion: Since $0.175 > 0.05$, we do not reject $H_0$.
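The three steps above can be reproduced with the standard library's normal CDF:

```python
import math
from statistics import NormalDist

# Scenario from the worked example
sigma, n, x_bar, mu_0, alpha = 3.0, 16, 10.2, 9.5, 0.05

# 1. Test statistic
z_obs = math.sqrt(n) * (x_bar - mu_0) / sigma      # ~0.933

# 2. One-sided p-value: P(Z >= z_obs) under H0
p_value = 1 - NormalDist().cdf(z_obs)              # ~0.175

# 3. Decision: reject H0 only if p-value falls below alpha
reject_h0 = p_value < alpha
```

Since the p-value (about 0.175) exceeds 0.05, `reject_h0` is `False`, matching the conclusion above.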

Goodness of Fit ($\chi^2$)

Test if observed categorical data matches expected counts.

💡

Chi-Square Statistic

$\chi^2 = \sum_{j=1}^{k} \frac{(Y_j - np_j)^2}{np_j}$
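A brief sketch of the statistic in use, with an assumed example: testing whether a die is fair from 120 hypothetical rolls. The critical value 11.07 is the tabulated $\chi^2_{5}$ quantile at $\alpha = 0.05$ (degrees of freedom $k - 1 = 5$):

```python
# Hypothetical example: is a six-sided die fair? (observed counts are assumed)
observed = [18, 22, 19, 25, 16, 20]       # Y_j for faces 1..6
n = sum(observed)                          # 120 rolls
p = [1 / 6] * 6                            # H0: all faces equally likely

# chi^2 = sum over categories of (Y_j - n*p_j)^2 / (n*p_j)
chi2 = sum((y - n * pj) ** 2 / (n * pj) for y, pj in zip(observed, p))

# Compare against the chi-square(5) critical value at alpha = 0.05, ~11.07
consistent_with_fair = chi2 < 11.07
```

Here each expected count $np_j = 20$, the statistic works out to 2.5, and since 2.5 < 11.07 the observed counts are consistent with a fair die.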

Analogy: Estimation is like trying to guess the location of a hidden treasure. Point estimation gives you a single set of coordinates, while a confidence interval gives you a small map. Hypothesis testing is like testing a rumour about the treasure’s location (H0H_0) against the evidence found in the field.