Basic Concepts
Chapter 1 of the “Stats 2 Book” establishes the fundamental vocabulary and axiomatic framework necessary for studying probability and statistics. This chapter introduces the structure needed to discuss the likelihood of occurrences.
Here is a detailed explanation of the key concepts, followed by illustrative examples and exercises based on the source material:
1.1 Definitions and Properties
The foundation of probability theory rests on defining what is possible before determining what is likely.
Key Definitions
| Concept | Definition | Example |
|---|---|---|
| Sample Space ($\Omega$) | A set listing all possibilities (outcomes) that might occur. | When rolling a six-sided die, $\Omega = \{1, 2, 3, 4, 5, 6\}$. |
| Outcome ($\omega$) | An element of the sample space $\Omega$. | The result of rolling a die, e.g., $\omega = 4$. |
| Experiment | The process of actually selecting one of the outcomes listed in $\Omega$. | Flipping a coin or waiting for the winner of the World Cup. |
| Event ($A$) | Any subset of the sample space $\Omega$. | Rolling a number greater than 2: $A = \{3, 4, 5, 6\}$. |
Probability Axioms
A probability ($P$) is a function that assigns a chance (a number between 0 and 1) to each event $A \subseteq \Omega$. This formally relies on Kolmogorov’s axioms.
Axiom 1: Certainty
The probability of the entire sample space is 1: $P(\Omega) = 1$. Interpretation: There is a 100% chance that an experiment will result in some outcome included in $\Omega$.
Axiom 2: Additivity
For any countable collection of pairwise disjoint events $A_1, A_2, \ldots$ (meaning $A_i \cap A_j = \emptyset$ for $i \neq j$), their combined probability is the sum of their individual probabilities: $P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$. Interpretation: If events cannot happen simultaneously, their probabilities add up.
Basic Properties
From the two fundamental axioms, several properties can be proven that simplify probability calculations:
| Property | Formula | Description |
|---|---|---|
| Empty Set | $P(\emptyset) = 0$ | The probability of nothing happening is zero. |
| Finite Additivity | $P(A_1 \cup \cdots \cup A_n) = P(A_1) + \cdots + P(A_n)$ for disjoint $A_i$ | Sum rule for finitely many disjoint events. |
| Monotonicity | If $A \subseteq B$, then $P(A) \leq P(B)$ | Subsets cannot be more likely than supersets. |
| Difference Rule | $P(B \setminus A) = P(B) - P(A)$ (if $A \subseteq B$) | Probability of $B$ occurring but not $A$. |
| Complement Rule | $P(A^c) = 1 - P(A)$ | Probability that $A$ does NOT occur. |
| General Addition | $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ | Subtract the intersection to avoid double counting. |
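To see how such properties follow from the axioms, here is a one-line derivation of the Complement Rule: $A$ and $A^c$ are disjoint with $A \cup A^c = \Omega$, so Axioms 1 and 2 give

$$1 = P(\Omega) = P(A \cup A^c) = P(A) + P(A^c), \qquad \text{hence } P(A^c) = 1 - P(A).$$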
Example and Solution
Coin Flip Axioms
A fair coin flip has a sample space $\Omega = \{H, T\}$. Use the axioms to show that the probability of observing heads is 0.5.
Solution
- Let $A = \{H\}$ and $B = \{T\}$ be two disjoint events.
- Since the coin is “fair,” $P(A) = P(B) = p$ for some value $p$.
- The union is the sample space: $A \cup B = \Omega$.
- Using Axiom 1 and Axiom 2: $P(A) + P(B) = P(A \cup B) = P(\Omega) = 1$.
- Substituting $P(A) = P(B) = p$: $2p = 1$, so $p = 0.5$.
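As a sanity check (not part of the book’s exercise), a short simulation in base R approximates this probability empirically:

```r
# Monte Carlo check (illustrative): estimate P(heads) by simulating flips
set.seed(1)                                   # make the simulation reproducible
flips <- sample(c("H", "T"), size = 1e5, replace = TRUE)
mean(flips == "H")                            # close to 0.5, as the axioms predict
```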
Fishing Tonnage
A town’s fishing fleet has a 35% chance of catching over 400 tons (event $A$, with $P(A) = 0.35$) and a 10% chance of catching over 500 tons (event $B$, with $P(B) = 0.10$). How likely is it that they will catch between 400 and 500 tons?
Solution
- The event $B$ (“over 500”) is a subset of $A$ (“over 400”): $B \subseteq A$.
- “Between 400 and 500” is the difference set $A \setminus B$.
- Using the Difference Rule: $P(A \setminus B) = P(A) - P(B) = 0.35 - 0.10 = 0.25$.
Answer: There is a 25% chance that between 400 and 500 tons of fish will be caught.
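The same arithmetic in R (a sketch; the variable names are my own):

```r
# Difference Rule: P(A \ B) = P(A) - P(B) when B is a subset of A
p_over_400 <- 0.35            # P(A)
p_over_500 <- 0.10            # P(B)
p_over_400 - p_over_500       # 0.25
```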
1.2 Equally Likely Outcomes
When the sample space is finite and all outcomes are equally likely, we have a Uniform Distribution: for any event $A$, $P(A) = \frac{|A|}{|\Omega|}$.
Probability calculations in this setting reduce to pure counting problems.
Rolling Two Dice
Two dice are rolled. How likely is it that their sum will equal eight?
Solution
- Total sample space: all ordered pairs $(i, j)$ with $i, j \in \{1, \ldots, 6\}$, so $|\Omega| = 6 \times 6 = 36$.
- Event $E$ (sum is 8): $E = \{(2,6), (3,5), (4,4), (5,3), (6,2)\}$.
- Count: $|E| = 5$.
- Probability: $P(E) = \frac{5}{36} \approx 0.139$.
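The count can be verified by brute force in R (an illustrative sketch using base R’s expand.grid()):

```r
# Enumerate the two-dice sample space and count the outcomes that sum to 8
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)     # all 36 ordered pairs
favourable <- sum(rolls$die1 + rolls$die2 == 8)  # 5 outcomes
favourable / nrow(rolls)                         # 5/36 ≈ 0.139
```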
Group Selection
A group of 12 people includes Grant and Dilip. Three people are picked at random. How likely is it that the selection includes Grant but not Dilip?
Combinatorial Solution
- Total Outcomes: Choose 3 from 12: $\binom{12}{3} = 220$.
- Event $E$: Grant is fixed (1 way), Dilip is excluded; we need 2 more people from the remaining 10.
- Count: Choose 2 from 10: $\binom{10}{2} = 45$.
- Probability: $P(E) = \frac{45}{220} = \frac{9}{44} \approx 0.205$.
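R’s built-in binomial coefficient reproduces this in one line (illustrative):

```r
# Favourable selections over total selections
choose(10, 2) / choose(12, 3)   # 45/220 = 9/44 ≈ 0.205
```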
1.3 Conditional Probability
How is the likelihood of an event $A$ “altered” by the knowledge that another event $B$ has occurred?
Conditional Probability Formula
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad \text{provided } P(B) > 0.$$
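With equally likely outcomes, conditional probabilities can also be computed by counting; a short R sketch (the events are my own illustrative choices):

```r
# P(sum = 8 | first die > 3) on the two-dice sample space
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)
A <- rolls$die1 + rolls$die2 == 8   # event of interest
B <- rolls$die1 > 3                 # conditioning event (18 outcomes)
mean(A & B) / mean(B)               # P(A ∩ B) / P(B) = 3/18 = 1/6
```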
Bayes’ Theorem
One of the most powerful theorems in statistics allows us to reverse conditional probabilities.
Bayes' Theorem
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A^c)\,P(A^c)}$$
Updates our belief in $A$ given new evidence $B$.
The Swine Flu Test
- Detects the flu 95% of the time if infected: $P(+ \mid F) = 0.95$.
- False positive rate is 2%: $P(+ \mid F^c) = 0.02$.
- Population infection rate is 1%: $P(F) = 0.01$.
If a person tests positive, what is the probability they actually have the flu?
Bayesian Check
- Let $F$ = “has the flu” and $+$ = “tests positive.”
- We want $P(F \mid +)$.
- Apply Bayes’ Theorem: $P(F \mid +) = \dfrac{P(+ \mid F)\,P(F)}{P(+ \mid F)\,P(F) + P(+ \mid F^c)\,P(F^c)} = \dfrac{0.95 \times 0.01}{0.95 \times 0.01 + 0.02 \times 0.99} = \dfrac{0.0095}{0.0293} \approx 0.324$.
Result: Only a 32.4% chance they actually have the flu!
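The computation is easy to replicate in base R (a sketch; the variable names are my own):

```r
# Bayes' Theorem for the flu test
p_pos_given_flu   <- 0.95   # sensitivity, P(+ | F)
p_pos_given_noflu <- 0.02   # false positive rate, P(+ | F^c)
p_flu             <- 0.01   # prevalence, P(F)
p_pos <- p_pos_given_flu * p_flu +
         p_pos_given_noflu * (1 - p_flu)   # law of total probability
p_pos_given_flu * p_flu / p_pos            # posterior P(F | +) ≈ 0.324
```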
1.4 Independence
Events $A$ and $B$ are independent if the occurrence of one has no effect on the probability of the other: $P(A \cap B) = P(A)\,P(B)$, or equivalently $P(A \mid B) = P(A)$ when $P(B) > 0$.
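For a concrete check by counting (the example is mine, not the book’s), the two-dice sample space gives an independent pair: $A$ = “first die is even” and $B$ = “sum is 7”:

```r
# Verify P(A ∩ B) = P(A) P(B) by enumeration
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)
p_A  <- mean(rolls$die1 %% 2 == 0)                 # 1/2
p_B  <- mean(rolls$die1 + rolls$die2 == 7)         # 1/6
p_AB <- mean(rolls$die1 %% 2 == 0 &
             rolls$die1 + rolls$die2 == 7)         # 1/12
all.equal(p_AB, p_A * p_B)                         # TRUE: A and B are independent
```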
1.5 Using R
R makes the counting and arithmetic behind these probability calculations quick to carry out.
| Function | Syntax | Purpose |
|---|---|---|
| Vector Creation | c(1, 2, 3) | Creates a list of numbers. |
| Sequence | 1:100 | Creates integers from 1 to 100. |
| Combinations | choose(n, k) | Calculates the binomial coefficient $\binom{n}{k}$. |
| Summation | sum(x) | Adds all elements in the vector x. |
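A quick tour combining the table’s functions (illustrative, not from the book):

```r
x <- c(0.35, 0.10)     # c() builds a vector
sum(x)                 # 0.45: sum() adds the elements
length(1:100)          # 100: the sequence of integers 1 to 100
choose(12, 3)          # 220: the binomial coefficient from Section 1.2
```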
All Chapters in this Book

1. Basic Concepts: Foundational mathematical framework for probability, including definitions, axioms, conditional probability, and Bayes' Theorem.
2. Sampling and Repeated Trials: Models based on repeated independent trials, focusing on Bernoulli trials and sampling methods.
3. Discrete Random Variables: Formalizing random variables, probability mass functions, and independence.
4. Summarizing Discrete Random Variables: Deriving numerical characteristics (expected value, variance, and standard deviation) to summarize the behavior of discrete random variables.
5. Continuous Probabilities and Random Variables: Transitioning from discrete sums to continuous integrals, density functions, and key distributions like the Normal and Exponential.
6. Summarising Continuous Random Variables: Extending expected value and variance to continuous variables, exploring Moment Generating Functions and Bivariate Normal distributions.
7. Sampling and Descriptive Statistics: Transitioning from probability to statistics; using sample data to estimate population parameters like the mean and variance.
8. Sampling Distributions and Limit Theorems: The theoretical foundations of inference; joint distributions, the Weak Law of Large Numbers (WLLN), and convergence via the Central Limit Theorem (CLT).
9. Estimation and Hypothesis Testing: The core of statistical inference; Method of Moments, Maximum Likelihood, Confidence Intervals, and Hypothesis Testing.
10. Linear Regression: Modeling linear relationships, least squares, and regression inference.