
Summarizing Discrete Random Variables


Chapter 4 of the “Stats 2 Book”, Summarizing Discrete Random Variables, focuses on deriving numerical characteristics, such as the average and the spread, that summarize the behaviour of a random variable (RV) without listing every possible outcome. These summaries allow you to understand the “typical” result and the likelihood of unusual results.

The chapter covers five main topics: expected value (the average), variance and standard deviation (spread), standard units and Chebyshev’s inequality, conditional expectation, and covariance and correlation (relationships between RVs).


4.1 Expected Value (E[X]): The Average Outcome

The expected value (E[X]), or average, of a discrete random variable X is essentially a long-run weighted average of all its possible outcomes. It tells you what value you should expect, on average, if you repeated the experiment many times.

Concept and Calculation

💡

Expected Value Formula

The calculation of the expected value relies on weighting each possible outcome (t) by its likelihood (P(X = t)) and summing these products: E[X] = \sum_{t} t \cdot P(X=t)

For beginner intuition, consider rolling a standard fair die. The possible outcomes are t \in \{1, 2, 3, 4, 5, 6\}, each with probability P(X=t) = 1/6: E[X] = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 3.5
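The die calculation above can be reproduced directly from the definition; the short sketch below (plain Python, using exact fractions to avoid rounding) is illustrative:

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die: each face has probability 1/6.
pmf = {t: Fraction(1, 6) for t in range(1, 7)}

# E[X] = sum over outcomes t of t * P(X = t)
expected_value = sum(t * p for t, p in pmf.items())

print(expected_value)  # 7/2, i.e. 3.5
```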

Example: Expected Value of a Lottery Ticket

Q1

Lottery Valuation

Example (4.1.2): A lottery ticket can be worth £200, £20, or nothing (£0).

  • P(X=200) = 1/1000
  • P(X=20) = 27/1000
  • P(X=0) = 972/1000

Question: What is the average value of such a ticket?

📝 View Detailed Solution

Solution: Applying the definition of expected value: E[X] = 200\left(\frac{1}{1000}\right) + 20\left(\frac{27}{1000}\right) + 0\left(\frac{972}{1000}\right) = 0.20 + 0.54 + 0 = 0.74

Answer: The expected value of the ticket is £0.74 (or 74 pence).
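The same definition handles the lottery ticket; a minimal Python check of Example 4.1.2:

```python
from fractions import Fraction

# Prize values (in £) and their probabilities, from Example 4.1.2.
pmf = {200: Fraction(1, 1000), 20: Fraction(27, 1000), 0: Fraction(972, 1000)}

# E[X] = sum over outcomes t of t * P(X = t)
ticket_value = sum(t * p for t, p in pmf.items())

print(ticket_value)  # 37/50, i.e. £0.74
```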

Key Properties of Expected Value

The expectation of a sum of random variables is the sum of their individual expectations, regardless of whether they are independent: E[aX + bY] = aE[X] + bE[Y]

If X and Y are independent discrete random variables, the expected value of their product is the product of their expected values: E[XY] = E[X]E[Y] (Note: this property generally fails if X and Y are dependent.)

Application: Expected Values of Common Distributions

Using these rules simplifies calculating the expectations for fundamental distributions:

| Distribution | Parameter(s) | Expected Value (E[X]) |
| --- | --- | --- |
| Bernoulli(p) | success probability p (success/failure) | p |
| Binomial(n, p) | n trials, success probability p | np |
| Geometric(p) | trials until 1st success | 1/p |

The expected number of successes in n independent Bernoulli trials (Binomial) is simply the sum of the individual Bernoulli expected values: E[X_1 + \dots + X_n] = p + \dots + p = np.
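To see that the linearity shortcut agrees with the raw definition, one can sum the Binomial pmf directly. The function name and parameter values below are illustrative choices, not from the text:

```python
from math import comb

def binomial_mean_direct(n, p):
    # E[X] = sum_k k * C(n, k) * p^k * (1-p)^(n-k), straight from the definition.
    return sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

# The linearity shortcut says this should equal n * p.
n, p = 10, 0.3
print(binomial_mean_direct(n, p), n * p)
```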


4.2 Variance and Standard Deviation: Quantifying Spread

While the expected value tells you the central point of the distribution, it does not tell you how spread out the outcomes are.

Concept and Calculation

💡

Variance & Standard Deviation

The variance (Var[X]) is the average of the squared distances between each outcome X and the mean E[X]: Var[X] = E[(X - E[X])^2]

The standard deviation (SD[X] or σ) is the square root of the variance; it can be read as the typical distance of an outcome from the average.

Key Properties of Variance

  1. Alternate Formula: Variance is often easier to compute using the second moment (E[X^2]): Var[X] = E[X^2] - (E[X])^2
  2. Scaling: When scaling a random variable by a constant a, the variance scales by a^2: Var[aX] = a^2 \cdot Var[X]
  3. Shifting: Adding a constant a (a location shift) does not change the spread: Var[X + a] = Var[X]
  4. Independence (Sum Rule): If X and Y are independent random variables, their variances add: Var[X + Y] = Var[X] + Var[Y]
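The scaling and shifting properties can be checked numerically on any small pmf. The distribution, constants, and helper names below are illustrative, not from the text:

```python
def mean(pmf):
    # E[X] from a dict mapping outcome -> probability.
    return sum(t * p for t, p in pmf.items())

def var(pmf):
    # Var[X] = E[(X - E[X])^2]
    mu = mean(pmf)
    return sum((t - mu) ** 2 * p for t, p in pmf.items())

# An arbitrary small distribution (illustrative values).
pmf = {0: 0.2, 1: 0.5, 3: 0.3}
a, shift = 4, 7

scaled = {a * t: p for t, p in pmf.items()}       # pmf of aX
shifted = {t + shift: p for t, p in pmf.items()}  # pmf of X + a

print(var(scaled), a**2 * var(pmf))  # scaling: Var[aX] = a^2 * Var[X]
print(var(shifted), var(pmf))        # shifting leaves the variance unchanged
```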

Example: Variance of a Die Roll

Q2

Die Roll Spread

Question: What are the variance and standard deviation of a single roll of a fair die (E[X] = 3.5)?

📝 View Detailed Solution

Solution: The variance is the average squared distance from the mean 3.5: Var[X] = \frac{1}{6}\left( (1-3.5)^2 + (2-3.5)^2 + (3-3.5)^2 + (4-3.5)^2 + (5-3.5)^2 + (6-3.5)^2 \right) = \frac{35}{12} \approx 2.917

The standard deviation is SD[X] = \sqrt{35/12} \approx 1.71. This confirms that a typical deviation from the mean 3.5 is about 1.71.
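The same computation in Python, using exact fractions so that 35/12 appears exactly:

```python
from fractions import Fraction

# Fair die: each face has probability 1/6.
pmf = {t: Fraction(1, 6) for t in range(1, 7)}
mu = sum(t * p for t, p in pmf.items())  # 7/2

# Var[X] = E[(X - mu)^2]
variance = sum((t - mu) ** 2 * p for t, p in pmf.items())

print(variance)                # 35/12
print(float(variance) ** 0.5)  # standard deviation, about 1.71
```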

Variance of Common Distributions

| Distribution | Variance (Var[X]) |
| --- | --- |
| Bernoulli(p) | p(1-p) |
| Binomial(n, p) | np(1-p) |
| Geometric(p) | (1-p)/p^2 |

4.3 Standard Units and Chebyshev’s Inequality

When discussing variability, it is useful to express outcomes in standard units—the number of standard deviations (σ) an outcome is from the expected value (μ).

Chebyshev’s Inequality

Chebyshev’s inequality provides a universal upper bound on the probability that a random variable X will deviate significantly from its mean, regardless of the specific shape of the distribution.

💡

Chebyshev's Inequality

For any random variable X with finite variance σ^2, and for any value k > 0: P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}

Explanation: This inequality states that the chance of an outcome being k or more standard deviations away from the average is at most 1/k^2.

Example (4.3.5)

Q3

Chebyshev Bound

Example (4.3.5): Find an upper bound on the likelihood that X will be more than two standard deviations from its expected value.

📝 View Detailed Solution

Solution: Set k = 2: P(|X - \mu| \geq 2\sigma) \leq \frac{1}{2^2} = \frac{1}{4}

Conclusion: There is at most a 25% chance that a random variable will be more than two standard deviations from its expected value.
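Chebyshev’s bound is universal but often loose: for a specific distribution the exact tail probability can sit far below 1/k^2. The Binomial(20, 0.5) example below is an illustrative choice, not from the text:

```python
from math import comb, sqrt

# X ~ Binomial(20, 0.5): mu = 10, sigma = sqrt(5).
n, p = 20, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))
k = 2

# Exact probability that X lands k or more standard deviations from mu.
tail = sum(comb(n, x) * p**x * (1 - p)**(n - x)
           for x in range(n + 1) if abs(x - mu) >= k * sigma)

print(tail, 1 / k**2)  # the exact tail is well below the Chebyshev bound of 0.25
```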


4.4 Conditional Expectation and Total Expectation

Conditional expected value (E[X|A]) is the expected value of X given that some event A has definitely occurred. The conditioning event A alters the underlying probabilities, thus potentially changing the average outcome.

Law of Total Expectation

This theorem relates the overall expected value E[X] to conditional expected values over a set of disjoint events {B_i} that cover the entire sample space (S).

💡

Law of Total Expectation

E[X] = \sum_{i} E[X|B_i]P(B_i)

Example (4.4.5)

Q4

Investment Return

Example (4.4.5): A venture capitalist estimates the expected return on an investment (X) conditional on economic outlooks A (stronger), B (same), or C (weaker):

  • E[X|A] = 3 million (P(A) = 0.1)
  • E[X|B] = 1 million (P(B) = 0.4)
  • E[X|C] = -1 million (P(C) = 0.5)

Question: What is the overall expected return (E[X])?

📝 View Detailed Solution

Solution: Using the Law of Total Expectation: E[X] = E[X|A]P(A) + E[X|B]P(B) + E[X|C]P(C) = 3(0.1) + 1(0.4) + (-1)(0.5) = 0.3 + 0.4 - 0.5 = 0.2

The expected return on investment is £0.2 million (£200,000).
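The total-expectation computation of Example 4.4.5 takes only a few lines; the dictionary layout below is an illustrative choice:

```python
# Example 4.4.5: (conditional expected return in millions, scenario probability).
scenarios = {"A": (3.0, 0.1), "B": (1.0, 0.4), "C": (-1.0, 0.5)}

# Law of Total Expectation: E[X] = sum_i E[X | B_i] * P(B_i)
overall = sum(cond_mean * prob for cond_mean, prob in scenarios.values())

print(overall)  # approximately 0.2 (million)
```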


4.5 Covariance and Correlation: Measuring Relationships

When analyzing two random variables, X and Y, the goal is often to determine how they relate to each other: whether knowing one changes your expectation of the other.

Covariance (Cov[X, Y])

The covariance measures the degree to which two random variables vary together: Cov[X,Y] = E[(X - E[X])(Y - E[Y])]

  • Positive covariance (> 0): X tends to be above average when Y is above average. They are positively correlated.
  • Negative covariance (< 0): X tends to be above average when Y is below average. They are negatively correlated.
  • Zero covariance (= 0): X and Y are uncorrelated.

Property: If X and Y are independent, then Cov[X,Y] = 0. (The converse is not necessarily true.)

The calculation is often simplified using the alternate formula: Cov[X,Y] = E[XY] - E[X]E[Y]
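The alternate formula applies to any joint pmf; the joint distribution below is an illustrative example, not from the text:

```python
# Illustrative joint pmf for (X, Y): maps (x, y) -> probability.
joint = {(0, 0): 0.4, (1, 1): 0.3, (1, 0): 0.1, (0, 1): 0.2}

e_x = sum(x * p for (x, _), p in joint.items())      # E[X]
e_y = sum(y * p for (_, y), p in joint.items())      # E[Y]
e_xy = sum(x * y * p for (x, y), p in joint.items()) # E[XY]

# Alternate formula: Cov[X,Y] = E[XY] - E[X]E[Y]
cov = e_xy - e_x * e_y
print(cov)  # positive (about 0.1): X and Y tend to move together here
```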

Correlation (ρ[X, Y])

The correlation coefficient (ρ) is the dimensionless, standardized version of the covariance: it is obtained by dividing the covariance by the product of the standard deviations (σ_X σ_Y).

💡

Correlation Coefficient

\rho[X,Y] = \frac{Cov[X,Y]}{\sigma_X \sigma_Y}

Range and Interpretation: The correlation coefficient is always bounded between -1 and 1.

  • +1: perfect positive linear relationship.
  • -1: perfect negative linear relationship.
  • 0: no linear relationship.
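Putting the pieces together, the correlation coefficient can be computed for a small joint pmf (the same kind of illustrative numbers one might tabulate by hand; the distribution below is hypothetical):

```python
from math import sqrt

# Illustrative joint pmf for (X, Y): maps (x, y) -> probability.
joint = {(0, 0): 0.4, (1, 1): 0.3, (1, 0): 0.1, (0, 1): 0.2}

def marginal_moments(idx):
    # Mean and variance of the idx-th coordinate of the pair (0 for X, 1 for Y).
    mean = sum(pair[idx] * p for pair, p in joint.items())
    var = sum((pair[idx] - mean) ** 2 * p for pair, p in joint.items())
    return mean, var

(mx, vx), (my, vy) = marginal_moments(0), marginal_moments(1)
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov = e_xy - mx * my  # Cov[X,Y] = E[XY] - E[X]E[Y]

# rho = Cov[X,Y] / (sigma_X * sigma_Y); dimensionless and always in [-1, 1].
rho = cov / (sqrt(vx) * sqrt(vy))
print(rho)
```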