Association Between Two Variables
Chapter 5 of the sources is titled “Association between two variables” and explores how information about one variable can provide insights into another. The chapter is divided into three main sections based on the types of variables being compared: categorical-categorical, numerical-numerical, and categorical-numerical.
1. Association Between Two Categorical Variables
To determine if an association exists between two categorical variables, researchers use a contingency table.
Criteria for Association
- Not Associated: The row (or column) relative frequencies are the same for all rows (or columns).
- Associated: The row (or column) relative frequencies differ noticeably from one row (or column) to another.
Example: Smartphone Ownership
Female Ownership: 77.27% | Male Ownership: 75%
Since these values are nearly identical, Gender and Ownership are not associated.
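The comparison above can be sketched in a few lines of Python. The source gives only the percentages, so the counts below are hypothetical, chosen solely so that the row relative frequencies match the example (34/44 ≈ 77.27% and 42/56 = 75%). The function name `row_relative_frequencies` is also just an illustrative choice.

```python
# Hypothetical counts: the source states only the percentages, not the raw table.
table = {
    "Female": {"Own": 34, "Do not own": 10},
    "Male": {"Own": 42, "Do not own": 14},
}

def row_relative_frequencies(table):
    """Divide each cell by its row total."""
    result = {}
    for row, counts in table.items():
        total = sum(counts.values())
        result[row] = {col: n / total for col, n in counts.items()}
    return result

rel = row_relative_frequencies(table)
for row, freqs in rel.items():
    # Print each row's relative frequencies as percentages.
    print(row, {col: f"{100 * p:.2f}%" for col, p in freqs.items()})
```

If the rows print nearly the same percentages, as they do here, the table gives no evidence of association.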
2. Association Between Two Numerical Variables
The sources define several tools to examine the relationship between two quantitative variables.
- Scatter Plots: A visual test where pairs of values are displayed as points on a two-dimensional plane.
- Describing Patterns: Check for Direction (up/down), Curvature (linear/curve), Variation (clustering), and Outliers.
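- Covariance: Quantifies the direction of linear association (positive or negative), but its magnitude depends on the units of both variables, so its size is difficult to interpret.
- Pearson Correlation Coefficient (r): Unitless measure between -1 and +1; values near -1 or +1 indicate a strong linear association, and values near 0 a weak one.
- Fitting a Line and R²: R² measures goodness of fit; a value closer to 1 signifies a better fit.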
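For reference, the measures in the list above are commonly written as follows. This is a sketch using the sample convention (dividing by n - 1), which is an assumption; the sources may divide by n instead. The correlation coefficient is the same under either convention, as long as the covariance and the standard deviations use the same divisor.

```latex
\[
\operatorname{cov}(x, y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
r = \frac{\operatorname{cov}(x, y)}{s_x \, s_y},
\qquad
R^2 = r^2 \quad \text{(least-squares straight-line fit)}
\]
```

Here \(\bar{x}\) and \(\bar{y}\) are the means and \(s_x\), \(s_y\) the standard deviations of the two variables.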
Example: Car Age vs Price
A study of car ages and prices showed that as age increases, price decreases.
- Trend: Negative Linear Association (r < 0).
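A trend like this can be checked numerically. The sketch below uses made-up age and price values, since the source does not list the study's data, and assumes the sample convention (dividing by n - 1) for covariance. All names are illustrative.

```python
from math import sqrt

# Illustrative (made-up) data: car age in years, price in thousands.
age = [1, 2, 3, 4, 5, 6]
price = [9.0, 8.1, 7.2, 5.9, 5.1, 4.0]

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def pearson_r(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

cov = covariance(age, price)                 # sign gives the direction
r = pearson_r(age, price)                    # unitless, between -1 and +1
slope = cov / covariance(age, age)           # least-squares slope
intercept = mean(price) - slope * mean(age)  # least-squares intercept
r_squared = r ** 2                           # goodness of fit for the line

print(f"cov={cov:.3f}, r={r:.3f}, R^2={r_squared:.3f}")
print(f"fitted line: price = {intercept:.2f} + ({slope:.2f}) * age")
```

A negative covariance, a negative r, and a negative slope all describe the same downward trend; R² says how closely the points follow the fitted line.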
3. Association Between Categorical and Numerical Variables
When one variable is numerical and the other is categorical with exactly two categories (dichotomous), the Point Bi-serial Correlation Coefficient (r_pb) is used.
Procedure
Group the numerical data by the two categories (often coded as 0 and 1), then compare the two group means relative to the population standard deviation of all the numerical values.
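One common way to write this procedure is r_pb = ((M1 - M0) / s) × sqrt(p0 × p1), where M0 and M1 are the group means, p0 and p1 are the proportions of observations in each group, and s is the population standard deviation (dividing by n) of all the numerical values. The sketch below assumes that form and uses made-up data; the names are illustrative.

```python
from math import sqrt

# Illustrative (made-up) data: a 0/1 group code and a numerical score.
group = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
score = [52, 60, 57, 55, 68, 72, 65, 70, 75, 66]

def mean(xs):
    return sum(xs) / len(xs)

def population_sd(xs):
    """Standard deviation dividing by n (population form)."""
    m = mean(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def point_biserial(groups, values):
    """Difference of group means, scaled by the overall SD and group proportions."""
    g0 = [v for g, v in zip(groups, values) if g == 0]
    g1 = [v for g, v in zip(groups, values) if g == 1]
    p0 = len(g0) / len(values)
    p1 = len(g1) / len(values)
    return (mean(g1) - mean(g0)) / population_sd(values) * sqrt(p0 * p1)

r_pb = point_biserial(group, score)
print(f"r_pb = {r_pb:.3f}")
```

With this formula, r_pb equals the Pearson correlation between the 0/1 codes and the numerical values, so it is read the same way: the sign shows which group has the higher mean, and the magnitude shows how strongly the groups differ.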
Practice Questions
Categorical Association
In a study of 10,000 students, 80% of males passed an exam and 79.9% of females passed. Is there an association between gender and passing?
Solution:
No. Because the row relative frequencies (80% and 79.9%) are effectively the same for both rows, the variables are not associated.
Covariance Units
If the covariance between weight (kg) and height (m) is calculated, what are its units?
Solution:
The units are kg·m (kilogram-metres), because covariance carries the product of the units of the two variables.
Numerical Trend
A scatter plot shows that as the size of a house increases, the price also increases in a straight line. How would you describe this association?
Solution:
The direction is upward (positive) and the curvature is linear.
Correlation Fit
A dataset has a Pearson correlation coefficient (r) of -0.95. What does this tell you about the fit and the relationship?
Solution:
It indicates a strong negative linear association. The R² value would be (-0.95)² = 0.9025, meaning the line is a good fit, capturing about 90% of the variance.
The Dance Partners Analogy
Think of association as two people dancing together.
- Categorical association: Checking if people in red shirts always choose partners in blue shirts.
- Numerical association: Watching how their steps move; if one person takes a step forward and the other consistently takes a step forward too, they have a positive correlation.
- The Correlation Coefficient (r) is the “synchronisation score”: +1 means they are perfectly in sync, -1 means each step forward by one is matched by a step back by the other, and 0 means their steps show no consistent linear pattern.