Describing Numerical Data
Chapter 4, Describing Numerical Data, focuses on how to categorise, organise, and summarise quantitative information using mathematical tools and graphical representations.
1. Types of Numerical Variables
Numerical data is split into two main types based on how the values are generated:
Discrete Variables
These involve a count of distinct items.
Examples: The number of people in a household or the number of accidents in a month.
Continuous Variables
These involve a measurement of something and can take any value within an interval.
Examples: A person's weight, height, or the speed of a vehicle.
2. Organising Numerical Data
Data must be organised to be understandable.
- Discrete Data: If there are few distinct values, they are listed in a frequency table where each value is treated as its own category.
- Continuous Data: This is grouped into classes (bins).
- Terminology: The Lower/Upper class limits are the smallest/largest values in a class. The Class width is the difference between consecutive lower limits, and the Class mark is the average of the two limits.
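The organising steps above can be sketched in a few lines of Python. The household sizes and the classes 10-19, 20-29, 30-39 are hypothetical values chosen only for illustration, and the helper name `class_summary` is not from the chapter.

```python
from collections import Counter

def class_summary(lower_limits, upper_limits):
    """Return (class width, class marks) for classes given as parallel limit lists."""
    # Class width: difference between consecutive lower limits.
    width = lower_limits[1] - lower_limits[0]
    # Class mark: average of the lower and upper limit of each class.
    marks = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]
    return width, marks

# Discrete data: each distinct value is its own category (hypothetical counts).
household_sizes = [2, 3, 3, 4, 2, 3]
freq_table = sorted(Counter(household_sizes).items())
print(freq_table)  # [(2, 2), (3, 3), (4, 1)]

# Continuous data grouped into hypothetical classes 10-19, 20-29, 30-39.
lowers = [10, 20, 30]
uppers = [19, 29, 39]
width, marks = class_summary(lowers, uppers)
print(width)  # 10
print(marks)  # [14.5, 24.5, 34.5]
```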
Stem-and-Leaf Diagram
A "stemplot" splits each number into a stem (all but the rightmost digit) and a leaf (the rightmost digit).
Example: The number 75 is shown as 7 | 5.
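As a rough sketch of how such a diagram could be built, the Python below splits each value with `divmod(v, 10)`. It assumes non-negative whole numbers; the data list and the function names are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all but the last digit) to its sorted leaves (last digits)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 75 -> stem 7, leaf 5
        plot[stem].append(leaf)
    return dict(plot)

def render(plot):
    """Format each stem and its leaves as 'stem | leaf leaf ...'."""
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(plot.items())
    ]

data = [75, 71, 68, 82, 75, 69]  # hypothetical observations
lines = render(stem_and_leaf(data))
for line in lines:
    print(line)
```

With this data the three printed rows have stems 6, 7, and 8, and the value 75 appears twice as the leaf 5 on stem 7.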
3. Measures of Central Tendency
These measures identify the "typical" value or the centre of the dataset.
Mean ($\bar{x}$)
The arithmetic average.
- Formula: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ (sum of observations divided by the number of observations).
- Properties:
- If you add a constant $c$ to every value, the new mean is $\bar{x} + c$.
- If you multiply every value by a constant $c$, the new mean is $c\bar{x}$.
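The two properties of the mean can be checked numerically. This is a minimal sketch with a hypothetical dataset and a constant named `c` (a name chosen here for illustration): shifting every value shifts the mean by the same amount, and scaling every value scales the mean by the same factor.

```python
def mean(values):
    """Arithmetic average: sum of observations divided by their count."""
    return sum(values) / len(values)

data = [2, 4, 6, 8]  # hypothetical observations
c = 3                # hypothetical constant

base = mean(data)                       # 5.0
shifted = mean([x + c for x in data])   # mean after adding c to every value
scaled = mean([x * c for x in data])    # mean after multiplying every value by c

print(base, shifted, scaled)  # 5.0 8.0 15.0
```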
Median
The middle value in an ordered list.
- Calculation: For odd $n$, it is the middle observation; for even $n$, it is the average of the two middle observations.
- Insight: Unlike the mean, the median is not sensitive to outliers.
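The calculation rule and the outlier insight can be illustrated together. The sketch below uses hypothetical data; replacing the largest value with an extreme one leaves the median unchanged while the mean moves substantially.

```python
def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

data = [3, 5, 6, 8, 9]           # hypothetical observations
with_outlier = [3, 5, 6, 8, 90]  # same data with one extreme value

print(median(data), median(with_outlier))  # 6 6
print(sum(data) / len(data))               # mean of the original data
print(sum(with_outlier) / len(with_outlier))  # mean is pulled up by the outlier
print(median([1, 2, 3, 4]))                # 2.5 (even n)
```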
Mode
The most frequently occurring value. If no value repeats, there is no mode.
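One possible way to find the mode is to count occurrences with `collections.Counter`. The notes above do not say how ties are handled, so returning every tied value is an assumption of this sketch, as is the function name `modes`.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list means there is no mode."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # no value repeats, so there is no mode
    # Assumption: when several values tie for the highest count, report all of them.
    return sorted(v for v, count in counts.items() if count == top)

print(modes([1, 2, 2, 3]))     # [2]
print(modes([1, 2, 3]))        # []
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
```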
4. Measures of Dispersion (Variation)
These quantify the "spread" or variability of the data.
- Range: The difference between the largest and smallest values ($\max - \min$). It is highly sensitive to outliers.
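- Variance ($s^2$): Measures how much data values deviate from the mean.
- Formula (Sample): $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$.
- Property: Adding a constant $c$ to every value does not change the variance. Multiplying every value by $c$ multiplies the variance by $c^2$.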
- Standard Deviation: The square root of the variance. It is expressed in the same units as the original data (e.g., kg or years).
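The measures above can be computed directly from their definitions. This sketch uses the sample form of the variance (dividing by the number of observations minus one), a hypothetical dataset, and a constant named `c` chosen for illustration; it also checks the shift and scale properties numerically.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)

data = [2, 4, 6, 8]  # hypothetical observations
c = 3                # hypothetical constant

data_range = max(data) - min(data)  # 6
var = sample_variance(data)         # 20 / 3, about 6.67
sd = math.sqrt(var)                 # same units as the data

shifted_var = sample_variance([x + c for x in data])  # unchanged by adding c
scaled_var = sample_variance([x * c for x in data])   # multiplied by c squared

print(data_range, round(var, 2), round(sd, 2))
print(math.isclose(shifted_var, var), math.isclose(scaled_var, c ** 2 * var))
```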
5. Relative Standing and Summaries
- Percentiles: The $p$th percentile is the value below which $p$ percent of the data falls.
- Quartiles: These divide the data into four equal parts.
- Q1 (First Quartile): 25th percentile.
- Q2 (Median): 50th percentile.
- Q3 (Third Quartile): 75th percentile.
- Interquartile Range (IQR): The difference between Q3 and Q1 ($\text{IQR} = Q_3 - Q_1$).
- Five Number Summary: A set consisting of: Minimum, Q1, Median, Q3, and Maximum.
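A minimal sketch of these summaries follows, using the position rule that appears in the practice problems: multiply the number of observations by the percentile fraction and, if the result is not a whole number, round up to the next position. For a whole-number result, the sketch averages that observation and the next one; this matches how the median is computed for an even number of observations but should be treated as an assumption here. The data and function names are hypothetical.

```python
def percentile(ordered, k):
    """k-th percentile (k a whole number from 1 to 99) of already sorted data."""
    n = len(ordered)
    # n * k / 100 is the position n*p; integer arithmetic avoids float rounding issues.
    whole, remainder = divmod(n * k, 100)
    if remainder != 0:
        # Not a whole number: round up to 1-based position whole + 1 (index whole).
        return ordered[whole]
    # Assumption: whole-number position, so average that observation and the next.
    return (ordered[whole - 1] + ordered[whole]) / 2

def five_number_summary(ordered):
    """Return (minimum, Q1, median, Q3, maximum) for sorted data."""
    return (
        ordered[0],
        percentile(ordered, 25),
        percentile(ordered, 50),
        percentile(ordered, 75),
        ordered[-1],
    )

data = sorted([12, 15, 20, 22, 25, 28, 31, 35, 40, 44])  # hypothetical, n = 10
summary = five_number_summary(data)
iqr = summary[3] - summary[1]

print(summary)  # (12, 20, 26.5, 35, 44)
print(iqr)      # 15
```

With ten observations, Q1 lands on the 3rd value and Q3 on the 8th, while the median averages the 5th and 6th.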
Practice Session
Mean and Median
Calculate the mean and median for the dataset: .
Solution:
- Mean:
- Median:
- Order the data: .
- $n$ is odd, so the median is the middle observation, which is 6.
Dispersion
Find the range of the dataset: .
Percentiles
Find the 25th percentile for the ordered data: ().
Solution:
- For the 25th percentile, $p = 0.25$.
- Calculate $np$.
- Since $np$ is not a whole number, round up to the 3rd position.
- The 3rd observation is 47.
Five Number Summary
Identify the five-number summary for the ordered data: .
Solution:
- Minimum: 11.
- Q1: 3rd value = 18.
- Median (Q2): Average of 5th and 6th values = 27.5.
- Q3: 8th value = 29.
- Maximum: 37.
- Summary: 11, 18, 27.5, 29, 37.