Describing Numerical Data

Chapter 4, Describing Numerical Data, focuses on how to categorise, organise, and summarise quantitative information using mathematical tools and graphical representations.


1. Types of Numerical Variables

Numerical data is split into two main types based on how the values are generated:


Discrete Variables

These involve a count of distinct items.

Examples: The number of people in a household or the number of accidents in a month.


Continuous Variables

These involve a measurement of something and can take any value within an interval.

Examples: A person’s weight, height, or the speed of a vehicle.


2. Organising Numerical Data

Data must be organised to be understandable.

  • Discrete Data: If there are few distinct values, they are listed in a frequency table where each value is treated as its own category.
  • Continuous Data: This is grouped into classes (bins).
    • Terminology: The lower and upper class limits are the smallest and largest values that can belong to a class. The class width is the difference between consecutive lower limits, and the class mark is the average of a class’s two limits (its midpoint).
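
The grouping step can be sketched in Python as follows. This is a minimal illustration, not a method prescribed by the chapter: the function name `group_into_classes`, the made-up weights, the first lower limit of 50, and the width of 10 are all assumptions. It also assumes whole-number data with no value below the first lower limit.

```python
from collections import Counter

def group_into_classes(values, first_lower, width, gap=1):
    """Count how many values fall into each class.

    Each class runs from `lower` to `upper = lower + width - gap`, so
    consecutive lower limits differ by `width` (the class width).
    `gap` is the measurement resolution (1 for whole numbers); it is an
    assumption of this sketch.
    """
    # Index of the class each value belongs to (0 for the first class).
    counts = Counter((v - first_lower) // width for v in values)
    table = []
    for k in range(max(counts) + 1):
        lower = first_lower + k * width
        upper = lower + width - gap
        mark = (lower + upper) / 2  # class mark: average of the two limits
        table.append((lower, upper, mark, counts.get(k, 0)))
    return table

weights_kg = [52, 58, 61, 63, 64, 67, 70, 71, 75, 88]  # made-up data
for lower, upper, mark, freq in group_into_classes(weights_kg, 50, 10):
    print(f"{lower}-{upper}  mark={mark}  frequency={freq}")
```

With these inputs the classes are 50-59, 60-69, 70-79, and 80-89, each of width 10.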

Stem-and-Leaf Diagram

A “stemplot” splits each number into a stem (all but the rightmost digit) and a leaf (the rightmost digit).

Example: The number 75 is shown as 7 | 5.
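
As a rough sketch, a stemplot for non-negative whole numbers can be built by splitting each value with integer division by 10. The function name `stem_and_leaf` and the scores below are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]} for non-negative whole numbers."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # stem: all but the last digit; leaf: last digit
        plot[stem].append(leaf)
    return dict(plot)

scores = [75, 62, 78, 81, 69, 75, 90]  # made-up data
for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem} | {' '.join(map(str, leaves))}")
# 6 | 2 9
# 7 | 5 5 8
# 8 | 1
# 9 | 0
```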


3. Measures of Central Tendency

These measures identify the “typical” value, or centre, of the dataset.


Mean ($\bar{X}$)

The arithmetic average.

  • Formula: $\bar{X} = \frac{\sum X_i}{n}$ (the sum of the observations divided by the number of observations, $n$).
  • Properties:
    • If you add a constant $c$ to every value, the new mean is $\bar{X} + c$.
    • If you multiply every value by $c$, the new mean is $\bar{X} \times c$.
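
The formula and both properties can be checked numerically. The sketch below uses made-up data and an arbitrary constant `c = 3`; the helper name `mean` is chosen for illustration.

```python
def mean(values):
    """Arithmetic average: sum of the observations divided by n."""
    return sum(values) / len(values)

data = [4, 8, 6, 10, 2]  # made-up data
c = 3                    # arbitrary constant

shifted = [x + c for x in data]  # add c to every value
scaled = [x * c for x in data]   # multiply every value by c

print(mean(data))     # 6.0
print(mean(shifted))  # 9.0  (= 6.0 + 3)
print(mean(scaled))   # 18.0 (= 6.0 * 3)
```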

Median

The middle value in an ordered list.

  • Calculation: For odd $n$, it is the middle observation; for even $n$, it is the average of the two middle observations.
  • Insight: Unlike the mean, the median is resistant to outliers: a single extreme value can shift the mean substantially while leaving the median unchanged.
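
A minimal sketch of the calculation, with a made-up dataset showing how one extreme value affects the mean but not the median. The helper name `median` and the salary figures are illustrative assumptions.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: the single middle observation
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the two middle ones

salaries = [30, 32, 35, 38, 40]       # made-up data, in thousands
with_outlier = [30, 32, 35, 38, 400]  # same data with one extreme value

for data in (salaries, with_outlier):
    print("mean:", sum(data) / len(data), "median:", median(data))
# mean: 35.0 median: 35
# mean: 107.0 median: 35
```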

Mode

The most frequently occurring value. If no value repeats, there is no mode.
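
One way to find the mode is to count occurrences and keep the most frequent value. In the sketch below, returning every tied value when several share the highest count is an assumption of this example (the text above only covers the no-repeat case), and the function name `modes` is hypothetical. It expects a non-empty list.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest frequency, or [] if nothing repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # no value repeats, so there is no mode
    return sorted(v for v, k in counts.items() if k == top)

print(modes([2, 12, 5, 7, 6, 7, 3]))  # [7]
print(modes([1, 2, 3]))               # []
print(modes([1, 1, 2, 2, 3]))         # [1, 2]
```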


4. Measures of Dispersion (Variation)

These quantify the “spread”, or variability, of the data.

  1. Range: The difference between the largest and smallest values ($\text{Max} - \text{Min}$). It is highly sensitive to outliers.
  2. Variance ($S^2$): Measures how far the data values deviate from the mean, based on squared deviations.
    • Formula (Sample): $S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$.
    • Property: Adding a constant to every value does not change the variance. Multiplying every value by $c$ multiplies the variance by $c^2$.
  3. Standard Deviation: The square root of the variance. It is expressed in the same units as the original data (e.g., kg or years).
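
The three measures, and the two variance properties, can be verified with a short sketch. The data and the constant `c = 3` are made up, and the helper name `sample_variance` is chosen for illustration.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

data = [4, 8, 6, 10, 2]  # made-up data
c = 3                    # arbitrary constant

print(max(data) - min(data))             # range: 8
print(sample_variance(data))             # 10.0
print(math.sqrt(sample_variance(data)))  # standard deviation, about 3.16

# Adding c leaves the variance unchanged; multiplying by c scales it by c**2.
print(sample_variance([x + c for x in data]))  # 10.0
print(sample_variance([x * c for x in data]))  # 90.0
```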

5. Relative Standing and Summaries

  • Percentiles: For $0 < p < 1$, the $100p$-th percentile is the value below which about $100p$ percent of the data fall.
  • Quartiles: These divide the data into four equal parts.
    • Q1 (First Quartile): 25th percentile.
    • Q2 (Median): 50th percentile.
    • Q3 (Third Quartile): 75th percentile.
  • Interquartile Range (IQR): The difference between Q3 and Q1 ($Q3 - Q1$).
  • Five Number Summary: A set consisting of: Minimum, Q1, Median, Q3, and Maximum.
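
The sketch below computes percentiles by position in the ordered data and builds the five-number summary from them. It assumes one particular rule: compute $np$, round up if it is not a whole number, and otherwise average that observation with the next one. The whole-number case is an assumption, chosen because it matches how the median is defined for even $n$. The function names and data are made up, and statistical software that interpolates between observations may report slightly different values.

```python
def percentile(ordered, k):
    """k-th percentile (k a whole number, 0 < k < 100) of already-sorted data."""
    n = len(ordered)
    # n * p with p = k / 100, kept in exact integer arithmetic.
    whole, remainder = divmod(n * k, 100)
    if remainder:
        # n * p is not whole: round up to position whole + 1 (index `whole`).
        return ordered[whole]
    # n * p is whole: average that observation and the next one (assumed rule).
    return (ordered[whole - 1] + ordered[whole]) / 2

def five_number_summary(ordered):
    """Minimum, Q1, median, Q3, maximum of already-sorted data."""
    return (ordered[0], percentile(ordered, 25), percentile(ordered, 50),
            percentile(ordered, 75), ordered[-1])

data = sorted([4, 7, 9, 12, 15, 16, 20, 22, 25, 30])  # made-up data, n = 10
summary = five_number_summary(data)
iqr = summary[3] - summary[1]
print(summary)  # (4, 9, 15.5, 22, 30)
print(iqr)      # 13
```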

Practice Session

Q1

Mean and Median

Calculate the mean and median for the dataset: $2, 12, 5, 7, 6, 7, 3$.

Solution:
  1. Mean: $\frac{2+12+5+7+6+7+3}{7} = \frac{42}{7} = \mathbf{6}$
  2. Median:
    • Order the data: $2, 3, 5, 6, 7, 7, 12$.
    • $n = 7$ (odd). The middle ($4^{\text{th}}$) observation is 6.
Q2

Dispersion

Find the range of the dataset: $1, 2, 3, 4, 15$.

Solution:
  • $\text{Max} = 15, \quad \text{Min} = 1$
  • $\text{Range} = 15 - 1 = \mathbf{14}$
Q3

Percentiles

Find the 25th percentile for the ordered data: $35, 38, 47, 58, 61, 66, 68, 68, 70, 79$ ($n = 10$).

Solution:
  • $n = 10, \quad p = 0.25$.
  • Calculate $np = 10 \times 0.25 = 2.5$.
  • Since $2.5$ is not a whole number, round up to the 3rd position.
  • The 3rd observation is 47.
Q4

Five Number Summary

Identify the five-number summary for the ordered data: $11, 16, 18, 26, 27, 28, 28, 29, 35, 37$.

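Solution:
  • Minimum: 11.
  • Q1: $np = 10 \times 0.25 = 2.5$, rounded up to the 3rd position; the 3rd value is 18.
  • Median (Q2): $n = 10$ is even, so take the average of the 5th and 6th values: $\frac{27+28}{2} = 27.5$.
  • Q3: $np = 10 \times 0.75 = 7.5$, rounded up to the 8th position; the 8th value is 29.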
  • Maximum: 37.
  • Summary: 11, 18, 27.5, 29, 37.