Five-Number Summary, Box Plots & Data Analysis

The Five-Number Summary & Box-and-Whisker Plots

The five-number summary is the backbone of Grade 10 statistics. It condenses an entire data set into 5 key values, which you then use to draw a box-and-whisker plot — the most important graph in this section.


Step 1: Sort the Data

Always sort the data from smallest to largest before doing anything else. Skipping this step is the most common mistake in statistics: unsorted data gives the wrong median and the wrong quartiles.


Step 2: Find the Five Numbers

| Value | What it is | How to find it |
|---|---|---|
| Minimum | Smallest value | First number after sorting |
| $Q_1$ (Lower Quartile) | 25th percentile | Median of the bottom half |
| $Q_2$ (Median) | 50th percentile | Middle value of the full data set |
| $Q_3$ (Upper Quartile) | 75th percentile | Median of the top half |
| Maximum | Largest value | Last number after sorting |

Finding the Median ($Q_2$)

  • If $n$ is odd: median = middle value at position $\frac{n+1}{2}$
  • If $n$ is even: median = average of the two middle values

Finding $Q_1$ and $Q_3$

Split the data into two halves at the median. If $n$ is odd, exclude the median from both halves. Then find the median of each half.
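
The median-of-halves method described above can be sketched in Python. This is a minimal illustration, not a standard library routine: the function names `median` and `five_number_summary` are made up for this example, and the code assumes the data set has at least two values. Calculators and software packages sometimes use other quartile conventions, so their answers may differ slightly from this method.

```python
def median(sorted_values):
    """Median of a list that is already sorted."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]  # odd n: the single middle value
    # even n: average of the two middle values
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(data):
    """Return (min, Q1, Q2, Q3, max) using the median-of-halves method."""
    values = sorted(data)             # Step 1: always sort first
    n = len(values)
    lower = values[:n // 2]           # bottom half
    upper = values[(n + 1) // 2:]     # top half (skips the median when n is odd)
    return (values[0], median(lower), median(values), median(upper), values[-1])


# The worked-example data from this section, deliberately given unsorted
print(five_number_summary([10, 3, 18, 7, 5, 16, 8, 14, 12]))
# (3, 6.0, 10, 15.0, 18)
```

Because the function sorts its input first, it gives the same answer whether or not the data arrives in order.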


Worked Example

Data (already sorted): $3;\; 5;\; 7;\; 8;\; 10;\; 12;\; 14;\; 16;\; 18$

$n = 9$ (odd)

Median ($Q_2$): position $\frac{9+1}{2} = 5$th value = 10

Bottom half (exclude median): $3;\; 5;\; 7;\; 8$, so $Q_1 = \frac{5 + 7}{2} = 6$

Top half (exclude median): $12;\; 14;\; 16;\; 18$, so $Q_3 = \frac{14 + 16}{2} = 15$

| Min | $Q_1$ | $Q_2$ | $Q_3$ | Max |
|---|---|---|---|---|
| 3 | 6 | 10 | 15 | 18 |

Measures of Spread

| Measure | Formula | What it tells you |
|---|---|---|
| Range | Max $-$ Min = $18 - 3 = 15$ | Total spread |
| IQR | $Q_3 - Q_1 = 15 - 6 = 9$ | Spread of the middle 50% |

💡 The IQR is usually a more reliable measure of spread than the range because it depends only on the middle 50% of the data, so extreme values (outliers) do not affect it. Exam questions often ask “which is the better measure of spread?”, and the answer is usually the IQR.
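
To see why an outlier changes the range but not the IQR, here is a small Python sketch. The second data set is hypothetical: it is the worked-example data with the maximum 18 replaced by 100. The helper names `median` and `spread` are made up for this illustration.

```python
def median(sorted_values):
    """Median of a list that is already sorted."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def spread(data):
    """Return (range, IQR) using the median-of-halves quartile method."""
    values = sorted(data)
    n = len(values)
    lower = values[:n // 2]          # bottom half
    upper = values[(n + 1) // 2:]    # top half (skips the median when n is odd)
    return values[-1] - values[0], median(upper) - median(lower)


original = [3, 5, 7, 8, 10, 12, 14, 16, 18]
with_outlier = [3, 5, 7, 8, 10, 12, 14, 16, 100]  # hypothetical: 18 replaced by 100

print(spread(original))      # (15, 9.0)
print(spread(with_outlier))  # (97, 9.0)
```

The range jumps from 15 to 97, while the IQR stays at 9 because $Q_1$ and $Q_3$ are unchanged.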


Drawing a Box-and-Whisker Plot

  1. Draw a number line to scale covering the full range
  2. Mark the 5 values on the number line
  3. Draw a box from $Q_1$ to $Q_3$
  4. Draw a vertical line inside the box at the median ($Q_2$)
  5. Draw whiskers (horizontal lines) from the box to the minimum and maximum

Reading a Box Plot

| Feature | Interpretation |
|---|---|
| Median centred in box | Data is symmetric |
| Median closer to $Q_1$ | Positively skewed (tail to the right) |
| Median closer to $Q_3$ | Negatively skewed (tail to the left) |
| Short box, long whiskers | Data has extreme values but the middle 50% is consistent |
| Long box | The middle 50% of the data is very spread out |
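
The median-position rules in the table can be written as a short check. This is only a sketch: the function name `describe_skew` is made up for this example, and it looks at the box alone, so in an exam answer you should also look at the whisker lengths.

```python
def describe_skew(q1, q2, q3):
    """Classify skewness from where the median sits inside the box."""
    lower_gap = q2 - q1   # distance from Q1 to the median
    upper_gap = q3 - q2   # distance from the median to Q3
    if lower_gap == upper_gap:
        return "symmetric"
    if lower_gap < upper_gap:
        return "positively skewed"   # median closer to Q1
    return "negatively skewed"       # median closer to Q3


# Quartiles from the worked example: Q1 = 6, Q2 = 10, Q3 = 15
print(describe_skew(6, 10, 15))  # positively skewed
```

In the worked example the median is 4 units from $Q_1$ and 5 units from $Q_3$, so the skew is only slight.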

Comparing Two Box Plots

When asked to compare two data sets using box plots:

  1. Compare the medians — which group performed better overall?
  2. Compare the IQRs — which group was more consistent?
  3. Compare the ranges — which group had more extreme variation?
  4. Comment on skewness — are the distributions similar or different?
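
The first three comparisons are simple subtractions, as the sketch below shows. Both five-number summaries are hypothetical values invented for this illustration, and `compare` is a made-up helper name.

```python
# Hypothetical five-number summaries: (min, Q1, Q2, Q3, max)
class_a = (30, 45, 55, 62, 80)
class_b = (20, 50, 60, 75, 95)


def compare(a, b):
    """Return the median, IQR and range of two data sets side by side."""
    return {
        "median": (a[2], b[2]),
        "iqr": (a[3] - a[1], b[3] - b[1]),
        "range": (a[4] - a[0], b[4] - b[0]),
    }


print(compare(class_a, class_b))
# {'median': (55, 60), 'iqr': (17, 25), 'range': (50, 75)}
```

Here class B has the higher median (60 against 55), while class A is more consistent because its IQR (17) and range (50) are both smaller.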

Grouped Data

When data is given in class intervals (e.g., 40–50, 50–60, …):

  • You cannot find the exact five-number summary
  • Use the midpoint of each class to estimate the mean: midpoint $= \frac{\text{lower} + \text{upper}}{2}$
  • Use an ogive (cumulative frequency curve) to estimate $Q_1$, $Q_2$, and $Q_3$

Estimated Mean from a Frequency Table

$$\bar{x} = \frac{\sum f \times x_{\text{mid}}}{\sum f}$$

where $f$ = frequency and $x_{\text{mid}}$ = midpoint of each class.
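
As a quick check of the formula, the sketch below applies it to the test-score frequency table used in the ogive worked example in this section. The function name `estimated_mean` is made up for this illustration.

```python
# (lower boundary, upper boundary, frequency) for each class
classes = [
    (20, 30, 3),
    (30, 40, 7),
    (40, 50, 12),
    (50, 60, 15),
    (60, 70, 9),
    (70, 80, 4),
]


def estimated_mean(classes):
    """Estimate the mean of grouped data using class midpoints."""
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (low + high) / 2 for low, high, f in classes)
    return total_fx / total_f


print(estimated_mean(classes))  # 51.4
```

By hand: $\sum f \times x_{\text{mid}} = 75 + 245 + 540 + 825 + 585 + 300 = 2570$ and $\sum f = 50$, so $\bar{x} \approx \frac{2570}{50} = 51.4$.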

Drawing and Reading an Ogive (Cumulative Frequency Curve)

An ogive plots cumulative frequency against the upper boundary of each class. It lets you estimate the median and quartiles for grouped data.

Worked Example: Test scores of 50 students

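| Class | Frequency | Cumulative Frequency | Upper Boundary |
|---|---|---|---|
| $20 \leq x < 30$ | $3$ | $3$ | $30$ |
| $30 \leq x < 40$ | $7$ | $10$ | $40$ |
| $40 \leq x < 50$ | $12$ | $22$ | $50$ |
| $50 \leq x < 60$ | $15$ | $37$ | $60$ |
| $60 \leq x < 70$ | $9$ | $46$ | $70$ |
| $70 \leq x < 80$ | $4$ | $50$ | $80$ |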

How to draw: Plot each (upper boundary, cumulative frequency) point: $(30;\, 3)$, $(40;\, 10)$, $(50;\, 22)$, $(60;\, 37)$, $(70;\, 46)$, $(80;\, 50)$. Start the curve at $(20;\, 0)$. Connect with a smooth S-shaped curve.

How to read quartiles:

  • Median ($Q_2$): $\frac{50}{2} = 25$th value → go across from $25$ on the $y$-axis to the curve, then down to the $x$-axis → ≈ 52
  • $Q_1$: $\frac{50}{4} = 12.5$th value → read across from $12.5$ → ≈ 42
  • $Q_3$: $\frac{3 \times 50}{4} = 37.5$th value → read across from $37.5$ → ≈ 61

⚠️ Common ogive errors: Always plot against the upper boundary, NOT the midpoint. Start the curve at the lower boundary of the first class with cumulative frequency = 0.
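
The quartile readings above can be checked numerically. The sketch below joins the plotted points with straight lines instead of a smooth curve, which is an assumption made for this illustration, so its answers can differ slightly from values read off a hand-drawn ogive. The function name `read_ogive` is made up.

```python
# Ogive points (upper boundary, cumulative frequency), starting at (20, 0)
points = [(20, 0), (30, 3), (40, 10), (50, 22), (60, 37), (70, 46), (80, 50)]


def read_ogive(points, target):
    """Estimate the score at which the cumulative frequency reaches target."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= target <= c1:
            # straight-line interpolation between two neighbouring points
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target is outside the cumulative frequency range")


n = points[-1][1]  # total frequency = 50
for name, target in [("Q1", n / 4), ("Q2", n / 2), ("Q3", 3 * n / 4)]:
    print(name, round(read_ogive(points, target), 1))
# Q1 42.1
# Q2 52.0
# Q3 60.6
```

These agree with the graph readings of about 42, 52 and 61.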


🚨 Common Mistakes

  1. Not sorting data first: You MUST sort before finding the median and quartiles.
  2. Including the median in both halves: When $n$ is odd, the median itself is excluded from both the bottom and top halves when finding $Q_1$ and $Q_3$.
  3. Box plot not to scale: The number line must be drawn to scale — spacing must be proportional.
  4. Confusing range and IQR: Range = Max $-$ Min. IQR = $Q_3 - Q_1$. They measure different things.
  5. Grouped data: Don’t try to find exact quartiles from grouped data — use midpoints for the mean and an ogive for quartiles.

💡 Pro Tip

If a question asks “which measure of central tendency best represents the data?”:

  • Symmetric data → mean and median are similar, either works
  • Skewed data or outliers → the median is better (it’s not pulled by extreme values)
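
A short Python sketch shows the effect, using the `mean` and `median` functions from Python's standard `statistics` module. The second data set is hypothetical: it is the worked-example data with 18 replaced by an outlier of 100.

```python
from statistics import mean, median

original = [3, 5, 7, 8, 10, 12, 14, 16, 18]
with_outlier = [3, 5, 7, 8, 10, 12, 14, 16, 100]  # hypothetical: 18 replaced by 100

print(round(mean(original), 2), median(original))          # 10.33 10
print(round(mean(with_outlier), 2), median(with_outlier))  # 19.44 10
```

One extreme value pulls the mean from about 10.33 up to about 19.44, while the median stays at 10.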

📌 Where this leads in Grade 11: Statistics: Standard Deviation & Variance — measuring spread numerically with $\sigma$

