Percentiles

Percentile: the value below which a given percentage of the data falls.

Example: You are the fourth tallest person in a group of 20

16 of the 20 people are shorter than you, and 16/20 = 80%.

That means you are at the 80th percentile.

If your height is 1.85m then "1.85m" is the 80th percentile height in that group.
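Here is a small sketch of that calculation in Python. The list of heights is made up for illustration (only the 1.85m value and the group size of 20 come from the example), and `percent_below` is a hypothetical helper name:

```python
def percent_below(values, x):
    """Return the percentage of values that are strictly below x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

# Made-up heights in metres for a group of 20 (illustration only).
# 1.85 is the fourth tallest: 16 people are shorter, 3 are taller.
heights = [
    1.55, 1.58, 1.60, 1.62, 1.63, 1.65, 1.66, 1.68, 1.70, 1.71,
    1.73, 1.74, 1.76, 1.78, 1.80, 1.82, 1.85, 1.88, 1.90, 1.93,
]

print(percent_below(heights, 1.85))  # 80.0
```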

In Order

The data needs to be in order!

To calculate percentiles of height the data needs to be in height order (sorted by height).
To calculate percentiles of age the data needs to be in age order.
And so on.
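As a quick sketch, the same people can be put in a different order depending on what we want percentiles of. The names and values below are invented for illustration:

```python
# Invented sample data (illustration only).
people = [
    {"name": "Ana", "height_m": 1.62, "age": 34},
    {"name": "Ben", "height_m": 1.85, "age": 27},
    {"name": "Cy", "height_m": 1.74, "age": 41},
]

# Sort by height before working out height percentiles...
by_height = sorted(people, key=lambda p: p["height_m"])

# ...and sort by age before working out age percentiles.
by_age = sorted(people, key=lambda p: p["age"])

print([p["name"] for p in by_height])  # ['Ana', 'Cy', 'Ben']
print([p["name"] for p in by_age])     # ['Ben', 'Ana', 'Cy']
```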

Deciles

A related idea is Deciles (sounds like decimal and percentile together), which splits the data into 10% groups:

  • The 1st decile is the 10th percentile (the value that divides the data so that 10% is below it)
  • The 2nd decile is the 20th percentile (the value that divides the data so that 20% is below it)
  • etc!

Example: (continued)

You are at the 8th decile (the 80th percentile).
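Deciles can also be found with Python's standard `statistics.quantiles` function. This is only a sketch: the scores 0 to 100 are made-up data, and `method="inclusive"` is one of several conventions, so other tools may give slightly different cut points on other data.

```python
from statistics import quantiles

# Made-up data: the whole numbers 0, 1, 2, ..., 100 (illustration only).
scores = list(range(101))

# n=10 asks for the 9 cut points that split the data into 10 groups.
deciles = quantiles(scores, n=10, method="inclusive")

# deciles[0] is the 1st decile (10th percentile),
# deciles[7] is the 8th decile (80th percentile).
print(len(deciles))  # 9
print(deciles[7])    # 80.0
```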

Quartiles

Another related idea is Quartiles, which splits the data into quarters:

Example: 1, 3, 3, 4, 5, 6, 6, 7, 8, 8

The numbers are in order. Cut the list into quarters:

[Figure: Quartiles]

In this case Quartile 2 is half way between 5 and 6:

Q2 = (5+6)/2 = 5.5

And the result is:

  • Quartile 1 (Q1) = 3
  • Quartile 2 (Q2) = 5.5
  • Quartile 3 (Q3) = 7
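The result above can be checked with a short sketch. It uses the "median of each half" approach, which matches this example. Leaving the middle value out of both halves when the list has an odd length is an assumption here (it is one common convention, not the only one), and `quartiles` is a hypothetical helper name.

```python
from statistics import median

def quartiles(sorted_data):
    """Return (Q1, Q2, Q3) using the median of each half.

    Assumes sorted_data is already in order. For an odd number of
    values the middle value is left out of both halves (one convention).
    """
    n = len(sorted_data)
    half = n // 2
    lower = sorted_data[:half]
    upper = sorted_data[half:] if n % 2 == 0 else sorted_data[half + 1:]
    return median(lower), median(sorted_data), median(upper)

data = [1, 3, 3, 4, 5, 6, 6, 7, 8, 8]
print(quartiles(data))  # (3, 5.5, 7)
```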

The Quartiles also divide the data into divisions of 25%, so:

  • Quartile 1 (Q1) can be called the 25th percentile
  • Quartile 2 (Q2) can be called the 50th percentile
  • Quartile 3 (Q3) can be called the 75th percentile

Example: (continued)

For 1, 3, 3, 4, 5, 6, 6, 7, 8, 8:

  • The 25th percentile = 3
  • The 50th percentile = 5.5
  • The 75th percentile = 7

Estimating Percentiles

We can estimate percentiles from a line graph.

Example: Shopping

A total of 10,000 people visited the shopping mall over 12 hours:

Time (hours)    People
     0               0
     2             350
     4           1,100
     6           2,400
     8           6,500
    10           8,850
    12          10,000

a) Estimate the 30th percentile (when 30% of the visitors had arrived).

b) Estimate the percentile reached after 11 hours (the percentage of visitors who had arrived by then).

First draw a line graph of the data: plot the points and join them with a smooth curve:

 

a) The 30th percentile occurs when the visits reach 3,000 (30% of 10,000).

Draw a line horizontally across from 3,000 until you hit the curve, then draw a line vertically downwards to read off the time on the horizontal axis:

So the 30th percentile occurs after about 6.5 hours.
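As a rough check without a graph, we can join the points with straight lines instead of a smooth curve. This is only a sketch: straight-line interpolation is a swapped-in simplification, so its answer (about 6.3 hours) differs a little from the value read off a hand-drawn curve. The function name `time_at_percentile` is made up for this example.

```python
# Data from the table: hours since opening and total visitors so far.
hours = [0, 2, 4, 6, 8, 10, 12]
people = [0, 350, 1100, 2400, 6500, 8850, 10000]

def time_at_percentile(pct):
    """Estimate when pct% of all visitors had arrived (straight-line estimate)."""
    target = pct * people[-1] / 100  # e.g. 30% of 10,000 = 3,000
    for i in range(1, len(people)):
        if people[i] >= target:
            t0, t1 = hours[i - 1], hours[i]
            p0, p1 = people[i - 1], people[i]
            # Move along the straight line between the two neighbouring points.
            return t0 + (target - p0) / (p1 - p0) * (t1 - t0)
    raise ValueError("pct must be between 0 and 100")

print(round(time_at_percentile(30), 2))  # 6.29
```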

 

b) To estimate the percentile of visits after 11 hours: draw a line vertically up from 11 until you hit the curve, then draw a line horizontally across to read off the number of visits on the vertical axis:

So the visits at 11 hours were about 9,500 out of 10,000, which is the 95th percentile.
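The same kind of rough check works in this direction too. Again this is a sketch that joins the points with straight lines rather than a smooth curve, so it gives roughly the 94th percentile instead of the graph's estimate of 95. The function name `percentile_at_time` is made up for this example.

```python
# Data from the table: hours since opening and total visitors so far.
hours = [0, 2, 4, 6, 8, 10, 12]
people = [0, 350, 1100, 2400, 6500, 8850, 10000]

def percentile_at_time(t):
    """Estimate the percentage of all visitors who had arrived by time t."""
    for i in range(1, len(hours)):
        if hours[i] >= t:
            t0, t1 = hours[i - 1], hours[i]
            p0, p1 = people[i - 1], people[i]
            # Straight-line estimate of the visitor count at time t.
            visitors = p0 + (t - t0) / (t1 - t0) * (p1 - p0)
            return 100 * visitors / people[-1]
    raise ValueError("t must be between 0 and 12 hours")

print(percentile_at_time(11))  # 94.25
```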