There are several ways to measure the center of a set of data. The mean, median, mode and midrange all have their advantages and limitations in expressing the middle of the data. Of these measures of center, the median is the most resistant to outliers. It marks the middle of the data in the sense that half of the data lies below the median and half lies above it.
There’s no reason we have to stop at finding just the middle. What if we decided to continue this process? We could calculate the median of the bottom half of our data. One half of 50% is 25%. Thus half of half, or one quarter, of the data would be below this. Since we are dealing with a quarter of the original set, this median of the bottom half of the data is called the first quartile, and is denoted by Q1.
The Third Quartile
There is no reason we had to look only at the bottom half of the data. Instead, we could have looked at the top half and performed the same steps as above. The median of this half, which we will denote by Q3, also splits the data set into quarters. However, this number marks off the top one quarter of the data. Thus three quarters of the data is below our number Q3. This is why we call Q3 the third quartile (and this explains the 3 in the notation).
To make this all clear, let’s look at an example. It may be helpful to first review how to calculate the median of some data. Start with the following data set:
1, 2, 2, 3, 4, 6, 6, 7, 7, 7, 8, 11, 12, 15, 15, 15, 17, 17, 18, 20
There are a total of twenty data points in the set. We begin by finding the median. Since there is an even number of data values, the median is the mean of the tenth and eleventh values. In other words, the median is:
(7 + 8)/2 = 7.5.
Now look at the bottom half of the data. The median of this half is found between the fifth and sixth values of:
1, 2, 2, 3, 4, 6, 6, 7, 7, 7
Thus the first quartile is Q1 = (4 + 6)/2 = 5.
To find the third quartile, look at the top half of the original data set. We need to find the median of:
8, 11, 12, 15, 15, 15, 17, 17, 18, 20
Here the median is (15 + 15)/2 = 15. Thus the third quartile Q3 = 15.
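The steps of this example can be sketched in a few lines of Python. This is a minimal illustration rather than a standard routine: the function name quartiles is hypothetical, and the handling of an odd number of values (leaving the median out of both halves) is an assumption, since the example above only covers an even count. Statistical software often uses other conventions, so library results may differ slightly.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Assumption: with an odd count, the median itself is excluded
    # from both halves. With an even count the halves split cleanly.
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return median(lower), median(data), median(upper)

data = [1, 2, 2, 3, 4, 6, 6, 7, 7, 7, 8, 11, 12, 15, 15, 15, 17, 17, 18, 20]
q1, med, q3 = quartiles(data)
print(q1, med, q3)  # 5.0 7.5 15.0
```

The printed values match the hand calculation: Q1 = 5, median = 7.5 and Q3 = 15.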
Interquartile Range and Five Number Summary
Quartiles help to give us a fuller picture of our data set as a whole. The first and third quartiles give us information about the internal structure of our data. The middle half of the data falls between the first and third quartiles, and is centered about the median. The difference between the third and first quartiles, Q3 - Q1, is called the interquartile range, and it shows how the data is arranged about the median. A small interquartile range indicates data that is clumped about the median. A larger interquartile range shows that the data is more spread out.
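As a quick check on the example data, the interquartile range can be computed directly. The sketch below is only an illustration: the variable names are made up for this example, and the simple split into halves assumes an even number of values, as in the data set above.

```python
from statistics import median

data = sorted([1, 2, 2, 3, 4, 6, 6, 7, 7, 7,
               8, 11, 12, 15, 15, 15, 17, 17, 18, 20])

half = len(data) // 2      # even count, so the two halves split cleanly
q1 = median(data[:half])   # median of the bottom half
q3 = median(data[half:])   # median of the top half

iqr = q3 - q1              # interquartile range
print(iqr)  # 10.0
```

For this data set the interquartile range is 15 - 5 = 10.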
A more detailed picture of the data can be obtained by knowing the highest value, called the maximum value, and the lowest value, called the minimum value. The minimum, first quartile, median, third quartile and maximum are a set of five values called the five number summary. An effective way to display these five numbers is a boxplot, also called a box and whisker graph.
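The five number summary for the example data can be gathered in one short function. This is a sketch under stated assumptions: the name five_number_summary is hypothetical, and the quartiles are taken as medians of the bottom and top halves, with the median left out of both halves when the count is odd.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Assumption: for an odd count, skip the middle value in both halves.
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return data[0], median(lower), median(data), median(upper), data[-1]

data = [1, 2, 2, 3, 4, 6, 6, 7, 7, 7, 8, 11, 12, 15, 15, 15, 17, 17, 18, 20]
print(five_number_summary(data))  # (1, 5.0, 7.5, 15.0, 20)
```

These are the five values a boxplot draws: the whiskers reach to the minimum and maximum, the box runs from Q1 to Q3, and a line inside the box marks the median.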