There are a variety of descriptive statistics. To name a few, numbers such as the mean, median, mode, skewness, kurtosis, standard deviation, first quartile and third quartile each tell us something about our data. Rather than looking at these descriptive statistics individually, sometimes combining them will help to give us a more complete picture. With this end in mind, the five number summary is a convenient way to combine five descriptive statistics.
Which Five Numbers?
It is clear that there are to be five numbers in our summary, but which five? The numbers are chosen to describe the center of our data, as well as how spread out the data points are. With this in mind, the five number summary consists of the following:
- The minimum – this is the smallest value in our data set.
- The first quartile – this number is denoted Q1 and 25% of our data falls below the first quartile.
- The median – this is the midway point of the data. 50% of all data falls below the median.
- The third quartile – this number is denoted Q3 and 75% of our data falls below the third quartile.
- The maximum – this is the largest value in our data set.
The mean and standard deviation can also be used together to convey the center and the spread of a set of data. However, both of these statistics are susceptible to outliers. The median, first quartile and third quartile are not as heavily influenced by outliers.
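To see this sensitivity in practice, here is a minimal sketch using Python's standard `statistics` module. The two small data sets are hypothetical and chosen only for illustration; the second differs from the first by a single extreme value.

```python
from statistics import mean, median, stdev

# Hypothetical data, chosen only for illustration.
typical = [2, 4, 4, 5, 6, 7, 8]
with_outlier = [2, 4, 4, 5, 6, 7, 100]  # largest value replaced by an outlier

for label, values in (("typical", typical), ("with outlier", with_outlier)):
    print(
        f"{label:>12}: mean={mean(values):.2f}  "
        f"stdev={stdev(values):.2f}  median={median(values)}"
    )
```

In this sketch the mean and standard deviation change substantially once the outlier is introduced, while the median stays at 5.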
Given the following set of data, we will report the five number summary:
1, 2, 2, 3, 4, 6, 6, 7, 7, 7, 8, 11, 12, 15, 15, 15, 17, 17, 18, 20
There are a total of twenty points in the data set. The median is thus the average of the tenth and eleventh data values or:
(7 + 8)/2 = 7.5.
The median of the bottom half of the data is the first quartile. The bottom half is:
1, 2, 2, 3, 4, 6, 6, 7, 7, 7
Thus we calculate Q1 = (4 + 6)/2 = 5.
The median of the top half of the original data set is the third quartile. We need to find the median of:
8, 11, 12, 15, 15, 15, 17, 17, 18, 20
Thus we calculate Q3 = (15 + 15)/2 = 15.
We assemble all of the above results together, and report that the five number summary for the above set of data is 1, 5, 7.5, 15, 20.
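The calculation can be sketched in Python as follows. The function name `five_number_summary` is our own, not a library function, and it follows the median-of-halves method used in this example. One assumption goes beyond the text: when the number of data points is odd, the sketch leaves the median out of both halves, which is one common convention. Statistical software often uses other quartile rules and may report slightly different values.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) via the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    half = n // 2
    lower = data[:half]
    # Assumption: for an odd count, the median itself is excluded from both halves.
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

data = [1, 2, 2, 3, 4, 6, 6, 7, 7, 7, 8, 11, 12, 15, 15, 15, 17, 17, 18, 20]
print(five_number_summary(data))  # (1, 5.0, 7.5, 15.0, 20)
```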
Five number summaries can be compared to one another. We will find that two sets with similar means and standard deviations may have very different five number summaries. To easily compare two five number summaries at a glance, we can use a boxplot, also called a box and whiskers graph.
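As an illustration of this point, the sketch below builds two hypothetical data sets that share the same mean and the same population standard deviation, yet have clearly different five number summaries. The data and the helper `five_number_summary` are our own constructions for illustration, and the helper assumes the median-of-halves rule for quartiles.

```python
from statistics import mean, median, pstdev

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) via the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Assumption: for an odd count, the median itself is excluded from both halves.
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical data sets, constructed so both have mean 6 and
# population standard deviation 3.
set_a = [3, 3, 3, 3, 9, 9, 9, 9]    # values split into two clusters
set_b = [0, 6, 6, 6, 6, 6, 6, 12]   # values concentrated at the center

for name, values in (("A", set_a), ("B", set_b)):
    print(name, mean(values), pstdev(values), five_number_summary(values))
```

Set A has quartiles at 3 and 9, while in set B both quartiles coincide with the median at 6, a difference that the mean and standard deviation alone do not reveal.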