11 November 2009

Box plots

Let's say you have the scores from a test that 500 people spread across 5 departments took. How can we better understand this data to make sense of how different departments performed? Here are some ideas - easiest to most involved, but also least useful to most useful:
  • Easiest: present a table of the mean scores. 
  • Median score: By choosing this instead of the average, we are making less assumptions about the spread of the scores, so perhaps the median would be a better choice
  • Sort the departments by median score
  • Create a bar chart with this sorted data
  • Create a box plot with this data
Most of us haven't created a box plot since high school. A quick reminder - a center line in a box showing the median value (typically), a box spanning this value,  showing the scores in the 25th to 75th percentile (or sometimes a standard deviation either side), and another set of bars showing the maximum and minimum, or maybe something like the 5th and 95th percentile. I've taken this standard representation of the box plot and decreased the "chart junk" - the extra lines and information that do nothing in aiding our interpretation of the data, while also improving the readability of the data (I think, anyway).

Here's the result for our 5 departments. The departments are sorted by highest to lowest median score. The median is marked in a thick red line, making it easy to quickly compare the scores. There is no axis line for the horizontal axis - no need for one. The gridlines are a very pale grey. The boxes marking the percentiles are filled, but in a light color, and the caps on the max/min have gone, but the line is a little thicker. Quickly you can begin to form opinions about the spread of the data and the comparative performance between departments.

Even better is that now this sort of chart is suitable for inclusion in a dashboard - you can reduce the size of it without sacrificing the comparisons it provides. Box plots are not standard in Excel - I'll post a tutorial on how to create ones just like these soon.

4 comments:

  1. I've written some tutorials for box plots in Excel:

    Vertical Box and Whisker Plots
    Horizontal Box and Whisker Plots
    Simple Vertical Box Plots

    I've also designed a (commercial) box and whisker utility for Excel:
    PTS Box and Whisker Plot Utility

    ReplyDelete
  2. Nice examples Jon - thanks for the links.

    Just took a quick whiz through Excel 2010 - no box plots, so hopefully your commercial option will have validity for years to come..

    ReplyDelete
  3. A free Excel add-in offers in-cell boxplot (and other charts) :

    http://sparklines-excel.blogspot.com/

    Would be great to read what you think about it (especially when compared w/ Microsoft's toy)

    ReplyDelete
  4. @Fabrice: I love this add-in, and that's a great idea to do a comparison - maybe in the winner/loser format.

    ReplyDelete

Thank you for taking the time to read this blog and commenting.

ShareThis