20 January 2010

Data visualization challenge results, part one: addressing the data


In November I entered a data visualization challenge on Chandoo's excellent Excel and charting blog. I was honored that my entry was voted the winner (by one vote), so I thought it would be useful to describe the steps I took in designing it.

Chandoo provided us with a data file containing two years of raw revenue data for four sales people with information on sales per region, product, and size of company sold to.

Chandoo created the data to show steady growth over the two years, with a little bit of randomness thrown in. As this perhaps didn't allow for an interesting story, we were allowed to change the data, but not to add new columns of data (e.g. profit, expenses). Generating data to tell a story is harder than it sounds, especially when you want there to be reasons why one person wasn't performing as well. While this seems a little involved for an online competition, it was very similar to the thinking you would have to do anyway when designing dashboards: "of the available data, what is the most important to show to the specific end-user for this situation, and what can be compared to what to give insight into these data?"

Real data would usually reflect these insights, so I felt I had to make the data more real. I started with stories about the sales, for example: like most sales data, it fluctuates on an X month cycle as sales targets are set and deadlines approach; all the sales people suffered in late 2008 due to the recession, though some experienced a more abrupt drop in sales and some recovered more quickly; Chewbacca (seriously) sells more in the East region, but across all company sizes; Luke Skywalker sells across the country, but mostly to the larger customers.

You can see what I did here (xls 2003) to create the data (second sheet, at the bottom). I created data for the regions that described an overall pattern. From there I created modifiers based on who sold what where, and added a variable that controlled how strongly these modifiers altered the sales data. With the addition of the INDEX function, multiple IF statements, and some randomness thrown in, the data was created. It certainly isn't elegant - I'm sure that it could have been much more concise and still told the stories I wanted, but it worked. Next up: designing the dashboard for the end-user.
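For anyone who would rather read code than spreadsheet formulas, here is a rough Python sketch of the same idea: a regional baseline pattern, per-person modifiers, a strength variable that scales those modifiers, and a bit of randomness. It is not a translation of the workbook. The function names, the growth rate, the dip months, the "West" region, and every modifier value are assumptions made up for illustration; only the salespeople's names and the East region come from the stories above.

```python
import random

def regional_baseline(month, start=100.0, growth=0.02,
                      dip_months=range(9, 12), dip=0.7):
    """Overall regional pattern: steady growth with a temporary dip.

    month 0 is assumed to be January 2008, so months 9-11 stand in
    for the late-2008 recession. All numbers here are invented.
    """
    value = start * (1 + growth) ** month
    if month in dip_months:
        value *= dip
    return value

# Hypothetical modifiers: who sells how much, where (1.0 = no effect).
MODIFIERS = {
    ("Chewbacca", "East"): 1.5,       # sells more in the East
    ("Chewbacca", "West"): 0.8,
    ("Luke Skywalker", "East"): 1.0,  # sells evenly across regions
    ("Luke Skywalker", "West"): 1.0,
}

def apply_modifier(base, modifier, strength):
    """Scale the modifier's effect: strength 0 ignores it, 1 applies it fully."""
    return base * (1 + strength * (modifier - 1))

def generate_sales(strength=1.0, noise=0.05, months=24, seed=0):
    """Return (month, person, region, revenue) rows for two years of data."""
    rng = random.Random(seed)  # seeded so the "random" data is repeatable
    rows = []
    for month in range(months):
        for (person, region), modifier in MODIFIERS.items():
            value = apply_modifier(regional_baseline(month), modifier, strength)
            value *= 1 + rng.uniform(-noise, noise)  # randomness thrown in
            rows.append((month, person, region, round(value, 2)))
    return rows

if __name__ == "__main__":
    for row in generate_sales()[:4]:
        print(row)
```

Turning `strength` down towards 0 makes everyone's numbers converge on the regional pattern, while turning it up exaggerates the differences between people, which is roughly the role the strength variable played in the spreadsheet.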

1 comment:

  1. Thanks for sharing! I've taken the liberty to upload this dataset to Visualize Free so people can create their own interactive dashboards from it. http://visualizefree.com/show.jsp?id=QCgqPUrE

    Here's an example dashboard I threw together:
    http://visualizefree.com/share.jsp?id=ZyBtRIoa
    Browse some of the bookmarks I made.
