I sea what you did there

Picking your cohort visualization

Or, 'When Area Charts Mislead'

Slideshow intro

When I wrote my first post about measuring gamification, I used stacked area charts to visualize the cohorts. Then I came across this post by someone smarter. Here's what I learned.

Stacked area

cohorts plotted as an area graph

Here, I intended to show that while the value being measured is basically level overall, the desirable cohorts within it are trending drastically downward. The chart gets that point across, but there's an important reason it isn't the right graph for real-world applications.

Stacked area, as perceived

the area graph showing why we naturally misread it

The issue is that our eye tends to judge the thickness of each stratum along a diagonal, perpendicular to its apparent slope, rather than vertically, where the value is actually encoded. A layer riding on a steep slope therefore looks thinner than it really is. This is exacerbated by the fact that our datapoints are pretty far apart in terms of pixels. This on-the-bias bias might be okay for real-time APM products with their closely spaced samples, but not for general business metrics.
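To put a rough number on that effect, here is a minimal sketch. It assumes a simplified geometric model in which the eye measures a band perpendicular to its slope, and that rise and run are expressed in the same on-screen units; the function name and the figures are made up for illustration.

```python
import math

def perceived_thickness(vertical_thickness, rise, run):
    """Perpendicular thickness of a band whose edges both slope by rise/run.

    Simplified model: the true (vertical) thickness is scaled by the cosine
    of the slope angle, which is what a perpendicular reading would give.
    """
    slope_angle = math.atan2(rise, run)
    return vertical_thickness * math.cos(slope_angle)

# Hypothetical cohort holding steady at 20 units of height.
flat = perceived_thickness(20, rise=0, run=30)    # layers below are level
steep = perceived_thickness(20, rise=30, run=30)  # layers below climb at 45 degrees
print(round(flat, 1), round(steep, 1))  # prints: 20.0 14.1
```

Under these assumptions, the same 20-unit cohort reads as roughly 14 units when it sits on a 45-degree slope, even though its value has not changed.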

Stacked column

cohorts plotted in stacked columns

Stacked columns cut out the bias. They're easy to scan, and you can see that the top cohort is growing as a proportion of the total. While the data hasn't changed, the graph is more accurate… but it seems to lack a sense of velocity.
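The "proportion of the total" reading can also be checked directly from the numbers. The sketch below uses made-up cohort names and values (not the data behind the charts) and assumes every period has a non-zero total.

```python
def cohort_shares(periods):
    """Convert raw per-cohort values into fractions of each period's total."""
    shares = []
    for values in periods:
        total = sum(values.values())  # assumed to be non-zero
        shares.append({name: value / total for name, value in values.items()})
    return shares

# Hypothetical data: the total stays at 100 while the mix shifts.
# The last key in each dict is the cohort that would sit on top of the stack.
periods = [
    {"engaged": 50, "casual": 30, "new": 20},
    {"engaged": 35, "casual": 30, "new": 35},
    {"engaged": 20, "casual": 30, "new": 50},
]
for share in cohort_shares(periods):
    print({name: round(value, 2) for name, value in share.items()})
```

In this invented example the top cohort goes from 20% to 50% of the total while the overall height of each column stays the same.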

Column

cohorts plotted in columns

Switch to a non-stacked column graph (and decrease the maximum value on the Y-axis) and here's what you get. To my eyes, this immediately conveys velocity and highlights outliers, while avoiding the visual bias of the area chart. But you might lose a sense of the overall contour of how these columns sum.
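If "velocity" is taken to mean the change from one period to the next (my reading, not a formal definition), it can be computed per cohort in a couple of lines. The values here are invented for illustration.

```python
def period_deltas(series):
    """Change between each pair of consecutive periods."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Hypothetical cohort values over three periods.
engaged = [50, 35, 20]
print(period_deltas(engaged))  # prints: [-15, -15]
```

A steady run of negative deltas like this is what the side-by-side columns make visible at a glance.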

Conclusion

There are some other options that could convey velocity, outliers, contour of the sum, and performance of each cohort, without suffering the bias of the stacked area chart. For instance:

  • a multiple-column chart for the cohorts (like my last example) overlaid with a line chart to show the overall trend
  • a column chart showing the total, decorated with inset pie charts to show the cohorts (though pie charts have their own issues)
  • or some kind of stock chart
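As a rough sketch of the first option, the code below lays out side-by-side columns per cohort and overlays a line for the total. It assumes matplotlib is installed; the function names, the 0.75 group width, and the shape of the input data (one dict of cohort values per period) are my own choices for illustration, not something taken from the charts above.

```python
def grouped_bar_offsets(cohort_count, group_width=0.75):
    """Bar width and per-cohort x offsets that center a group on its tick."""
    bar_width = group_width / cohort_count
    offsets = [(i - (cohort_count - 1) / 2) * bar_width for i in range(cohort_count)]
    return bar_width, offsets

def draw_combo(periods, labels, path):
    """Grouped columns per cohort with the period total drawn as a line."""
    # Imported here so the layout helper above works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    names = list(periods[0])
    bar_width, offsets = grouped_bar_offsets(len(names))
    positions = list(range(len(periods)))

    fig, ax = plt.subplots()
    for name, offset in zip(names, offsets):
        heights = [period[name] for period in periods]
        ax.bar([p + offset for p in positions], heights, width=bar_width, label=name)

    # Overlay the sum of all cohorts so the overall contour is not lost.
    totals = [sum(period.values()) for period in periods]
    ax.plot(positions, totals, color="black", marker="o", label="total")

    ax.set_xticks(positions)
    ax.set_xticklabels(labels)
    ax.legend()
    fig.savefig(path)
```

Calling `draw_combo(periods, ["Jan", "Feb", "Mar"], "cohorts.png")` with three periods of hypothetical data would write the combined chart to a PNG file. Whether the overlaid line and the columns should share one Y-axis is a judgment call; this sketch keeps a single axis so the heights stay directly comparable.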

I may cover some of these options in future posts. But the bottom line is that I will be avoiding the sexy stacked area chart for my cohort visualizations.