Posted on Categories Coding, data science, Pragmatic Data Science, Statistics, TutorialsTags , , , , 3 Comments on Revisiting Cleveland’s The Elements of Graphing Data in ggplot2

Revisiting Cleveland’s The Elements of Graphing Data in ggplot2

I was flipping through my copy of William Cleveland’s The Elements of Graphing Data the other day; it’s a book worth revisiting. I’ve always liked Cleveland’s approach to visualization as statistical analysis. His quest to ground visualization principles in the context of human visual cognition (he called it “graphical perception”) generated useful advice for designing effective graphics [1].

I confess I don’t always follow his advice. Sometimes it’s because I don’t agree with him, but also it’s because I use ggplot for visualization, and I’m lazy. I like ggplot because it excels at layering multiple graphics into a single plot and because it looks good; but deviating from the default presentation is often a bit of work. How much am I losing out on by this? I decided to do the work and find out.

Details of specific plots aside, the key points of Cleveland’s philosophy are:

  • A graphic should display as much information as it can, with the lowest possible cognitive strain to the viewer.
  • Visualization is an iterative process. Graph the data, learn what you can, and then regraph the data to answer the questions that arise from your previous graphic.

Of course, when you are your own viewer, part of the cognitive strain in visualization comes from difficulty generating the desired graphic. So we’ll start by making the easiest possible ggplot graph, and working our way from there — Cleveland style.

Continue reading Revisiting Cleveland’s The Elements of Graphing Data in ggplot2

Posted on Categories Exciting Techniques, Expository Writing, Mathematics, Pragmatic Data Science, Pragmatic Machine Learning, StatisticsTags , , , , , , 7 Comments on Good Graphs: Graphical Perception and Data Visualization

Good Graphs: Graphical Perception and Data Visualization

What makes a good graph? When faced with a slew of numeric data, graphical visualization can be a more efficient way of getting a feel for the data than going through the rows of a spreadsheet. But do we know if we are getting an accurate or useful picture? How do we pick an effective visualization that neither obscures important details, or drowns us in confusing clutter? In 1968, William Cleveland published a text called The Elements of Graphing Data, inspired by Strunk and White’s classic writing handbook The Elements of Style . The Elements of Graphing Data puts forward Cleveland’s philosophy about how to produce good, clear graphs — not only for presenting one’s experimental results to peers, but also for the purposes of data analysis and exploration. Cleveland’s approach is based on a theory of graphical perception: how well the human perceptual system accomplishes certain tasks involved in reading a graph. For a given data analysis task, the goal is to align the information being presented with the perceptual tasks the viewer accomplishes the best. Continue reading Good Graphs: Graphical Perception and Data Visualization