One of the most important lessons I’ve learned as a news developer is that there’s no right way to build a data visualization. But there are plenty of wrong ways.
Reporters frequently come to us with story ideas and, if we’re lucky, actual datasets. It’s usually up to us to figure out how best to visualize the data, and the design process often continues long after our development work has begun.
Such was the case for our recent Seattle city council district interactive. The project featured a choropleth map inviting readers to compare the city’s seven newly created districts across a variety of demographic measures, including age, income, and ethnicity.
The initial map was fairly easy to set up. For each demographic category, we colored the districts on a single-hue progression ranging from white (indicating that zero percent of the district’s population fell into that category) to purple (representing the maximum percentage, i.e. the district with the highest share of residents falling into that category). Most districts ended up somewhere in the middle, as in these views of the racial distribution of Seattle’s white (left) and black (right) populations:
These maps succeeded in providing some insight into the city’s racial makeup. It was clear, for instance, that District 2 (Rainier Valley) housed the majority of Seattle’s black population, whereas the rest of the city was predominantly white.
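For readers who want to see the mechanics, the single-hue progression described above can be sketched in a few lines of Python. This is not the code behind the interactive: the function names and the exact purple are assumptions made for illustration, and the only data points used are the two percentages quoted in this piece.

```python
def lerp_channel(start: int, end: int, t: float) -> int:
    """Linearly interpolate one RGB channel for t in [0, 1]."""
    return round(start + (end - start) * t)


def sequential_color(value, max_value, low=(255, 255, 255), high=(84, 39, 143)):
    """Map a percentage onto a white-to-purple ramp.

    Zero maps to `low` (white) and `max_value` maps to `high`.
    The purple used here is a hypothetical choice, not the published one.
    """
    if max_value <= 0:
        t = 0.0
    else:
        # Clamp so out-of-range values still produce a valid color.
        t = min(max(value / max_value, 0.0), 1.0)
    r, g, b = (lerp_channel(lo, hi, t) for lo, hi in zip(low, high))
    return f"#{r:02x}{g:02x}{b:02x}"


# Share of white residents in the two districts cited in the article.
white_share = {"District 2": 31.0, "District 6": 83.0}
top = max(white_share.values())

for district, pct in white_share.items():
    print(district, sequential_color(pct, top))
```

Because the ramp is anchored at zero and at the category's own maximum, every district's color depends on whichever district happens to have the highest value.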
There’s a limit to what differences the human eye can distinguish, however, and we were concerned that we were drowning out our data in a sea of subtly different shades of purple. We experimented with bumping up the contrast — first by intensifying the saturation of the maximum value (left), and then by decreasing the saturation of the minimum value (right):
Neither of these solutions left us happy. Both seemed to be misrepresenting the data, either by overplaying larger numbers or underplaying smaller, but not insignificant, ones.
In retrospect, the main problem with our initial approach was that it limited us to a single-hue spectrum defined by absolute maximum and minimum values. We had to apply the same color rules to multiple demographic views, and that meant that small variances between districts (e.g. the share of people who bike to work, ranging from 2 percent to 6 percent) appeared blown out of proportion, while larger variances appeared washed out.
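One way to see the effect is to measure how much of the color ramp each category occupies when the ramp runs from zero to that category's maximum. The sketch below is an illustration under that assumption, not a reconstruction of the original code. It uses the bike-commute range mentioned above and the 31 and 83 percent figures for white residents quoted later in this piece, and the helper name is hypothetical.

```python
def ramp_span(values):
    """Fraction of a zero-to-max ramp covered between the lowest and highest value."""
    top = max(values)
    return (top - min(values)) / top


bike_to_work = [2.0, 6.0]      # lowest and highest district, in percent
white_residents = [31.0, 83.0]  # lowest and highest district, in percent

print(f"bike commuters:  {ramp_span(bike_to_work):.2f} of the ramp")
print(f"white residents: {ramp_span(white_residents):.2f} of the ramp")
```

Under these assumptions, a gap of 4 percentage points stretches across roughly two thirds of the ramp, about as much as a gap of 52 percentage points does, so the two maps look similarly dramatic.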
We decided to return to our original color progression, with the addition of a legend and some new styling:
We were still concerned that readers were going to have to work too hard to translate the map’s colors into numbers. It was easy enough, for example, to see that District 2 had the lowest percentage of white people (31 percent of the district’s population), but it was much more difficult to see that District 6 (Ballard) had the highest percentage (83 percent), 16 percentage points higher than the citywide average.
In the week before publication, we buckled down and made a series of significant changes. The result was a map based on a two-tone color progression, indicating how each district stacked up to the citywide average for each demographic. We also replaced the legend boxes with a gradient scale to make the legend more readable:
This new approach addressed several of our concerns. Switching to a two-tone system made it much easier to identify small differences that fell just above or below average. Additionally, by centering the progression around an average value rather than scaling it from an absolute maximum, we were able to provide a more accessible, at-a-glance view of what the city looked like as a whole.
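A two-tone progression centered on an average can be sketched as follows. Treat this as a minimal illustration rather than the published implementation: the colors, the function name, and the choice to scale both sides by the largest deviation from the average are all assumptions. The citywide average of 67 percent is inferred from the figures above (83 percent is 16 points higher than the average).

```python
def diverging_color(value, average, max_deviation,
                    below=(230, 97, 1), neutral=(247, 247, 247), above=(94, 60, 153)):
    """Map a value to a color based on its distance from the average.

    Values at the average get `neutral`; values `max_deviation` below or above
    it get the full `below` or `above` color. All three colors are hypothetical.
    """
    if max_deviation <= 0:
        t = 0.0
    else:
        # Signed position in [-1, 1]: negative is below average.
        t = min(max((value - average) / max_deviation, -1.0), 1.0)
    target = above if t >= 0 else below
    weight = abs(t)
    r, g, b = (round(n + (c - n) * weight) for n, c in zip(neutral, target))
    return f"#{r:02x}{g:02x}{b:02x}"


citywide_average = 67.0  # inferred from the article's figures
white_share = {"District 2": 31.0, "District 6": 83.0}

# Use the largest gap from the average so both sides share one scale.
max_dev = max(abs(pct - citywide_average) for pct in white_share.values())

for district, pct in white_share.items():
    print(district, diverging_color(pct, citywide_average, max_dev))
```

Sharing one deviation scale across both tones keeps equal distances from the average equally intense, which is one reasonable design choice among several.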
Reporters, editors, and developers all put their heads together to work out the best presentation for this map, and the final form didn’t materialize until fairly late in the process. Our efforts were well worth it. Fielding criticisms and suggestions at each stage of the design process allowed us to identify and slowly chip away at discrete problems, and resulted in a product that everyone was satisfied with.