The Big Book of Dashboards - Jeffrey Shaffer
PART I
A STRONG FOUNDATION
Chapter 1
Data Visualization: A Primer
Color
Color is one of the most important things to understand in data visualization, and it is frequently misused. You should not use color just to spice up a boring visualization. In fact, many great data visualizations don't use color at all and are informative and beautiful.
In Figure 1.15, we see Shine Pulikathara's visualization that won the 2015 Tableau Iron Viz competition. Notice his simple use of color.
Figure 1.15 Winning visualization by Shine Pulikathara during the 2015 Tableau Iron Viz competition.
Source: Used with permission from Shine Pulikathara.
Color should be used purposefully. For example, color can be used to draw the attention of the reader, highlight a portion of data, or distinguish between different categories.
Use of Color
Color should be used in data visualization in three primary ways: sequential, diverging, and categorical.
In addition, there is often a need to highlight data or alert the reader to something important. Figure 1.16 offers an example of each of these color schemes.
Figure 1.16 Use of color in data visualization.
Sequential color is the use of a single color from light to dark. An example is encoding the total amount of sales by state in blue, where the darker blue shows higher sales and a lighter blue shows lower sales. Figure 1.17 shows the unemployment rate by state using a sequential color scheme.
Figure 1.17 Unemployment rate by state using a sequential color scheme.
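To make the idea of a sequential scheme concrete, here is a minimal sketch of how a value could be mapped onto a single hue running from light to dark. The two blue endpoints, the helper names, and the sample rates are all assumptions chosen for illustration; they are not the colors or data behind Figure 1.17.

```python
def lerp_color(start, end, t):
    """Blend two RGB tuples; t=0 returns start, t=1 returns end."""
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def sequential_color(value, vmin, vmax,
                     light=(222, 235, 247), dark=(8, 48, 107)):
    """Map a value in [vmin, vmax] to one hue: higher value, darker shade.

    The light and dark endpoints are illustrative blues (an assumption).
    """
    if vmax == vmin:
        t = 0.0
    else:
        t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp values that fall outside the range
    return lerp_color(light, dark, t)


def to_hex(rgb):
    """Format an RGB tuple as a hex string such as '#08306b'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical unemployment rates (percent) for three made-up states.
rates = {"State A": 2.0, "State B": 5.0, "State C": 8.0}
low, high = min(rates.values()), max(rates.values())
for state, rate in rates.items():
    print(state, to_hex(sequential_color(rate, low, high)))
```

A straight linear blend in RGB is the simplest choice; published palettes are usually tuned so that the steps look evenly spaced to the eye, which a plain blend does not guarantee.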
Diverging color is used to show a range diverging from a midpoint. This color can be used in the same manner as the sequential color scheme but can encode two different ranges of a measure (positive and negative) or a range of a measure between two categories. An example is the degree to which electorates may vote Democratic or Republican in each state, as shown in Figure 1.18.
Figure 1.18 Degree of Democratic (blue) versus Republican (red) voter sentiment in each state.
Diverging color can also be used to show the weather, with blue showing the cooler temperatures and red showing the hotter temperatures. The midpoint can be the average, the target, or zero in cases where there are positive and negative numbers. Figure 1.19 shows an example with profit by state, where profit (positive number) is shown in blue and loss (negative number) is shown in orange.
Figure 1.19 Profit by state using a diverging color scheme.
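The role of the midpoint can be sketched in a few lines of code. The example below assumes a midpoint of zero, with losses shading toward orange and profits toward blue, as in the description above. The specific RGB values, the function names, and the sample figures are assumptions for illustration only.

```python
def lerp_color(start, end, t):
    """Blend two RGB tuples; t=0 returns start, t=1 returns end."""
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def diverging_color(value, vmin, midpoint, vmax,
                    low=(230, 97, 1),       # orange end (assumed)
                    mid=(247, 247, 247),    # near-white midpoint (assumed)
                    high=(33, 102, 172)):   # blue end (assumed)
    """Map a value to a two-sided scale that diverges from the midpoint."""
    if value >= midpoint:
        span = vmax - midpoint
        t = (value - midpoint) / span if span else 0.0
        return lerp_color(mid, high, min(1.0, t))
    span = midpoint - vmin
    t = (midpoint - value) / span if span else 0.0
    return lerp_color(mid, low, min(1.0, t))


# Hypothetical profit figures: negative numbers are losses.
profits = {"State A": -100, "State B": 0, "State C": 200}
for state, profit in profits.items():
    print(state, diverging_color(profit, vmin=-100, midpoint=0, vmax=200))
```

Each side of the scale is normalized separately here, so the largest loss and the largest profit both reach full saturation even when their magnitudes differ. If equal distances from the midpoint should look equally intense, use the same span on both sides instead.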
Categorical color uses different color hues to distinguish between different categories. For example, we can establish categories involving apparel (e.g., shoes, socks, shirts, hats, and coats) or vehicle types (e.g., cars, minivans, sport utility vehicles, and motorcycles). Figure 1.20 shows quantity of office supplies in three categories.
Figure 1.20 Quantity of office supplies in three categories using a categorical color scheme.
Highlight color is used when there is something that needs to stand out to the reader, but not alert or alarm them. Highlights can be used in a number of ways, as in highlighting a certain data point, text in a table, a certain line on a line chart, or a specific bar in a bar chart. Figure 1.21 shows a slopegraph with a single state highlighted in blue.
Figure 1.21 Slopegraph showing sales by state, 2014–2015, using a single color to highlight the state of Washington.
Alerting color is used when there is a need to draw attention to something for the reader. In this case, it's often best to use bright, alarming colors, which will quickly draw the reader's attention, as in Figure 1.22.
Figure 1.22 Red and orange indicators to alert the reader that something on the dashboard needs attention.
It is also possible to have a categorical-sequential color scheme. In this case, each category has a distinct hue that is darker or lighter depending on the measurement it is representing. Figure 1.23 shows an example of a four-region map using categorical colors (i.e., gray, blue, yellow, and brown) but at the same time encoding a measure in those regions using sequential color; let's assume that sales are higher in states with darker shading.
Figure 1.23 Sales by region using four categorical colors and the total sales shown with sequential color.
Color Vision Deficiency (Color Blindness)
Based on research (Birch 1993), approximately 8 percent of males have color vision deficiency (CVD) compared to only 0.4 percent of females. This deficiency is caused by a lack of one of three types of cones within the eye needed to see all color. The deficiency commonly is referred to as “color blindness,” but that term isn't entirely accurate. People with CVD can in fact see color, but they cannot distinguish colors in the same way as the rest of the population. The more accurate term is “color vision deficiency.” Depending on which cone is lacking, it can be very difficult for people with CVD to distinguish between certain colors because of the way they see the color spectrum.
There are three types of CVD:
1. Protanopia is the lack of long-wave cones (red weak).
2. Deuteranopia is the lack of medium-wave cones (green weak).
3. Tritanopia is the lack of short-wave cones (blue). (This is very rare, affecting less than 0.5 percent of the population.)
CVD is mostly hereditary, and, as you can see from the numbers, it primarily afflicts men. Eight percent of men may seem like a small number, but consider that in a group of nine men, there is more than a 50 percent chance that at least one of them has CVD. In a group of 25 men, there is an 88 percent chance that at least one of them has CVD. The rate is also higher among Caucasian men, reaching as high as 11 percent. In larger companies or when a data visualization is presented to the general public, designers must understand CVD and design with it in mind.
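The group figures follow from the complement rule: the chance that at least one man in a group has CVD is one minus the chance that none of them does. Here is a minimal sketch of that arithmetic, assuming each man independently has an 8 percent chance of having CVD (the independence assumption and the function name are ours).

```python
def prob_at_least_one(rate, group_size):
    """Probability that at least one of group_size people is affected,
    assuming each person is affected independently with the given rate."""
    return 1.0 - (1.0 - rate) ** group_size


for n in (9, 25):
    print(f"{n} men: {prob_at_least_one(0.08, n):.1%}")
# 9 men: 52.8%
# 25 men: 87.6%
```

The second figure rounds to the 88 percent quoted above.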
The primary problem among people with CVD is with the colors red and green. This is why it is best to avoid using red and green together and, in general, to avoid the commonly used traffic light colors. We discuss this issue further in Chapter 33 and offer some solutions for using red and green together.
Seeing the Problem for Yourself
Let's look at some examples of how poor choice of color can create confusion for people with CVD.
In Figure 1.24, the chart on the left uses the traditional traffic light colors red, yellow, and green. The example on the right is a protanopia simulation for CVD.
Figure 1.24 Bar chart using the traffic light colors and a protanopia simulation. Notice the red and green bars in the panel on the right are very difficult to differentiate from one another for a person with protanopia.
One common solution among data visualization practitioners is to use blue and orange. Using blue instead of green for good and orange instead of red for bad works well because almost everyone (with very rare exceptions) can distinguish blue and orange from each other. This blue-orange palette is often referred to as being “color-blind friendly.”
Using Figure 1.25, compare the blue/orange color scheme and a protanopia simulation of CVD again.
Figure 1.25 Bar chart using a color-blind-friendly blue and orange palette and a protanopia simulation.
The Problem Is Broader Than Just Red and Green
The use of red and green is discussed frequently in the field of data visualization, probably because the traffic light color palette is prevalent in many software programs and is commonly used in business today. It is common in Western culture to associate red with bad and green with good. However, it is important to understand that the problem in differentiating color for someone with CVD is much more complex than just red and green. Since red, green, and orange all appear to be brown for someone with strong CVD, it would be more accurate to say “Don't use red, green, brown, and orange together.”
Figure 1.26 shows a scatterplot using brown, orange, and green together for three categories. When applying protanopia simulation, the dots in the scatterplot appear to be a very similar color.
Figure 1.26 Scatterplot simulating color vision deficiency for someone with protanopia.
One color combination that is frequently overlooked is blue and purple together. In an RGB (red-green-blue) color model, purple is achieved by using blue and red together. If someone with CVD has issues with red, then he or she may also have issues with purple, which would appear to look like blue. Other color combinations can be problematic as well. For example, people may have difficulty with pink or red used with gray or gray used together with brown.
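The reasoning about purple can be illustrated with a deliberately crude toy model that simply discards the red channel of an RGB color. This is not a physiologically accurate simulation of protanopia or any other form of CVD, and the RGB values below are illustrative assumptions. It only shows why two colors that differ mainly in their red component can end up looking alike.

```python
def drop_red(rgb):
    """Toy model only: ignore the red channel entirely.

    This is an assumption made for illustration, not a real CVD simulation.
    """
    _, g, b = rgb
    return (0, g, b)


blue = (0, 0, 255)
purple = (128, 0, 255)  # the same blue with some red added (illustrative)

print(drop_red(blue))    # (0, 0, 255)
print(drop_red(purple))  # (0, 0, 255)
print(drop_red(blue) == drop_red(purple))  # True
```

For real design work, rely on a proper simulator such as the tools listed at the end of this section rather than on a channel-dropping shortcut like this one.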
Figure 1.27 shows another scatterplot, this time using blue, purple, magenta, and gray. When applying deuteranopia simulation, the dots in the scatterplot appear to be a very similar color of gray.
Figure 1.27 Scatterplot simulating color vision deficiency for someone with deuteranopia.
It's important to understand these issues when designing visualizations. If color is used to encode data and it's necessary for readers to distinguish among colors to understand the visualization, then consider using color-blind-friendly palettes. Here are a few resources that you can use to simulate the various types of CVD for your own visualizations.
Adobe Illustrator CC. This program offers a built-in CVD simulation in the View menu under Proof Setup.
Chromatic Vision Simulator (free). Kazunori Asada's superb website allows users to upload images and simulate how they would appear to people with different forms of CVD. See http://asada.tukusi.ne.jp/webCVS/
NoCoffee vision simulator (free). This free simulator for the Chrome browser allows users to simulate websites and images directly from the browser.