Choosing colors for your data visualization

1st November, 2016

Colors can make or break your visualization.

A carefully selected color palette helps you to harness the pre-attentive processing powers of the human brain, and makes insight clearer and easier to find. A badly chosen color palette obscures the information your users need to understand, and makes your graph visualization less effective and harder to use.

This blog post explores basic color theory and explains how it can help you design visualizations that look good and make data more compelling.

Which color palette do you prefer? The one on the left breaks several basic rules.

About color theory

Color is a highly subjective topic. Reactions to individual colors will vary between people and cultures. Color theory, on the other hand, is an advanced and evidence-based science that can teach us a lot.

For this blog post, we’ll focus on one color theory concept: the HSL model.

HSL breaks color down into three separate channels: hue, saturation and luminance. We’ll use this example to explain further:

HSL model taken from Affinity Designer.
  • Hue – is what most people think of as color – red, blue, yellow, green, purple, etc. Each color is plotted on a scale from 0° to 359° to form a color wheel.
  • Saturation – is another word for a color’s intensity. The scale measures how different the color looks from neutral gray, which has 0% saturation. Colors with high saturation look brighter and more vivid. In the example above, you can see how saturation increases towards the bottom right corner of the triangle.
  • Luminance – describes the spectrum of a hue from dark to light, based on the amount of black added. In our example above, luminance increases towards the bottom left of the triangle.
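
To see the three channels for a color you already have, you can convert its hex code to HSL. The sketch below is a minimal illustration using the standard RGB-to-HSL arithmetic. The function name `hexToHsl` is our own, and it assumes a six-digit hex string such as `#3388cc`.

```javascript
// Convert a six-digit hex color (e.g. "#3388cc") to its HSL channels.
// Returns hue in degrees (0-359), saturation and luminance as percentages.
// hexToHsl is a hypothetical helper name, not part of any library.
function hexToHsl(hex) {
  const n = parseInt(hex.slice(1), 16);
  const r = ((n >> 16) & 255) / 255;
  const g = ((n >> 8) & 255) / 255;
  const b = (n & 255) / 255;

  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const delta = max - min;
  const l = (max + min) / 2;

  let h = 0;
  let s = 0;
  if (delta !== 0) {
    // Saturation: distance from neutral gray, scaled by luminance.
    s = delta / (1 - Math.abs(2 * l - 1));
    // Hue: position on the color wheel, in 60-degree sectors.
    if (max === r) h = ((g - b) / delta) % 6;
    else if (max === g) h = (b - r) / delta + 2;
    else h = (r - g) / delta + 4;
    h *= 60;
    if (h < 0) h += 360;
  }

  return {
    h: Math.round(h) % 360,
    s: Math.round(s * 100),
    l: Math.round(l * 100),
  };
}
```

For example, `hexToHsl("#ff0000")` returns `{ h: 0, s: 100, l: 50 }`: pure red sits at 0° on the wheel, fully saturated, at mid luminance.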

With these three measurable channels, we can start to generate rules for selecting color palettes.

Let’s walk through a step-by-step process for enhancing your visualization with color.

How to choose colors for your visualization

Step 1: Decide what the colors will represent

This may be obvious, but your first step is to decide which aspect of your data you want to represent with color.

In a network of email accounts, for example, each node could have multiple attributes:

  • Name
  • Email address
  • Number of emails sent
  • Number of emails received
  • Centrality score (measures how well connected the account holder is)

Realistically, only one of these attributes can be tied to color. It is up to you to choose one that your users can understand quickly and that lends itself to a color scale.

You can represent individual datasets in many different ways, and color is just one of the tools available to you. When you’re designing your visual styles, think about color alongside other options like labeling, glyphs, node sizing, edge weighting, etc.

Step 2: Understand your data scale

Once you’ve chosen an attribute to apply a color palette to, you need to decide which scale to use. Different scales require different types of palette.

The superb ColorBrewer tool defines three types of scale:

  • Sequential – when data values go from low to high, e.g. centrality scores that range from 0 to 1.
  • Divergent – when the data has meaningful values at both ends of the scale, with an important pivot in the middle. For example, the net flow of emails shows whether an account sends more emails than it receives, or vice versa.
  • Qualitative – when the data has no inherent order or magnitude. In our email example, the Name attribute is qualitative data because it doesn’t have a numerical value.
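
One way to make the first two scale types concrete is to turn each value into a position on its scale before any color is involved. The sketch below is only an illustration: the function names and the sample account are made up, and it assumes you already know the range of your data.

```javascript
// Position on a sequential scale: 0 at the minimum, 1 at the maximum.
function sequentialPosition(value, min, max) {
  if (max === min) return 0; // avoid dividing by zero on a flat range
  return (value - min) / (max - min);
}

// Position on a divergent scale: -1 and 1 at the two extremes, 0 at the pivot.
// maxAbs is the largest absolute value expected in the data.
function divergentPosition(value, maxAbs) {
  if (maxAbs === 0) return 0;
  return Math.max(-1, Math.min(1, value / maxAbs));
}

// Made-up email account, for illustration only.
const account = { sent: 120, received: 45 };

// Net flow is positive when the account sends more than it receives.
const netFlow = account.sent - account.received;
```

Here `sequentialPosition(0.25, 0, 1)` stays at `0.25`, while `divergentPosition(netFlow, 150)` gives `0.5`: halfway between the pivot and the "sends more" end of the scale.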

Step 3: Decide how many hues you need

Based on the scale you chose in step 2, you can decide how many hues you need in the palette:

  • Sequential data usually requires one hue, using luminance or saturation to define scale. It can be hard to recognize subtle changes in both saturation and luminance, so if you need to represent a scale containing five or more data points, you might want to use two hues.
  • Divergent data requires two hues, decreasing in saturation or luminance towards a neutral (usually white, black or gray).
  • Qualitative data requires as many hues as values, but bear in mind the limitations of the human brain when using multiple colors. Use more than seven or eight colors and the brain struggles to recall what each one represents. Use more than 12 and the brain also struggles to differentiate between them.
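
The three rules above can be sketched as small palette generators working directly on HSL channels. This is a rough illustration rather than a recommended palette: the function names are our own, and the saturation and luminance numbers (60%, 90% down to 30%, and so on) are arbitrary assumptions you would tune by eye.

```javascript
// Sequential: one hue, luminance falling from light (low) to dark (high).
function sequentialPalette(hue, steps) {
  const palette = [];
  for (let i = 0; i < steps; i++) {
    const t = steps === 1 ? 0 : i / (steps - 1);
    palette.push({ h: hue, s: 60, l: Math.round(90 - t * 60) });
  }
  return palette;
}

// Divergent: two hues, each fading towards a near-white neutral pivot.
function divergentPalette(lowHue, highHue, stepsPerSide) {
  const palette = [];
  for (let i = stepsPerSide; i >= 1; i--) {
    const l = Math.round(95 - (i / stepsPerSide) * 55);
    palette.push({ h: lowHue, s: 60, l });
  }
  palette.push({ h: 0, s: 0, l: 95 }); // neutral pivot (0% saturation)
  for (let i = 1; i <= stepsPerSide; i++) {
    const l = Math.round(95 - (i / stepsPerSide) * 55);
    palette.push({ h: highHue, s: 60, l });
  }
  return palette;
}

// Qualitative: one hue per value, spaced evenly around the color wheel.
function qualitativePalette(count) {
  const palette = [];
  for (let i = 0; i < count; i++) {
    palette.push({ h: Math.round((360 / count) * i) % 360, s: 60, l: 55 });
  }
  return palette;
}
```

For instance, `sequentialPalette(210, 5)` yields five blues with luminance 90, 75, 60, 45 and 30, and `qualitativePalette(4)` yields hues 0°, 90°, 180° and 270°. The limits mentioned above still apply: evenly spaced hues become hard to tell apart well before the wheel runs out.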

Step 4: Look for obvious options

Before getting too creative, take a look at your data and see whether there’s an obvious set of colors.

Your application or corporate style guide might be a good starting point. If you don’t have one of those, see if there are any color sets your users are likely to understand without a legend.

Take, for example, this visualization of partisanship in the US House of Representatives. Both the Democratic and Republican parties have established colors – blue and red. These are the hues users are most likely to understand, and they are easily distinguishable.

Voting patterns in the House of Representatives
Taken from http://www.mamartino.com/projects/rise_of_partisanship/

Step 5: Create your palette

Now you know how many hues are required, you can do the difficult bit: create a palette.

In most cases, your best option is to use one of the many excellent web resources. Again, ColorBrewer is one of the finest, allowing you to pick schemes for sequential, diverging and qualitative data. Or if you have a starting point in mind, Adobe Color creates palettes from a single color.

There are several groups of colors that work well together. You can identify them by their relative positions on the color wheel:

  • Monochromatic – shades of a single hue, ideal for sequential data.
  • Analogous colors – colors that sit beside each other on the color wheel. These provide a more varied alternative for sequential data visualization.
  • Complementary colors – from opposite sides of the color wheel. When paired with a neutral (e.g. white or gray) these palettes are perfect for diverging data.
  • Triadic colors – 3 colors equally spaced around the wheel, which are a good starting point for a qualitative palette.
A monochromatic palette of ‘KeyLines blue’ created using Adobe Color.
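
Because these groups are defined by position on the color wheel, you can find them by rotating a starting hue. The sketch below is a minimal illustration. The function names are our own, and the 30° spread for analogous colors is a common convention that we assume here, not a fixed rule.

```javascript
// Rotate a hue around the 360-degree color wheel, keeping it in 0-359.
function rotateHue(hue, degrees) {
  return (((hue + degrees) % 360) + 360) % 360;
}

// Complementary: the hue on the opposite side of the wheel.
function complementary(hue) {
  return [hue, rotateHue(hue, 180)];
}

// Analogous: neighbors on either side (spread is an assumed default).
function analogous(hue, spread = 30) {
  return [rotateHue(hue, -spread), hue, rotateHue(hue, spread)];
}

// Triadic: three hues equally spaced around the wheel.
function triadic(hue) {
  return [hue, rotateHue(hue, 120), rotateHue(hue, 240)];
}
```

Starting from a blue at 210°, `complementary(210)` gives `[210, 30]` (blue and orange), and `triadic(300)` gives `[300, 60, 180]`. A monochromatic palette needs no rotation at all: keep the hue fixed and vary luminance or saturation instead.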

If you decide not to use one of these tools, you should at least follow this basic advice:

  • Only use complementary colors for 2-hue palettes. Users may find palettes with multiple complementary colors confusing.
  • Avoid using highly saturated colors. They will overwhelm the chart and make it difficult to find other visual elements, e.g. glyphs or halos. Stick with softer palettes.
  • Avoid colors with low saturation and high luminance. Unless your chart has a dark background, they won’t be easily visible.
  • Remember that roughly 1 in 12 men and 1 in 200 women are red-green color blind. Choose colors with different saturation values to make sure users can differentiate between colors regardless of their hue. There are several helpful online tools to test the accessibility of your visualizations.
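
As a quick programmatic check alongside those online tools, you can test whether two palette colors differ in something other than hue. The sketch below swaps in a related, hue-independent measure: the relative luminance and contrast ratio formulas from the WCAG accessibility guidelines. It is not a color blindness simulation, the function names are our own, and the threshold you pass as `minRatio` is an assumption you would need to choose for your own charts.

```javascript
// Relative luminance of a six-digit hex color, per the WCAG 2 definition.
function relativeLuminance(hex) {
  const n = parseInt(hex.slice(1), 16);
  const channels = [(n >> 16) & 255, (n >> 8) & 255, n & 255].map((v) => {
    const c = v / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * channels[0] + 0.7152 * channels[1] + 0.0722 * channels[2];
}

// WCAG contrast ratio: 1 for identical colors, 21 for black against white.
function contrastRatio(hexA, hexB) {
  const a = relativeLuminance(hexA);
  const b = relativeLuminance(hexB);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// List every pair of palette colors whose contrast falls below minRatio.
function lowContrastPairs(palette, minRatio) {
  const pairs = [];
  for (let i = 0; i < palette.length; i++) {
    for (let j = i + 1; j < palette.length; j++) {
      if (contrastRatio(palette[i], palette[j]) < minRatio) {
        pairs.push([palette[i], palette[j]]);
      }
    }
  }
  return pairs;
}
```

Any pair this returns relies mostly on hue to be told apart, so it is a good candidate for a second look in a proper color blindness simulator.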

Step 6: Convert to RGB

By now you should have a beautiful palette of colors for your visualization. Nice work!

There is one final task you need to do: convert your HSL values to RGB.

Colors in KeyLines can be specified in several formats, including the 17 CSS standard named colors, hexadecimal (or shorthand hexadecimal), and RGB. You can do this conversion using an online tool, or programmatically using a simple JavaScript formula.
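
If you prefer the programmatic route, the conversion can be sketched in a few lines of JavaScript. This is a minimal version of the standard HSL-to-RGB arithmetic, with hue in degrees and saturation and luminance as percentages. The function names `hslToRgb` and `hslToHex` are our own, not part of KeyLines.

```javascript
// Convert HSL (h in degrees, s and l as percentages) to [r, g, b] in 0-255.
function hslToRgb(h, s, l) {
  const sat = s / 100;
  const lum = l / 100;
  const chroma = (1 - Math.abs(2 * lum - 1)) * sat;
  const hPrime = (((h % 360) + 360) % 360) / 60; // wheel sector, 0 to <6
  const x = chroma * (1 - Math.abs((hPrime % 2) - 1));

  // One [r, g, b] ordering per 60-degree sector of the color wheel.
  const sectors = [
    [chroma, x, 0],
    [x, chroma, 0],
    [0, chroma, x],
    [0, x, chroma],
    [x, 0, chroma],
    [chroma, 0, x],
  ];
  const m = lum - chroma / 2; // shift all channels to match the luminance
  return sectors[Math.floor(hPrime)].map((c) =>
    Math.min(255, Math.max(0, Math.round((c + m) * 255)))
  );
}

// Format the result as a six-digit hexadecimal color string.
function hslToHex(h, s, l) {
  return (
    "#" +
    hslToRgb(h, s, l)
      .map((c) => c.toString(16).padStart(2, "0"))
      .join("")
  );
}
```

For example, `hslToHex(0, 100, 50)` returns `"#ff0000"` and `hslToRgb(240, 100, 50)` returns `[0, 0, 255]`.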

Find out more

We’ve barely scratched the surface of color in this post, but it’s enough to get you started.
