Graph visualization basics pt3: building the visual data model

Earlier in this series, we walked through an introduction to graph data, and some of the visualization best practices you should keep in mind when designing your own graph application. You might have noticed that we skipped over a fundamental topic: visual data modeling. That’s what we’ll cover today.

Why is visual data modeling important?

Before we get started, we should clarify the difference between your graph data model and your visual data model.

When building a new database schema, a database engineer will carefully create and refine a data model. The process takes into account a range of different factors: the data entities and properties, query execution, database performance, scalability, and so on.

As a visualization engineer, you’ll follow a similar process to build a new visual data model. You’ll need to design a model that’s clear, clutter-free, and allows your users to answer their questions easily.

The data model and the visual data model are rarely the same, and that’s a good thing. Your data model is designed to work well for your database. Your visual model should be designed for your users, their data, and the questions they need to answer.


Let’s look at some examples.

Example 1: decide on your nodes and links

Let’s say we’re designing a healthcare data visualization. Our data model might include entities like doctors, patients, and appointments.

We could model this visually as:

visual data modeling: a simple visual model for some data

Simple enough. But when we load our data using this model into a graph visualization, the chart looks very busy:

What our data looks like, using the visual model above

Instead of modeling our appointments as nodes, they could simply be links between patients and doctors – removing a third of the nodes from our chart. We can also size those links, based on the volume of appointments between a doctor and patient.

These simple visual model decisions remove clutter from the chart, and will probably be more intuitive for the user.

This model makes it easier for the user to answer simple questions, like ‘how many patients has a doctor seen?’ and ‘how many appointments has a patient made?’
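To make that concrete, here’s a minimal Python sketch of the idea, not a definitive implementation. It assumes appointments arrive as simple (appointment, patient, doctor) records. The sample data and the function names (collapse_appointments, patients_seen, appointments_made) are made up for illustration, and your visualization toolkit will have its own link format.

```python
from collections import Counter

# Hypothetical appointment records: (appointment_id, patient, doctor)
appointments = [
    ("a1", "Patient A", "Dr Smith"),
    ("a2", "Patient A", "Dr Smith"),
    ("a3", "Patient B", "Dr Smith"),
    ("a4", "Patient B", "Dr Jones"),
]


def collapse_appointments(records):
    """Replace appointment nodes with one weighted link per patient-doctor pair."""
    counts = Counter((patient, doctor) for _, patient, doctor in records)
    return [
        {"source": patient, "target": doctor, "weight": n}
        for (patient, doctor), n in counts.items()
    ]


def patients_seen(links, doctor):
    """How many distinct patients has this doctor seen?"""
    return len({link["source"] for link in links if link["target"] == doctor})


def appointments_made(links, patient):
    """How many appointments has this patient made?"""
    return sum(link["weight"] for link in links if link["source"] == patient)


links = collapse_appointments(appointments)
```

The weight on each link is what you’d map to link width in the chart, and the two helper functions show how the user’s simple questions become simple lookups once appointments are links rather than nodes.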

We can then use some of the styling options we covered in part 2, to add properties of our data into the visualization. For example, here we’re highlighting appointments about which a patient has raised a complaint using colored links:

Red links indicate a complaint made by a patient about an appointment.
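As a rough sketch of how that styling rule might look in code, assume each patient-doctor link carries a weight (number of appointments) and an optional complaints count. Both field names, the color values and the style_link function are hypothetical; they’re not taken from any particular toolkit’s API.

```python
def style_link(link):
    """Return illustrative style attributes for one patient-doctor link."""
    has_complaint = link.get("complaints", 0) > 0
    return {
        # Red highlights links where the patient raised a complaint
        "color": "red" if has_complaint else "grey",
        # Wider links mean more appointments between this pair
        "width": 1 + link["weight"],
    }


# Hypothetical links between patients and doctors
links = [
    {"source": "Patient A", "target": "Dr Smith", "weight": 2, "complaints": 1},
    {"source": "Patient B", "target": "Dr Smith", "weight": 1},
]

styled = [style_link(link) for link in links]
```

Keeping the rule in one small function makes it easy to swap the property you’re highlighting without touching the model itself.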

If visualizing these patient-doctor relationships is the sole purpose of your application, then this model does the job. But you should always seek ways to enrich or simplify your model, depending on your users’ needs.

For example, we could remove the patient nodes and add links directly between doctors to show the number of shared patients. Or if we have information about doctors’ specialties, we could change the visual model to show how patients are shared among fields.
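Here’s one way the doctor-to-doctor idea could be sketched in Python. It assumes we already have a list of (patient, doctor) pairs; the sample names and the shared_patient_links function are invented for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical patient-doctor relationships
patient_doctor_links = [
    ("Patient A", "Dr Smith"),
    ("Patient A", "Dr Jones"),
    ("Patient B", "Dr Smith"),
    ("Patient B", "Dr Jones"),
    ("Patient C", "Dr Jones"),
    ("Patient C", "Dr Patel"),
]


def shared_patient_links(links):
    """Drop patient nodes and link doctors by the number of patients they share."""
    doctors_by_patient = defaultdict(set)
    for patient, doctor in links:
        doctors_by_patient[patient].add(doctor)

    shared = defaultdict(int)
    for doctors in doctors_by_patient.values():
        # Every pair of doctors seen by the same patient gains one shared patient
        for pair in combinations(sorted(doctors), 2):
            shared[pair] += 1
    return dict(shared)


doctor_links = shared_patient_links(patient_doctor_links)
```

Swapping each doctor for their specialty before counting would give the ‘shared among fields’ view instead.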

Example 2: Combine and group

Let’s look at the very simple insurance claim example below.

Dan took his car to a garage owned by Jim. Jim lives with Fiona, who happens to work at TV World, with Dan.

A fraud analyst is likely to be interested in the relationships between the people in this data, so we should design our visual model to highlight those links.

By collapsing intermediate nodes and links, we’re left with a simpler visual data model that shows the users the information they want to understand. We’ve used glyphs on the links to avoid losing any of the detail in our original visualization:

Collapsing links, replacing them with glyphs, simplifies the chart
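A minimal sketch of that collapsing step might look like this. It uses a simplified version of the example (three people connected through a garage, an address and an employer); the node types, the glyph field and the collapse_to_people function are assumptions made for illustration.

```python
from itertools import combinations

# Simplified, hypothetical version of the insurance claim data
nodes = {
    "Dan": "person", "Jim": "person", "Fiona": "person",
    "Garage": "garage", "Address": "address", "TV World": "employer",
}
links = [
    ("Dan", "Garage"), ("Jim", "Garage"),
    ("Jim", "Address"), ("Fiona", "Address"),
    ("Fiona", "TV World"), ("Dan", "TV World"),
]


def collapse_to_people(nodes, links):
    """Replace each intermediate node with direct person-to-person links.

    The type of the removed node is kept as a glyph on the new link,
    so no detail is lost from the original model.
    """
    people_by_hub = {}
    for a, b in links:
        for person, hub in ((a, b), (b, a)):
            if nodes[person] == "person" and nodes[hub] != "person":
                people_by_hub.setdefault(hub, set()).add(person)

    collapsed = []
    for hub, people in people_by_hub.items():
        for p, q in combinations(sorted(people), 2):
            collapsed.append({"source": p, "target": q, "glyph": nodes[hub]})
    return collapsed


people_links = collapse_to_people(nodes, links)
```

Six nodes and six links become three nodes and three links, each labeled with the kind of connection it replaced.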

We can simplify this visual model further with combos – visual groupings of nodes and links. Let’s group individuals registered at the same address:

Combos simplify our chart further.

Now our fraud analyst has a simple overview of the data, clearly showing two groups of people with potentially suspicious connections.
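Grouping like this boils down to bucketing nodes by a shared property. Here’s a small sketch, assuming each person record has an address field; the addresses and the group_into_combos function are hypothetical, and real toolkits provide their own combo APIs.

```python
from collections import defaultdict

# Hypothetical person records; the addresses are made up for illustration
people = [
    {"id": "Jim", "address": "1 High Street"},
    {"id": "Fiona", "address": "1 High Street"},
    {"id": "Dan", "address": "22 Mill Road"},
]


def group_into_combos(items, key):
    """Bucket node ids by a shared property, one combo per distinct value."""
    combos = defaultdict(list)
    for item in items:
        combos[item[key]].append(item["id"])
    return dict(combos)


combos = group_into_combos(people, "address")
```

The same function works for any grouping property, so you can experiment with different combos (employer, region, risk score) without changing the underlying data.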

Example 3: Add your properties

In this final example, we’ll see how the careful use of visual styling allows you to add properties of your data in a way that adds context without overwhelming the user.

Here’s a visualization of a group of individuals, and how they interact with one another on social media.

Light blue nodes and links represent retweeted or liked tweets. Dark blue represents liked or shared Facebook posts. The individuals are represented by grey nodes.

This is just a snapshot of everything in our database. Whilst it might be interesting, it’s too cluttered to understand in a single view. Let’s see how it can be improved.

The accounts, tweets and posts are useful but they don’t all need to be on the screen all the time. Let’s use combos again to combine individuals, their accounts, posts and tweets into a single node:

A basic rule of visualization is to only show as much detail as you need to uncover insight.

Doing this, we remove the extra ‘hops’ that connect the people in our network, without losing the data underneath.

The more granular information can still be accessed by expanding the combo.

We can use some other techniques to represent the data stored in collapsed links, including:

  • Glyphs to show the platforms used
  • Link sizing to show the volume of activity on each platform
  • Donuts to show the relative use of each platform
Using donuts to show the relative use of each platform
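The donut and link-sizing techniques above both come down to counting activity per platform. Here’s a hedged sketch of that calculation, assuming interactions are stored as (source person, target person, platform) records; the sample data and the donut_segments and link_volume functions are invented for this example.

```python
from collections import Counter

# Hypothetical interactions: (source person, target person, platform)
interactions = [
    ("Alice", "Bob", "twitter"),
    ("Alice", "Bob", "twitter"),
    ("Alice", "Carol", "twitter"),
    ("Alice", "Carol", "facebook"),
    ("Bob", "Carol", "facebook"),
]


def donut_segments(records, person):
    """Relative use of each platform by one person, as fractions of their activity."""
    counts = Counter(platform for source, _, platform in records if source == person)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {platform: n / total for platform, n in counts.items()}


def link_volume(records, source, target):
    """Activity per platform between two people, for sizing their links."""
    return dict(Counter(
        platform for s, t, platform in records if s == source and t == target
    ))


alice_donut = donut_segments(interactions, "Alice")
alice_bob = link_volume(interactions, "Alice", "Bob")
```

The fractions feed the donut segments around each person’s node, while the raw counts drive link width.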

With these small changes, we’ve transformed our data model into a clear visual model that lets users uncover insight more easily.

We’re here to help

Visual data modeling can be tricky, but it’s worth taking the time to get it right. The best way to get started is to try out your ideas in one of our graph visualization toolkits. Just request a trial to get started.

I’m a part of the Professional Services team, helping our customers get the best out of our graph visualization products. If you’ve got some specific issues you’d like help with, get in touch. I’d be happy to hear from you.


Registered in England and Wales with Company Number 07625370 | VAT Number 113 1740 61 | 6-8 Hills Road, Cambridge, CB2 1JP. All material © Cambridge Intelligence 2020.