Earlier in this series, we walked through an introduction to graph data and some of the visualization best practices to keep in mind when designing your own graph application. You might have noticed that we skipped over a fundamental topic: visual data modeling. That’s what we’ll cover today.
Why is visual data modeling important?
Before we get started, we should clarify the difference between your data model and your visual data model.
When building a new database schema, a database engineer will carefully create and refine a data model. The process takes into account a range of different factors: the data entities and properties, query execution, database performance, scalability, and so on.
As a visualization engineer, you’ll follow a similar process to build a new visual data model. You’ll need to design a model that’s clear, clutter-free, and allows your users to answer their questions easily.
The data model and the visual data model are rarely the same, and that’s a good thing. Your data model is designed to work well for your database. Your visual model should be designed for your users, their data, and the questions they need to answer.
Let’s look at some examples.
Example 1: Decide on your nodes and links
Let’s say we’re designing a healthcare data visualization. Our data model might include entities like doctors, patients, and appointments.
We could model this visually as:
Simple enough. But when we load our data using this model into a graph visualization, the chart looks very busy:
Instead of modeling our appointments as nodes, they could simply be links between patients and doctors, removing a third of the nodes from our chart. We can also size those links based on the volume of appointments between a doctor and patient:
This model makes it easier for the user to answer simple questions, like ‘how many patients has a doctor seen?’ and ‘how many appointments has a patient made?’
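To make the idea concrete, here’s a minimal sketch of that transformation in plain Python, independent of any visualization toolkit. The appointment records, identifiers and function names are all hypothetical, invented for illustration. Each appointment stops being a node and becomes part of a count on a patient-doctor link, which is the value you could then map to link width.

```python
from collections import Counter

# Hypothetical appointment records: (appointment_id, patient, doctor).
appointments = [
    ("a1", "patient_1", "dr_adams"),
    ("a2", "patient_1", "dr_adams"),
    ("a3", "patient_2", "dr_adams"),
    ("a4", "patient_2", "dr_baker"),
    ("a5", "patient_3", "dr_baker"),
]


def appointments_to_links(records):
    """Collapse appointment nodes into one weighted link per patient-doctor pair."""
    return Counter((patient, doctor) for _, patient, doctor in records)


def patients_seen(links, doctor):
    """Count the distinct patients linked to a doctor."""
    return len({p for (p, d) in links if d == doctor})


def appointments_made(links, patient):
    """Sum the link weights attached to a patient."""
    return sum(count for (p, _), count in links.items() if p == patient)


links = appointments_to_links(appointments)
print(links[("patient_1", "dr_adams")])      # weight of a single link
print(patients_seen(links, "dr_adams"))      # 'how many patients has a doctor seen?'
print(appointments_made(links, "patient_2"))  # 'how many appointments has a patient made?'
```

With the appointments folded into link weights, both questions become simple lookups over the links rather than traversals through appointment nodes.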
We can then use some of the styling options we covered in part 2 to bring properties of our data into the visualization. For example, here we’re using colored links to highlight appointments about which a patient has raised a complaint:
If visualizing these patient-doctor relationships is the sole purpose of your application, this model may be all you need. But you should always look for ways to enrich or simplify your model, depending on your users’ needs.
For example, we could remove the patient nodes and add links directly between doctors to show the number of shared patients. Or if we have information about doctors’ specialties, we could change the data model to show how patients are shared among fields.
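As a rough sketch of the first option, the snippet below removes the patient nodes and derives doctor-to-doctor links weighted by the number of shared patients. The input links and the function name are hypothetical, and this is just one straightforward way to compute the projection.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical patient-doctor links (one entry per pair, whatever the weight).
patient_doctor_links = [
    ("patient_1", "dr_adams"),
    ("patient_2", "dr_adams"),
    ("patient_2", "dr_baker"),
    ("patient_3", "dr_adams"),
    ("patient_3", "dr_baker"),
    ("patient_3", "dr_chen"),
]


def shared_patient_links(links):
    """Project patient-doctor links onto doctor-doctor links.

    Returns {(doctor_a, doctor_b): number_of_shared_patients}.
    """
    doctors_by_patient = defaultdict(set)
    for patient, doctor in links:
        doctors_by_patient[patient].add(doctor)

    shared = defaultdict(int)
    for doctors in doctors_by_patient.values():
        # Every pair of doctors seen by the same patient shares that patient.
        for pair in combinations(sorted(doctors), 2):
            shared[pair] += 1
    return dict(shared)


doctor_links = shared_patient_links(patient_doctor_links)
print(doctor_links)
```

Sorting each pair keeps the links undirected, so the same two doctors always map to a single link regardless of the order they appear in the data.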
Example 2: Combine and group
Let’s look at a very simple insurance claim example.
A fraud analyst is likely to be interested in the relationships between the people in this data, so we should design our visual model to highlight those links.
By collapsing intermediate nodes and links, we’re left with a simpler visual data model that shows the users the information they want to understand. We’ve used glyphs on the links to avoid losing any of the detail in our original visualization:
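Here’s a minimal sketch of that collapsing step, assuming a tiny made-up dataset. The node names, the node types (claim, vehicle) and the function name are hypothetical. Any two people connected through the same intermediate node get a direct link, and the type of the collapsed node is kept on the link so it could be displayed as a glyph.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical nodes and links for a small insurance dataset.
node_types = {
    "alice": "person",
    "bob": "person",
    "carol": "person",
    "claim_17": "claim",
    "car_9": "vehicle",
}
edges = [
    ("alice", "claim_17"),
    ("bob", "claim_17"),
    ("alice", "car_9"),
    ("bob", "car_9"),
    ("carol", "car_9"),
]


def collapse_to_people(node_types, edges):
    """Replace person-intermediate-person paths with direct person-person links.

    Returns {(person_a, person_b): sorted list of collapsed node types}.
    """
    people_by_intermediate = defaultdict(set)
    for a, b in edges:
        # Edges are undirected, so check both orientations.
        for person, other in ((a, b), (b, a)):
            if node_types[person] == "person" and node_types[other] != "person":
                people_by_intermediate[other].add(person)

    glyphs = defaultdict(set)
    for intermediate, people in people_by_intermediate.items():
        for pair in combinations(sorted(people), 2):
            glyphs[pair].add(node_types[intermediate])
    return {pair: sorted(types) for pair, types in glyphs.items()}


person_links = collapse_to_people(node_types, edges)
print(person_links)
```

The original nodes and links stay in the data. Only the visual model changes, so the detail can be restored whenever the analyst needs it.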
We can simplify this visual model further with combos – visual groupings of nodes and links. Let’s group individuals registered at the same address:
Now our fraud analyst has a simple overview of the data, clearly showing two groups of people with potentially suspicious connections.
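The grouping itself is simple to compute. The sketch below uses hypothetical names and addresses, and it makes one assumption that may not suit every application: people who are alone at an address are left ungrouped.

```python
from collections import defaultdict

# Hypothetical registered addresses for each person.
registered_address = {
    "alice": "12 High St",
    "bob": "12 High St",
    "carol": "4 Mill Lane",
    "dave": "4 Mill Lane",
    "erin": "9 Park Rd",
}


def group_by_address(addresses):
    """Return {address: members} for addresses shared by two or more people."""
    members_by_address = defaultdict(list)
    for person, address in addresses.items():
        members_by_address[address].append(person)
    return {
        address: sorted(members)
        for address, members in members_by_address.items()
        if len(members) > 1  # assumption: a single person is not worth a group
    }


combos = group_by_address(registered_address)
print(combos)
```

Each key in the result would become one combo, with its members drawn inside it.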
Example 3: Add your properties
In this final example, we’ll see how the careful use of visual styling lets you show properties of your data in a way that adds context without overwhelming the user.
Here’s a visualization of a group of individuals, and how they interact with one another on social media.
This is just a snapshot of everything in our database. Whilst it might be interesting, it’s cluttered and hard to understand all in one view. Let’s see how it can be improved.
The accounts, tweets and posts are useful but they don’t all need to be on the screen all the time. Let’s use combos again to combine individuals, their accounts, posts and tweets into a single node:
Doing this, we remove the extra ‘hops’ that connect the people in our network, without losing the data underneath.
We can use some other techniques to represent the data stored in collapsed links, including:
- Glyphs to show the platforms used
- Link sizing to show the volume of activity on each platform
- Donuts to show the relative use of each platform
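Behind each of these techniques is a small summary of the collapsed data. The sketch below shows one way to derive those values for a single person. The activity records, platform names and function name are hypothetical, and how you pass the results to your visualization library will depend on the toolkit you use.

```python
from collections import Counter

# Hypothetical activity records: (person, platform), one entry per post.
activity = [
    ("alice", "twitter"),
    ("alice", "twitter"),
    ("alice", "twitter"),
    ("alice", "facebook"),
    ("bob", "facebook"),
    ("bob", "facebook"),
]


def summarize_activity(records, person):
    """Summarize one person's collapsed activity for visual styling."""
    counts = Counter(platform for who, platform in records if who == person)
    total = sum(counts.values())
    return {
        # Glyphs: which platforms the person uses.
        "glyphs": sorted(counts),
        # Sizing: volume of activity on each platform.
        "volume": dict(counts),
        # Donut: each platform's share of the person's total activity.
        "donut": {p: n / total for p, n in counts.items()} if total else {},
    }


alice = summarize_activity(activity, "alice")
print(alice)
```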
With these small changes, we’ve transformed our data model into a clear visual model that lets users uncover insight more easily.
We’re here to help
Visual data modeling can be tricky, but it’s worth taking the time to get it right. The best way to get started is to try out your ideas in one of our graph visualization toolkits. Just request a trial to get started.
I’m a part of the Professional Services team, helping our customers get the best out of our graph visualization products. If you’ve got some specific issues you’d like help with, get in touch. I’d be happy to hear from you.