Visualizing very large networks: an update

6th December, 2016

One of the most popular blog posts we’ve published in the last five years is Corey Lanum’s How to visualize very large networks. It was written to help answer our most frequently asked question: “We have 5 million nodes and links, how can I load them all into KeyLines?”

5000 nodes and 5000 links: loading huge networks will overload your users and not help them find insights.

In reality, visualizing that much data on one screen is rarely useful or successful, so Corey described some helpful strategies for managing it.

The main strategies were:

  • Group and merge nodes and links to reduce the density of the network you’re trying to visualize.
  • Use filters or centrality measures to visualize data one section at a time.
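To make the first strategy concrete, here is a minimal sketch of grouping nodes by a shared property and merging the links between groups. It uses plain TypeScript data structures rather than the KeyLines API; the `GraphNode`, `GraphLink` and `mergeByGroup` names, and the `group` property, are assumptions made for illustration.

```typescript
// Hypothetical data shapes for illustration; these are not KeyLines types.
interface GraphNode { id: string; group: string; }
interface GraphLink { from: string; to: string; }
interface GroupNode { id: string; size: number; }
interface MergedLink { from: string; to: string; count: number; }

// Collapse every node into its group, and merge all links that run
// between the same pair of groups into one link carrying a count.
function mergeByGroup(
  nodes: GraphNode[],
  links: GraphLink[]
): { groups: GroupNode[]; links: MergedLink[] } {
  const groupOf = new Map<string, string>();
  const sizes = new Map<string, number>();
  for (const node of nodes) {
    groupOf.set(node.id, node.group);
    sizes.set(node.group, (sizes.get(node.group) || 0) + 1);
  }

  const merged = new Map<string, MergedLink>();
  for (const link of links) {
    const a = groupOf.get(link.from);
    const b = groupOf.get(link.to);
    // Skip links with unknown ends and links inside a single group.
    if (a === undefined || b === undefined || a === b) continue;
    // Treat links as undirected: order the pair so A-B and B-A merge.
    const from = a < b ? a : b;
    const to = a < b ? b : a;
    const key = JSON.stringify([from, to]);
    const existing = merged.get(key);
    if (existing) existing.count += 1;
    else merged.set(key, { from, to, count: 1 });
  }

  const groups: GroupNode[] = [];
  sizes.forEach((size, id) => groups.push({ id, size }));
  return { groups, links: Array.from(merged.values()) };
}
```

The group sizes and link counts can then drive visual cues such as node size or link width, so the reduced chart still hints at the volume of data behind it.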

Since the post was published in 2014, we’ve released 10 versions of KeyLines and added lots of functionality. We now support four new techniques to help you visualize large networks.

1. Filter by time or geography

Filters are ideal for simplifying networks to find hidden structures and outliers. They provide a simple and intuitive way for users to choose which parts of the network they want to see.

KeyLines v2.0 and v2.7.1 introduced two new ways to filter: by time and by geography.

If your data has a timestamp, the time bar lets users select subsets of data by time frame, or pan through a time range to identify interesting activity in the network:

Quickly navigating into a spike in network activity

The ‘Tweak’ layout, which uses the force-directed model, incrementally adapts itself as links are formed and broken, making it easy to see how the network evolves.
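The logic behind a time filter is simple enough to sketch. The example below is not the KeyLines time bar API; it is a plain TypeScript illustration that assumes each link carries a `time` value in epoch milliseconds, and the `TimedLink` and `filterByTimeRange` names are hypothetical.

```typescript
// Hypothetical link shape: `time` is assumed to be epoch milliseconds.
interface TimedLink { id: string; from: string; to: string; time: number; }

// Keep only the links whose timestamp falls inside [start, end],
// plus the ids of the nodes those links touch.
function filterByTimeRange(
  links: TimedLink[],
  start: number,
  end: number
): { visibleLinks: TimedLink[]; visibleNodeIds: Set<string> } {
  const visibleLinks = links.filter(l => l.time >= start && l.time <= end);
  const visibleNodeIds = new Set<string>();
  for (const l of visibleLinks) {
    visibleNodeIds.add(l.from);
    visibleNodeIds.add(l.to);
  }
  return { visibleLinks, visibleNodeIds };
}
```

Re-running a filter like this as the user drags the selected range is what produces the effect of links forming and breaking over time.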

Almost all network data has a geographic element. KeyLines Geospatial lets users filter map-based data so they can analyze one location at a time:

Combining geospatial and time-based visualization gives users two powerful ways to filter large data sets.
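A geographic filter follows the same pattern as a time filter. As a rough sketch, and again using plain TypeScript rather than the KeyLines Geospatial API, the function below keeps the nodes inside a bounding box and the links between them. The `GeoNode`, `Bounds` and `filterByBounds` names are assumptions, and the box is assumed not to cross the antimeridian.

```typescript
// Hypothetical shapes for illustration; not KeyLines types.
interface GeoNode { id: string; lat: number; lng: number; }
interface GeoLink { from: string; to: string; }
interface Bounds { south: number; west: number; north: number; east: number; }

// Keep nodes inside the bounding box, and links with both ends visible.
// Assumes west <= east, i.e. the box does not cross the antimeridian.
function filterByBounds(
  nodes: GeoNode[],
  links: GeoLink[],
  bounds: Bounds
): { visibleNodes: GeoNode[]; visibleLinks: GeoLink[] } {
  const visibleNodes = nodes.filter(n =>
    n.lat >= bounds.south && n.lat <= bounds.north &&
    n.lng >= bounds.west && n.lng <= bounds.east
  );
  const ids = new Set(visibleNodes.map(n => n.id));
  const visibleLinks = links.filter(l => ids.has(l.from) && ids.has(l.to));
  return { visibleNodes, visibleLinks };
}
```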

2. Expand outwards

Another technique to avoid overloading your chart with too much data is to start with a small network and allow your users to work their way outwards.

The KeyLines expand feature already supported this way of working. In KeyLines v3.2 we improved it by extending incremental mode, previously available only for the Standard Layout, to the hierarchy, tweak and radial layouts:

Expanding a simple network using the hierarchy layout in incremental mode

When a user expands an item, incremental mode fixes existing network nodes in position and adds new nodes around them. It stops KeyLines from repositioning every node and rearranging the network with every chart layout.
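The idea can be illustrated with a deliberately simple sketch. This is not how KeyLines implements incremental mode; it is a plain TypeScript example in which existing nodes keep their positions and new neighbours are placed on a circle around the expanded node. The `Point` type, the `expandIncrementally` function and the default `radius` are all assumptions.

```typescript
interface Point { x: number; y: number; }

// Return a new position map in which every existing node keeps its
// position and each new neighbour is placed on a circle around the
// expanded node. A toy placement rule, not a real layout algorithm.
function expandIncrementally(
  positions: Map<string, Point>,
  expandedId: string,
  neighbourIds: string[],
  radius: number = 100
): Map<string, Point> {
  const centre = positions.get(expandedId);
  if (!centre) throw new Error("Unknown node: " + expandedId);

  // Shallow copy: existing nodes are carried over untouched.
  const result = new Map<string, Point>(positions);

  // Only neighbours that are not already on the chart need placing.
  const fresh = Array.from(new Set(neighbourIds)).filter(id => !positions.has(id));
  fresh.forEach((id, i) => {
    const angle = (2 * Math.PI * i) / fresh.length;
    result.set(id, {
      x: centre.x + radius * Math.cos(angle),
      y: centre.y + radius * Math.sin(angle)
    });
  });
  return result;
}
```

Because existing nodes never move, users keep the mental map they have already built while the chart grows.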

3. Use an efficient layout

KeyLines’ layouts are optimized to combine efficiency with effectiveness, but the best option for large networks is the Lens Layout, first available in KeyLines v2.11.

The KeyLines Lens Layout

The main advantages are:

  • load time is shorter than with more complex layout algorithms
  • nodes are positioned across the canvas, avoiding overlap, which makes patterns clearly visible.

We still recommend that users adopt the strategies we’ve suggested for exploring subsets of data, but if you do need to load a large network, the Lens layout is your best option.

4. Deploy WebGL

There are times when a user needs to load thousands of nodes at once, for example, when they’re investigating automatically generated data like SIEM alerts.

Our WebGL graphics renderer, released in KeyLines 3.0, was designed to meet this challenge.

WebGL harnesses your device’s GPU for dramatically improved drawing performance. As a result, you can visualize vast networks with full interactivity.

This example shows 4000 domestic flight routes in the US, yet there are no performance issues when panning, zooming, selecting or changing layout.

Compared to our original HTML5 Canvas renderer, the WebGL component is much faster:

HTML5 Canvas slows to around 3 frames per second with 100,000 elements. WebGL maintains a smooth 60+ frames per second.

Some other tips

Corey’s original advice still stands:

  • Filtering and Combos are excellent ways to de-clutter the chart (a function in KeyLines v3.2 made it even easier to combine these two methods)
  • Centrality measures can indicate nodes and subnetworks worth your focus
  • A progress bar is essential to reassure users working with large networks
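As an illustration of the centrality tip, here is a minimal sketch of degree centrality, the simplest such measure, which counts the links attached to each node. It is written in plain TypeScript and does not use the KeyLines graph functions; the `Edge`, `degreeCentrality` and `topByDegree` names are hypothetical.

```typescript
// Hypothetical link shape for illustration; not a KeyLines type.
interface Edge { from: string; to: string; }

// Degree centrality: the number of link ends attached to each node.
// A self-loop therefore counts twice for its node.
function degreeCentrality(edges: Edge[]): Map<string, number> {
  const degree = new Map<string, number>();
  for (const e of edges) {
    degree.set(e.from, (degree.get(e.from) || 0) + 1);
    degree.set(e.to, (degree.get(e.to) || 0) + 1);
  }
  return degree;
}

// The ids of the k best-connected nodes, highest degree first.
// Ties are broken alphabetically so the result is deterministic.
function topByDegree(edges: Edge[], k: number): string[] {
  return Array.from(degreeCentrality(edges).entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, k)
    .map(entry => entry[0]);
}
```

Loading only the top few nodes and their neighbours is one way to give users a manageable starting point in a very large network.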

But over the course of the past 10 releases, KeyLines has cemented its position as the most powerful network visualization technology, capable of visualizing the largest networks.

Put it to the test! Sign up for a free trial of the KeyLines SDK.
