EigenCentrality & PageRank

14th January, 2020

If you want to uncover the most influential, well-connected or important individuals in a network, you should turn to social network analysis centrality measures. These graph analysis algorithms are designed to unpick complex networks and reveal the patterns buried in the connections between nodes.

In this blog post, we’ll take a look at two centrality measures in our graph visualization toolkits: EigenCentrality and PageRank.

How do they work? When should you use them? Read on to find out.

EigenCentrality: understand network influence

EigenCentrality measures a node’s influence. It starts by measuring each node’s ‘degree’ score, which is simply a count of the number of links that node has to other nodes in the network. However, EigenCentrality goes a step further than degree centrality: it looks beyond first-degree connections to count how many links each of those connections has, and so on through the network.

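Our toolkits calculate each node’s EigenCentrality using the power iteration method. The algorithm starts from a random vector and repeatedly multiplies it by the adjacency matrix (a matrix summary of the connections between nodes), normalizing after each step, until the vector stops changing. At that point it has converged on the eigenvector that belongs to the largest eigenvalue, and each entry in that eigenvector is the corresponding node’s EigenCentrality score.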
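To make the power iteration idea concrete, here is a minimal Python sketch. It is not our toolkits’ implementation: the function name `eigen_centrality`, its parameters and the four-node example graph are all assumptions made for illustration. It also multiplies by the adjacency matrix plus the identity, a common trick that leaves the eigenvectors unchanged but stops the scores oscillating on some graphs.

```python
import math
import random


def eigen_centrality(adjacency, max_iter=1000, tol=1e-9, seed=0):
    """Power iteration on a symmetric adjacency matrix (a list of lists).

    Illustrative sketch only; names and defaults are assumptions.
    """
    n = len(adjacency)
    if n == 0:
        return []
    rng = random.Random(seed)
    # Start from a random, strictly positive vector.
    scores = [rng.random() + 0.1 for _ in range(n)]
    for _ in range(max_iter):
        # Multiply by (A + I): each node keeps its own score and adds
        # the scores of the nodes it is linked to.
        nxt = [
            scores[i] + sum(adjacency[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        # Normalize to unit length so the values do not grow without bound.
        norm = math.sqrt(sum(v * v for v in nxt))
        nxt = [v / norm for v in nxt]
        # Stop once the vector has effectively stopped changing.
        if max(abs(a - b) for a, b in zip(nxt, scores)) < tol:
            return nxt
        scores = nxt
    return scores


# Hypothetical graph: nodes 0, 1 and 2 form a triangle; node 3 hangs off node 2.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = eigen_centrality(adjacency)
print([round(s, 3) for s in scores])
```

In this toy graph, node 2 should come out on top because it is linked to every other node, while node 3 scores lowest because its only link is to node 2.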

What does EigenCentrality tell me?

A high EigenCentrality score indicates a strong influence over other nodes in the network. It is useful because it captures not just direct influence, but also influence over nodes more than one ‘hop’ away.

EigenCentrality in action

Here’s a good example of EigenCentrality revealing node influence that would otherwise be hidden. In this visualization, we’re looking at around 1.6 million emails sent between Enron employees, published by the Federal Energy Regulatory Commission:

Degree centrality (top) and EigenCentrality (bottom)

The first image shows nodes sized by degree (i.e. their number of links), which makes Bill look important because he’s sending a lot of emails to his 10-person team.

The second image sizes nodes by EigenCentrality. This view gives a more complete picture of Bill’s influence. His team is on the periphery of the wider Enron organization, with only one connection back to the wider network. That connection is Timothy Belden, who is himself relatively disconnected from the network’s powerbase:

Bill’s sub-network is clearly on the periphery of the wider Enron network

A node may have a high degree score (i.e. many connections) but a relatively low EigenCentrality score, if many of those connections are with other low-scored nodes.

Also, a node may have a high betweenness score (indicating it connects disparate parts of a network) but a low EigenCentrality score if it is distant from the centers of power in the network.

We can see that here with John Lavorato. He’s in the center of the network topologically, but lacks Tana Jones’ volume of connections to high-powered nodes:

Tana has a high EigenCentrality score as she’s closer to the email network’s inner cluster of tightly-connected nodes than John
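The gap between degree and EigenCentrality described above can be reproduced on a toy graph. The sketch below uses made-up data, not the Enron emails: a hypothetical ‘Hub’ node has six links, five of them to team members with no other connections, while ‘B’ has only three links but sits inside a tightly connected core. The scoring loop is a simplified power iteration, not our toolkits’ code.

```python
import math

# Hypothetical toy graph (not the Enron data): a tightly connected core
# (A, B, C, D) and a peripheral "Hub" with a five-person team that joins
# the core through a single link to A.
edges = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
    ("Hub", "A"),
    ("Hub", "t1"), ("Hub", "t2"), ("Hub", "t3"), ("Hub", "t4"), ("Hub", "t5"),
]

# Build an undirected neighbor map from the edge list.
neighbors = {}
for u, v in edges:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)

# Degree centrality: the number of links each node has.
degree = {node: len(links) for node, links in neighbors.items()}

# Simplified power iteration: each node repeatedly takes its own score
# plus its neighbors' scores, then all scores are rescaled to unit length.
scores = {node: 1.0 for node in neighbors}
for _ in range(200):
    nxt = {n: scores[n] + sum(scores[m] for m in neighbors[n]) for n in neighbors}
    norm = math.sqrt(sum(v * v for v in nxt.values()))
    scores = {n: v / norm for n, v in nxt.items()}

for node in ("Hub", "B"):
    print(node, "degree:", degree[node], "eigen:", round(scores[node], 3))
```

Hub has twice as many links as B, yet B ends up with the higher score, because B’s neighbors are themselves well connected while most of Hub’s neighbors connect to nothing else.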

Want to learn more?

Our white paper has lots more detail about social network analysis, centrality measures and how to visualize social networks.

Download the White Paper

PageRank: the Google algorithm

Invented by Google founders Larry Page and Sergey Brin, PageRank is a variant of EigenCentrality designed for ranking web content, using hyperlinks between pages as a measure of importance. It can be used for any kind of network, though.

PageRank’s main difference from EigenCentrality is that it accounts for link direction. Each node in a network is assigned a score based on its number of incoming links (its ‘indegree’). Each of those links is weighted by the relative score of the node it comes from.

The result is that nodes with many incoming links are influential, and nodes to which they are connected share some of that influence.
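Here is a minimal Python sketch of that idea, under some stated assumptions. The function name `pagerank`, the node names and the example links are hypothetical. The damping factor of 0.85 is the value commonly used with PageRank rather than anything this post specifies, and sharing the score of nodes with no outgoing links evenly across the network is one standard convention. It is not a description of how our toolkits implement PageRank.

```python
def pagerank(links, damping=0.85, max_iter=100, tol=1e-10):
    """Iterative PageRank; `links` maps each node to the nodes it links to.

    Illustrative sketch only; names and defaults are assumptions.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    # Every node starts with an equal share of the total score.
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Score held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        nxt = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        # Each node passes its score along its outgoing links in equal shares.
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    nxt[target] += share
        # Stop once the scores have effectively stopped changing.
        if sum(abs(nxt[node] - rank[node]) for node in nodes) < tol:
            return nxt
        rank = nxt
    return rank


# Hypothetical directed network: four people email "boss", and "boss" emails
# only "quiet". Two other people email "peer".
links = {
    "a": ["boss"], "b": ["boss"], "c": ["boss"], "d": ["boss"],
    "boss": ["quiet"],
    "e": ["peer"], "f": ["peer"],
}
rank = pagerank(links)
for node in ("quiet", "boss", "peer"):
    print(node, round(rank[node], 3))
```

In this example, ‘quiet’ has a single incoming link and ‘peer’ has two, but ‘quiet’ ranks higher because its one link comes from a node that is itself highly ranked.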

What does PageRank tell me?

Like EigenCentrality, PageRank can help uncover influential or important nodes whose reach extends beyond just their direct connections. It’s especially useful in scenarios where link direction is important:

  • Understanding citations (e.g. patent citations, academic citations)
  • Visualizing IT network activity
  • Modeling the impact of SEO and link building activity

PageRank centrality in action

Let’s take a look at PageRank in action with the Enron corpus. We’ll follow one employee: Michael Grigsby. With no centrality measures applied, he looks pretty insignificant.

With no centrality measures applied, Michael (highlighted in blue) gets lost in the network

Let’s see how he appears with EigenCentrality applied.

The same network with EigenCentrality applied. Michael is the blue node to the right of center.

Michael’s low-volume links to other nodes mean he still looks relatively insignificant. Using PageRank, our view is transformed.

Applying PageRank highlights Michael – again, highlighted blue

Despite his limited connections, Michael balloons to one of the largest nodes in the network when PageRank is applied. He is one of the few nodes in the network receiving incoming links from highly influential nodes. This has pushed his PageRank score up significantly.

A quick Google search confirms that Michael was VP of Natural Gas Trading. He is an important node in the network that we may not have identified with the other centrality measures.

Find the right centrality measure for the job

Understanding network dynamics and influence can be a game of trial and error. Different measures are better suited to certain scenarios or datasets.

Our toolkits offer a range of social network centrality measures, each designed to uncover different kinds of influence. Download our white paper to learn more.


This post was originally published some time ago. It’s still popular, so we’ve updated it with fresh content to keep it useful and relevant.
