Using graph theory to predict the FIFA World Cup 2022 winner

After our successful prediction of the 2018 FIFA World Cup winners, the sensible thing to do would have been to quit while we were ahead.

But where’s the fun in that?

The World Cup is back for 2022, and we’ve predicted the winner again, judging the quality of each team using only the shape of the network it forms with other teams and clubs.

Which team will win? Visualizing every player at the FIFA World Cup 2022

The graph model

Skip to the end for the prediction, but first, a reminder of the approach. It’s truly an amazing result of graph theory that with just a little bit of information about the connections between things, you can make guesses about the things themselves. Let’s see how it’s done.

The official World Cup squads were announced on 14 November 2022, and published on Wikipedia.

The page contains data that you might use to measure the ‘quality’ of each player, including how many goals they’ve scored and how many international appearances they’ve made. But I’ll ignore all of it, because I want to show just how much we can do with the connections alone.

Instead, I’ll build a graph of football clubs linked to countries. If a club has a player who’s representing their country at the 2022 finals, then we draw a link between those nodes. For example, if a Manchester United player is in the French squad, there would be a link between the France node and the Manchester United node:

A link between nodes labeled Manchester United and France

That’s it. It sounds hard to believe that we can make a prediction from this. We have no information about which clubs or countries are more successful than others. We don’t even have the players themselves as nodes in our graph: they are simply ‘projected’ onto the links, so a club and a country are linked if they have a player in common.
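To make the model concrete, here is a minimal Python sketch of how squad data could be collapsed into club-country links. The sample rows, `SAMPLE_SQUADS` and `build_links` are hypothetical names and data made up for illustration; this is not the real squad list or the code behind the charts in this post.

```python
from collections import Counter

# Hypothetical sample rows of (player, country, club).
# These are placeholders, not the real World Cup squad lists.
SAMPLE_SQUADS = [
    ("Player A", "France", "Manchester United"),
    ("Player B", "France", "Manchester United"),
    ("Player C", "France", "Club X"),
    ("Player D", "Brazil", "Manchester United"),
    ("Player E", "Brazil", "Club Y"),
]


def build_links(rows):
    """Count the players on each (country, club) pair.

    Players never become nodes: each one only adds 1 to the weight
    of the link between their country and their club.
    """
    return Counter((country, club) for _player, country, club in rows)


links = build_links(SAMPLE_SQUADS)
for (country, club), players in sorted(links.items()):
    print(f"{country} -- {club}: {players} player(s)")
```

The player count stored on each link is the only trace of the players left in the graph, and it is a natural candidate for a link weight or width.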

The world is full of these small connections – every time you send an email, make a phone call or even walk past a wireless router with your phone in your pocket, connections are made. They seem innocuous on their own, but when you weave them into a network, and use graph theory to analyze that network, amazing patterns emerge.

The World Cup prediction visualization

I load this graph into one of our graph visualization toolkits (KeyLines or ReGraph – either will work) and add a few color choices. For fun, I style the clubs as soccer balls using the “cut-out image” feature. I’ll keep countries as simple text nodes.

Every FIFA World Cup 2022 squad visualized

The width of each link reflects the number of players that make up the link. For example, the majority of Saudi Arabia’s players play for the same two clubs, shown with thicker lines:

Link width is just one of many customizable elements of the chart

To make the visualization more interactive, I’ve set up a rule so that when I click a node, the chart animates to a new view showing just that node and its neighbors. Top tip – to avoid ugly starbursts where one heavily connected node dominates the chart, remove the links completely when you’re showing an ‘egocentric’ view like this.
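The ‘egocentric’ view boils down to a simple filter: keep the clicked node and anything directly linked to it. Here is a rough Python sketch of that filter, independent of any charting toolkit. `ego_network` and `SAMPLE_LINKS` are hypothetical names and data for illustration; the click handling and animation would come from the visualization toolkit and are not shown.

```python
# Hypothetical sample links as (country, club) pairs, for illustration only.
SAMPLE_LINKS = [
    ("France", "Manchester United"),
    ("France", "Club X"),
    ("Brazil", "Manchester United"),
    ("Brazil", "Club Y"),
]


def ego_network(links, focus):
    """Return the focus node plus its direct neighbors, and the links among them."""
    neighbors = set()
    for a, b in links:
        if a == focus:
            neighbors.add(b)
        elif b == focus:
            neighbors.add(a)
    kept_nodes = neighbors | {focus}
    # Keep a link only when both of its ends survive the filter.
    kept_links = [(a, b) for a, b in links if a in kept_nodes and b in kept_nodes]
    return kept_nodes, kept_links


nodes, kept = ego_network(SAMPLE_LINKS, "France")
print(sorted(nodes))
print(kept)
```

Because clubs only ever link to countries in this model, every surviving link touches the focus node, which is why an unfiltered ego view tends to look like a starburst.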

How to make the prediction

You may be wondering why some nodes are larger than others. They’re sized according to the eigenvector centrality (or eigencentrality) score of the node, a measure from graph theory of how important a node is in the network. A node scores highly when it is linked to other nodes that also score highly. If you’re new to graph theory, check out Social network analysis 101: centrality measures explained.

Teams like Costa Rica sit at the edge of the network and have low eigencentrality, because most of their players play for clubs that don’t share players with other countries.

Chart showing Costa Rica at its periphery

But other teams sit right at the center of the graph – they have a higher eigencentrality score, meaning that their players play for clubs which are more international and boast more players from other countries who, in turn, are also well-connected.

Highly-connected nodes at the center of the graph visualization
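For the curious, here is a minimal sketch of how eigencentrality can be computed with power iteration in plain Python. It is not the implementation used by our toolkits: the function name, the tiny sample graph and the choice to iterate on the adjacency matrix plus the identity (a common trick to stop scores oscillating on a two-sided club-country graph) are all assumptions made for illustration. The post does not state whether links were weighted by player count, so the sample uses a weight of 1 on every link.

```python
import math

# Hypothetical sample graph: (country, club) -> link weight.
# Country B shares clubs with both other countries; Country C sits at the edge.
SAMPLE_LINKS = {
    ("Country A", "Club 1"): 1,
    ("Country A", "Club 2"): 1,
    ("Country B", "Club 1"): 1,
    ("Country B", "Club 2"): 1,
    ("Country B", "Club 3"): 1,
    ("Country C", "Club 3"): 1,
}


def eigenvector_centrality(weighted_links, iterations=200, tol=1e-9):
    """Power iteration on (A + I), where A is the weighted adjacency matrix.

    Adding the identity leaves the eigenvectors unchanged but lets the
    iteration settle on a bipartite graph. Assumes a connected graph.
    """
    adjacency = {}
    for (a, b), weight in weighted_links.items():
        adjacency.setdefault(a, {})[b] = weight
        adjacency.setdefault(b, {})[a] = weight

    scores = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        # Each node keeps its own score and adds its neighbors' weighted scores.
        updated = {
            node: scores[node] + sum(w * scores[nbr] for nbr, w in nbrs.items())
            for node, nbrs in adjacency.items()
        }
        norm = math.sqrt(sum(v * v for v in updated.values()))
        updated = {node: v / norm for node, v in updated.items()}
        change = sum(abs(updated[n] - scores[n]) for n in adjacency)
        scores = updated
        if change < tol:
            break
    return scores


scores = eigenvector_centrality(SAMPLE_LINKS)
countries = sorted({country for country, _club in SAMPLE_LINKS})
ranking = sorted(countries, key=scores.get, reverse=True)
print(ranking)  # ['Country B', 'Country A', 'Country C']
```

In this toy graph the well-connected Country B comes out on top and the peripheral Country C comes last, which mirrors the center-versus-edge pattern in the charts above.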

In 2018 we used this eigencentrality score to make a prediction about the top five teams to watch. Not only did three of those five teams make it to the semi-finals, but our top-scoring team, France, also went on to win the trophy.

And our prediction is…

Well, if you’re French, the good news is that your team is still top of the ranking in 2022!

Here are our top five predictions for this competition and the previous one. In the 2018 column, teams marked with an asterisk went on to reach the semi-finals.

Eigencentrality rank   2022 predictions   2018 predictions
1                      France             France*
2                      Brazil             Belgium*
3                      Croatia            Germany
4                      Argentina          Argentina
5                      Denmark            Croatia*

So there you have it. Graph theory tips France to win a second successive World Cup trophy this year. And Brazil and Denmark could be sides to watch – they’ve shot up the rankings since last time.

Try graph theory techniques on your data

Although predicting sporting tournaments isn’t something we recommend you take too seriously, this kind of analysis is crucial in a huge variety of applications, from security and intelligence to fraud investigations.

If you’d like to build similar applications, contact us for a free trial of our toolkits.

Registered in England and Wales with Company Number 07625370 | VAT Number 113 1740 61
6-8 Hills Road, Cambridge, CB2 1JP. All material © Cambridge Intelligence 2022.