Visualizing the Tour de France in the year I tackle the route

On Saturday 24th June 2023, I set off from Bilbao, Spain, to ride the Tour de France route with Le Loop. In the lead up to this, my day job as a software developer gave me a break from hard training rides, but my love of cycling sparked a mini side project: building web apps with the data visualization tools I help to develop, and using them to analyze and visualize Tour de France data.

A photo of amateur cyclist Julia with her bike on day 1 of Le Loop with the northern Spanish coast in the background
Day 1 of Le Loop: a hilly 182km starting and finishing at Bilbao, Spain

Through custom filtering, styling and grouping techniques, I’ll use different Tour de France datasets to explore the history, legends, nationalities, locations and exciting 2022 finale in a visual way, finding insights in hidden connections.

Visualizing Tour de France race history

Keen to see what the entire history of Tour de France teams and riders looks like, I visualized thomascamminady’s LeTourDataset.

Graph visualization of Tour de France riders and teams between 1903-2022
Tour de France riders and teams between 1903-2022

Green bike nodes represent teams, and they’re sized by the number of times riders listed that team as the one they rode the tour for. The time bar across the bottom shows how race participation increased steadily, with two gaps showing when the race wasn’t held due to the First and Second World Wars.
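
As a rough illustration of the sizing rule above, here is a minimal Python sketch that counts how often each team is named and turns that count into a node size. The sample rows, the function names and the square-root scaling are all assumptions for illustration. They are not taken from the dataset or from the visualization toolkit.

```python
from collections import Counter
from math import sqrt

# Hypothetical sample rows (year, rider, team); not taken from the real dataset.
entries = [
    (1914, "Rider A", "Team X"),
    (1914, "Rider B", "Team X"),
    (1920, "Rider A", "Team X"),
    (1920, "Rider C", "Team Y"),
]

def team_entry_counts(rows):
    """Count how many rider entries name each team."""
    return Counter(team for _year, _rider, team in rows)

def node_size(count, base=1.0):
    """Square-root scaling (an assumption) so large teams do not swamp the chart."""
    return base * sqrt(count)

counts = team_entry_counts(entries)
sizes = {team: node_size(n) for team, n in counts.items()}
print(counts["Team X"], counts["Team Y"])  # 3 1
```

A rider who lists the same team in two different years counts twice here, which matches the "number of times riders listed that team" wording.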

It’s a busy chart, but I’m drawn to the cluster of larger team nodes in the top left.

Visualizing the Tour de France: a zoomed in look at the earliest teams and riders
Visualizing the Tour de France: the early years

Hmmmm. Those “TDF 190# ***” nodes don’t look right – they’re clearly not teams – but I know what’s happened. Team membership wasn’t compulsory until 1930, so before then any amateur rider could compete. In the original data, these individuals have the year they took part listed as their team name. (It’s a reminder that every source dataset can benefit from a review and clean-up exercise before visualizing it!)
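
One way to handle that clean-up is to flag placeholder team names before building the chart. The sketch below assumes the placeholders look like "TDF 1909 ***" (the letters TDF followed by a four-digit year). The exact strings in the source data may differ, and the field names are hypothetical.

```python
import re

# Assumed shape of the placeholder names, e.g. "TDF 1909 ***".
# The real strings in the source data may differ.
PLACEHOLDER = re.compile(r"^TDF \d{4}\b")

def split_entries(entries):
    """Separate real team entries from riders whose 'team' is a year placeholder."""
    teams, individuals = [], []
    for entry in entries:
        if PLACEHOLDER.match(entry["team"]):
            individuals.append(entry)
        else:
            teams.append(entry)
    return teams, individuals

# Hypothetical sample rows for illustration only.
sample = [
    {"rider": "Rider A", "year": 1909, "team": "TDF 1909 ***"},
    {"rider": "Philippe Thys", "year": 1914, "team": "Peugeot-Wolber"},
]
teams, individuals = split_entries(sample)
print(len(teams), len(individuals))  # 1 1
```

Keeping the individuals in their own list, rather than dropping them, means they can still appear as riders without being drawn as fake team nodes.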

If we look at the teams that did exist in the early years, we can see how the team strategy paid off. A closer look at the Peugeot-Wolber team’s performance in 1914 shows that, of the 11 riders in the team, eight finished in the top ten of the general classification (GC), including the top three places.

A visualization of the Peugeot-Wolber team of 1914
Glyphs on the links show where each rider finished in the GC race

The winner was Belgian rider Philippe Thys, who claimed victory despite incurring a 30-minute penalty for buying a new wheel to fix a mechanical problem on stage 14.

A bike representing the 1914 team linked to a rider representing Tour de France winner Thys
Links between riders and teams show the year they competed and their final GC ranking. Thys’s gold medal marks his 1st place.

Cyclists in the chart are sized according to a ‘success’ score based on the number of tours completed and their rankings. When I select Thys, details on the right-hand side show he competed six times and won three.

A graph visualization with an information panel on the right revealing details of Belgian rider, Philippe Thys
The information panel reveals which years Thys competed and where he finished in each Tour de France
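
The article doesn’t spell out the ‘success’ score formula, so the following Python sketch is purely hypothetical: it awards one point per completed tour plus a bonus of 1/rank, so a win counts for more than a lower finish. The function name and the weighting are assumptions, not the method used in the chart.

```python
def success_score(rankings):
    """Hypothetical score: 1 point per completed tour plus 1/rank as a bonus.

    `rankings` holds final GC positions; None marks a tour the rider did not finish.
    """
    return sum(1 + 1 / rank for rank in rankings if rank is not None)

# A made-up rider who won once, finished 4th once and abandoned once.
print(success_score([1, 4, None]))  # 3.25
```

Any formula that rewards both finishing and ranking well would give a similar visual result; the point is that each rider ends up with a single number that drives node size.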

Let’s look at what’s happened with more recent tours and teams. I use the time bar sliders to filter the data and show results from the last 20 years.

120 years of Tour de France race data filtered down to the most recent 20
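
Behind a time bar filter like this sits a simple range check on the year. Here is a minimal sketch in Python, with hypothetical field names and sample rows, that keeps only the most recent 20 years of entries:

```python
def last_n_years(entries, n):
    """Keep entries from the n most recent calendar years in the data (inclusive)."""
    latest = max(e["year"] for e in entries)
    start = latest - n + 1
    return [e for e in entries if start <= e["year"] <= latest]

# Hypothetical sample rows for illustration only.
sample = [
    {"rider": "Rider A", "year": 1903},
    {"rider": "Rider B", "year": 2002},
    {"rider": "Rider C", "year": 2003},
    {"rider": "Rider D", "year": 2022},
]
recent = last_n_years(sample, 20)
print([e["year"] for e in recent])  # [2003, 2022]
```

With data ending in 2022, a 20-year window runs from 2003 to 2022 inclusive, so the 2002 entry is filtered out.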

One larger rider node in the bottom right corner stands out. Closer investigation reveals it’s British four-time Tour de France winner, Chris Froome.

Selecting Chris Froome focuses on his connections in the chart and reveals details about his race history

Most cycling fans know that his victories came as part of Sky-sponsored teams, but he’s also linked to a third, much smaller team in this chart. I select it and see it’s Barloworld, a South African team that received wild card entries for the tour in 2007 and 2008.

The largest of the other nodes linked to this team is Froome’s future super domestique turned 2018 winner (and 2019 runner-up), Geraint Thomas.

Exploring connections: Froome’s links lead me first to Barloworld and then Geraint Thomas

Tour de France legends

This visualization focuses on riders and stage wins. Each rider is linked to the type of stage won, which reveals a lot about how race specialists have evolved over the years.

A graph visualization of Tour de France stage winners connected to the types of stage they won
Tour de France riders linked to the stage categories they won

As you’d expect, British sprint specialist Mark Cavendish – the largest node in the cluster on the right of the chart – has only ever won flat stages (a massive 34 in total):

A graph visualization showing British sprinter Mark Cavendish linked to the flat stage category
Links are sized according to how many Tour de France stages each rider won
A customized node showing Mark Cavendish's 34 stage victories
Mark Cavendish’s current stage win record. He’ll compete in his last ever tour in 2023 in an attempt to take the all-time record for the most Tour de France stage wins.

The other large node in the chart shows that Belgian legend of the sixties and seventies – the great Eddy Merckx – has won the same number of stages. But he was as strong in the mountains as he was on the flat, so he’s linked to multiple stage types.

Eddy Merckx linked to the mountain, individual time trial and flat stage categories he won
Merckx’s all-round skills meant he won the Tour de France five times between 1969 and 1974

Stage winners by nationality

I simplified the data further by grouping stage winners by their countries of origin. French nationals have won the most stages (over 700), although a Frenchman hasn’t won the GC since Bernard Hinault in 1985.

A graph visualization with nodes representing the number of stage winners from different countries
Nodes are sized by how many Tour de France stage winners there’ve been from each country
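
Grouping like this amounts to nesting riders under their country and summing their wins. Here is a minimal Python sketch, assuming one record per stage win with hypothetical `rider` and `country` fields. The sample is tiny and illustrative, not real totals.

```python
from collections import defaultdict

# Tiny illustrative sample: one row per stage win. Real totals are far larger.
stage_wins = [
    {"rider": "Eddy Merckx", "country": "Belgium"},
    {"rider": "Eddy Merckx", "country": "Belgium"},
    {"rider": "Philippe Thys", "country": "Belgium"},
    {"rider": "Mark Cavendish", "country": "Great Britain"},
]

def group_by_country(wins):
    """Return {country: {rider: number_of_stage_wins}}."""
    groups = defaultdict(lambda: defaultdict(int))
    for win in wins:
        groups[win["country"]][win["rider"]] += 1
    return {country: dict(riders) for country, riders in groups.items()}

groups = group_by_country(stage_wins)
# The size of each country node is the sum of its riders' wins.
country_totals = {c: sum(r.values()) for c, r in groups.items()}
print(country_totals)  # {'Belgium': 3, 'Great Britain': 1}
```

The nested structure keeps the per-rider counts available, which is what makes it possible to open a country group and size the riders inside it.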

I can dig deeper inside these group nodes to see the riders from each country sized according to how many stages they’ve won. Here are Belgium’s winners, prominently featuring our friends Eddy Merckx and Philippe Thys:

A graph visualization with nodes representing the number of stage winners from Belgium
The more stages each Belgian rider won, the greater the node size

When I click on Eddy Merckx, I go deeper still and see details of each stage victory, including the year, stage number, and distance:

A graph visualization of Eddy Merckx's stage wins, color-coded by category
Eddy Merckx’s stage wins, color-coded by category

Location, location, location

Obviously, the Tour de France is no longer just a sporting event: it’s a huge commercial enterprise. European neighbors bid millions of euros to host the grand départ (opening race stages), while French towns and cities pay handsomely for the privilege of hosting a stage.

I’m interested to see what the hundreds of different locations that’ve been used as the start or finish of a stage look like. I’ve sized them according to how many times they’ve hosted.

Unsurprisingly, Paris – venue of the race finish every year since 1975 – is the largest node, followed by Bordeaux in the southwest and Pau on the edge of the Pyrenees.

A graph visualization showing Tour de France start and finish locations
Tour de France start and finish locations

The links show the direction between locations and I’ve sized them by how many times that stage route has featured. They’re also colored by stage type: green for flat, blue for time trials, red for hilly, orange for mountains.
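
The link rules described above can be sketched as a small aggregation: count how often each start-to-finish pair occurs for a given stage type, then map the type to a color. The color mapping comes from the description above; the field names, output shape and "Town A"/"Town B" rows are assumptions for illustration.

```python
from collections import Counter

# Color scheme as described in the text.
STAGE_COLORS = {
    "flat": "green",
    "time trial": "blue",
    "hilly": "red",
    "mountain": "orange",
}

def build_links(stages):
    """Aggregate stages into directed links sized by how often the route featured."""
    counts = Counter((s["start"], s["finish"], s["type"]) for s in stages)
    return [
        {"source": start, "target": finish, "width": n, "color": STAGE_COLORS[kind]}
        for (start, finish, kind), n in counts.items()
    ]

# Illustrative rows; "Town A" and "Town B" are placeholders.
stages = [
    {"start": "London", "finish": "London", "type": "time trial"},
    {"start": "London", "finish": "Canterbury", "type": "flat"},
    {"start": "Town A", "finish": "Town B", "type": "mountain"},
    {"start": "Town A", "finish": "Town B", "type": "mountain"},
]
links = build_links(stages)
print(links[2]["width"], links[2]["color"])  # 2 orange
```

A stage that starts and finishes in the same place produces a link whose source and target match, which a graph layout would draw as a loop.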

Zooming in shows how many iconic mountain finishes there’ve been on Alpe d’Huez – a grueling 13.8km climb with an average 8.1% gradient and 21 hairpin turns.

A graph visualization showing the number of stages that have finished on Alpe d'Huez

The crescent of nodes on the right-hand side shows outliers in our data that don’t link directly to the other, more popular stages. These are where grand départ locations outside France feature. We can zoom in to spot London, which hosted the start and finish of the 7.9km 2007 prologue time trial plus the start of a flat stage to Canterbury. In 2014, London also hosted the finish of a stage that started in my hometown, Cambridge.

The blue loop represents the 2007 race prologue – won by Swiss rider Fabian Cancellara – which started and finished in London

How the 2022 Tour de France was won

22 teams of eight riders set off from Copenhagen on July 1st 2022. I’ve arranged them here in club colors and grid formation to give an idea of the peloton’s size.

22 teams with 8 riders in each displayed in a grid formation
The 176 riders who started the 2022 race. Data from procyclingstats.com and letour.fr
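
With 22 teams of eight riders, a grid needs 22 × 8 = 176 positions. As a minimal sketch, assuming one row per team and one column per rider (the actual arrangement and spacing in the chart may differ), the coordinates could be computed like this in Python:

```python
TEAMS = 22
RIDERS_PER_TEAM = 8

def grid_positions(teams=TEAMS, riders=RIDERS_PER_TEAM, spacing=50):
    """Map (team index, rider index) to (x, y): one row per team, one column per rider."""
    return {
        (team, rider): (rider * spacing, team * spacing)
        for team in range(teams)
        for rider in range(riders)
    }

positions = grid_positions()
print(len(positions))  # 176
```

Fixed positions like these keep each team together, so it is easy to follow a team as riders drop out over the race.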

As in 2021, the favorites for GC were from UAE Team Emirates and Team Jumbo-Visma. It was a fantastic battle, with Jumbo-Visma’s Jonas Vingegaard from Denmark beating Slovenian UAE rider and 2020 and 2021 champion, Tadej Pogačar.

Watch how the race between the two teams unfolded. You can see that Vingegaard’s Jumbo-Visma teammate Wout van Aert took the yellow jersey early on before Pogačar (top left for UAE) won it from him. Vingegaard took it over from stage 11 and never relinquished it.

Progress bars below each rider show how they ranked for the points (green) and king of the mountains (red) races. The yellow background shows who’s the current GC leader.

You can see how UAE Team Emirates were decimated by COVID-19 (it also ended my 2022 Le Loop attempt on stage 8). Jumbo-Visma were also down to five riders by the end, but claimed the GC, points and king of the mountains victories.

Chapeau to every rider

Building these web apps and exploring the Tour de France data in a visual way was great fun. It also shows how easy it is to explore large and complex datasets with interactive graph visualization.

I could take this exploration further – plot stages on a real map to spot geographic patterns, combine nationalities with mountain stage winners to see if riders from high altitude countries have an advantage, or include financial data to reveal correlations between team success and investment. If you want to learn more about what our toolkits can do, get in touch.

Right now I’ll enjoy my own Tour de France adventure before watching the final week of the professional race from the comfort of my sofa.

A photo of Julia cycling stage 5 of Le Loop in the Pyrenees with cows and clouds in the background
A selfie from Le Loop’s stage 5 – a 162.7km mountain stage from Pau to Laruns

Registered in England and Wales with Company Number 07625370 | VAT Number 113 1740 61
6-8 Hills Road, Cambridge, CB2 1JP. All material © Cambridge Intelligence 2024.