KronoGraph lets you add interactive timeline analytics to your applications so you can reveal hidden patterns in large datasets. In this blog post, we combine this technique with location mapping to visualize GPS data, using the fascinating Geolife GPS trajectory dataset from Microsoft Research Asia.
The dataset was collected by 178 volunteers in China over a period of 4 years. Using GPS loggers and cell phones, the volunteers had their GPS locations captured in a time-stamped dataset of over 17,000 trajectories.
This is a perfect dataset to put our ‘pattern of life’ analytics to the test. What does typical behavior look like? What do a person’s movements reveal about them? How do they get around? Who do they meet, and where?
Processing the data
A trajectory is a sequence of GPS points that maps the movement of each object. Most of the trajectories in the dataset have an extremely dense representation – a sample point every 1-5 seconds.
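To make that concrete, here is a minimal Python sketch of reading one trajectory. It assumes the `.plt` layout as I understand it from the dataset's user guide: six header lines, then comma-separated rows where fields 1 and 2 are latitude and longitude and fields 6 and 7 are the date and time as strings, in GMT. The `GpsPoint` type and the function names are my own, purely for illustration.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class GpsPoint(NamedTuple):
    lat: float       # latitude in decimal degrees
    lng: float       # longitude in decimal degrees
    time: datetime   # timestamp, assumed to be GMT/UTC


def parse_plt_line(line: str) -> GpsPoint:
    """Parse one data row of an (assumed) Geolife .plt file."""
    fields = line.strip().split(",")
    lat, lng = float(fields[0]), float(fields[1])
    # Assumption: fields 6 and 7 hold the date and time as strings.
    stamp = datetime.strptime(f"{fields[5]} {fields[6]}", "%Y-%m-%d %H:%M:%S")
    return GpsPoint(lat, lng, stamp.replace(tzinfo=timezone.utc))


def parse_plt(text: str, header_lines: int = 6) -> list[GpsPoint]:
    """Parse a whole trajectory, skipping the assumed six-line header."""
    rows = text.splitlines()[header_lines:]
    return [parse_plt_line(row) for row in rows if row.strip()]
```

Each file then becomes a plain list of points in time order, which is all the later steps need.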
We’ve blogged before about working with big graph data visualizations, and how important it is to remove as much noise as possible, as early as possible.
Since we’re only interested in approximate behaviors, my first step was to downsample the data by a factor of 100. This captures each user’s position every few minutes instead of every few seconds. It reduces the size of the data considerably, but still gives us plenty of behavioral patterns to investigate. To reduce the size further, I picked a subset of 35 volunteers.
The result is a dataset of around 76,000 events – more than enough to spot interesting patterns.
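Here is a rough Python sketch of the downsampling step. It assumes the simplest possible approach, keeping every 100th point of each trajectory, which may not be exactly how the original processing was done. With a sample every 1-5 seconds, a factor of 100 leaves one point every 100-500 seconds, or roughly every 2-8 minutes.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def downsample(points: Sequence[T], factor: int = 100) -> list[T]:
    """Keep every `factor`-th point of a trajectory, starting with the first."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return list(points[::factor])


def downsample_all(trajectories: dict[str, Sequence[T]],
                   factor: int = 100) -> dict[str, list[T]]:
    """Downsample each trajectory separately so trips never get mixed."""
    return {name: downsample(points, factor)
            for name, points in trajectories.items()}
```

Working per trajectory rather than on one long list means the first point of every trip survives, so short trips do not vanish entirely.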
Loading GPS data using Leaflet
Before trying to build a timeline, I need to load these data points onto a map.
I use the simple Leaflet.heat library, one of many great Leaflet plugins available. Loading the 76,000 points onto my Leaflet map created a satisfying snapshot of each volunteer’s movements around Beijing and the surrounding provinces of China.
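Leaflet.heat's `L.heatLayer` takes an array of points, each of which can be a `[lat, lng, intensity]` triple. A small Python sketch of preparing that payload on the data-processing side might look like this. The function names, the fixed intensity and the rounding are my own choices for illustration.

```python
import json


def to_heat_points(points, intensity=1.0, decimals=5):
    """Build [lat, lng, intensity] triples from (lat, lng) pairs.

    Rounding to 5 decimal places (about 1 m) keeps the JSON payload small.
    """
    return [[round(lat, decimals), round(lng, decimals), intensity]
            for lat, lng in points]


def to_heat_json(points, intensity=1.0):
    """Serialize the triples so the browser can fetch them as JSON."""
    return json.dumps(to_heat_points(points, intensity))
```

The resulting JSON array can be fetched by the page and handed to the heat layer as-is.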
Building a timeline in KronoGraph
The next step is easy – I need to model the data for my timeline visualization.
Each of the 35 volunteers becomes an entity in KronoGraph terms, and each of the sampled geolocations becomes an event. I load them into my timeline and here’s my first view of the temporal data:
KronoGraph automatically presents a heatmap view because it’s the most sensible way to present all of the data at once. I can see that the bulk of activity took place between October 2008 and July 2009, but a few people signed up to the program at other times.
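For reference, the entity and event modelling could be sketched like this in Python, ready to be dumped to JSON for the browser. The dictionary layout (`entities`, `events`, `entityIds`, `time`, plus the `label` and `data` fields) is my assumption of a reasonable shape rather than a confirmed KronoGraph schema, so check it against the product documentation.

```python
from datetime import datetime, timezone


def build_timeline_data(samples):
    """Turn (volunteer_id, time, lat, lng) samples into entities and events.

    The output layout is an assumed shape, not a confirmed KronoGraph schema.
    """
    entities = {}
    events = {}
    for index, (volunteer_id, time, lat, lng) in enumerate(samples):
        entity_id = f"Person {volunteer_id:03d}"
        # One entity per volunteer, created the first time we see them.
        entities.setdefault(entity_id, {"label": entity_id})
        events[f"event-{index}"] = {
            "entityIds": [entity_id],
            # Milliseconds since the Unix epoch, convenient for JavaScript.
            "time": int(time.timestamp() * 1000),
            # Keep the coordinates so the map and timeline can stay in sync.
            "data": {"lat": lat, "lng": lng},
        }
    return {"entities": entities, "events": events}
```

Keeping the coordinates on each event is a design choice: it lets a single list of events drive both the heatmap and the timeline.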
I can now switch on KronoGraph’s scale wrapping feature. This wraps a particular duration in the selected timeline scale, making clear how activities are distributed and revealing patterns at a glance.
Here’s what I see when I choose a daily view:
This view immediately reveals two kinds of volunteer. Those with black (empty) bands during the late and early hours, such as ‘Person 027’, clearly opted to switch off their tracking device overnight. It gives them more privacy, but it means we’re not going to learn as much from their datasets as we would from those with their trackers on full time, such as volunteer #4.
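The idea behind a daily wrap is easy to sketch outside the timeline too: fold every timestamp onto its hour of the day and count. The Python below is an illustration of that idea, not of how KronoGraph implements it. It assumes timestamps are stored in UTC and that local time is UTC+8, and the midnight-to-6am "night" window is an arbitrary choice of mine.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta, timezone

# Assumption: timestamps are stored in UTC and Beijing local time is UTC+8.
BEIJING = timezone(timedelta(hours=8))


def hourly_profile(events):
    """Count events per local hour of day for each volunteer.

    `events` is an iterable of (volunteer_id, aware_datetime) pairs.
    """
    profile = defaultdict(Counter)
    for volunteer_id, time in events:
        profile[volunteer_id][time.astimezone(BEIJING).hour] += 1
    return profile


def switches_off_overnight(counts, night_hours=range(0, 6)):
    """True if a volunteer has no events at all in the given night hours."""
    return all(counts[hour] == 0 for hour in night_hours)
```

Volunteers for whom `switches_off_overnight` returns `True` correspond to the empty overnight bands in the wrapped view.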
It’s possible that the difference comes down to hardware: according to the dataset’s publication website, some volunteers used custom GPS loggers while others used cell phones. Perhaps one device was easier to disable than the other?
Building an integrated, interactive KronoGraph app
To explore my data fully, I need to build a simple app that integrates Leaflet with KronoGraph. I want to be able to:
- draw a box around areas on the map and see a timeline of the activity that took place in that area, and
- draw a time range around part of my timeline, and filter the map view to show just the activity from that time period
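The filtering behind those two interactions can be sketched without any UI code. The Python below is a minimal illustration, assuming each event carries a volunteer, a timestamp and a position. The `Event` and `Bounds` types are hypothetical, and the box test assumes the selection does not cross the antimeridian.

```python
from datetime import datetime
from typing import NamedTuple


class Event(NamedTuple):
    volunteer: str
    time: datetime
    lat: float
    lng: float


class Bounds(NamedTuple):
    south: float
    west: float
    north: float
    east: float


def events_in_bounds(events, bounds):
    """Map to timeline: keep events inside the box drawn on the map."""
    return [
        e for e in events
        if bounds.south <= e.lat <= bounds.north
        and bounds.west <= e.lng <= bounds.east
    ]


def events_in_range(events, start, end):
    """Timeline to map: keep events inside the selected time range."""
    return [e for e in events if start <= e.time < end]
```

In the app, the first function would feed the timeline whenever the map selection changes, and the second would feed the heatmap whenever the timeline range changes.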
Here’s the result – a simple, interactive UI that will help us dig deeper into the dataset:
I made the KronoGraph window smaller so the map is the focal point of my app.
Notice how KronoGraph has grouped volunteers together in the smaller timeline. It means I can still see a useful heatmap summarizing the behavior of them all, even though there isn’t as much screen real estate available.
Now let’s test out the interactivity.
I’m interested in the dense cluster of data points around Tsinghua University in the northwest of the city, so I’ll take a closer look. I draw a marquee over the map (I used the handy Leaflet-Area-Select tool for this), and the KronoGraph timeline updates smoothly to show only the volunteer activity inside the selected area.
With my scale wrapping set to daily, I immediately see from my timeline that volunteers #4 and #30 spend a lot of time in this area. It’s likely that #4 is in residence at the university because their activity is pretty uniform over a 24-hour period.
New Year travel – spotting unusual activity
So far we’ve used KronoGraph to understand typical behaviors in data. But we can also use it to spot unusual activity and understand it better.
Chinese New Year in 2009 took place on 26 January. I assume that volunteers might be visiting their family homes then, so I zoom in to this date on the timeline. Notice the bright spots on the map in several other cities around China that some volunteers traveled to:
It’s fun to visualize GPS data for one volunteer (I’ll pick our friend Volunteer #4) as they make the train journey across the country. I use a clever KronoGraph feature to focus on an individual timeline, then with the scale set to hourly, I slide the KronoGraph window forwards and backwards in time:
Geofencing for in-depth analysis
We’ve learned a lot about Volunteer 4 by visualizing their GPS data and exploring their timeline. We believe they live at Tsinghua University during term time, we’ve followed them to Dalian for New Year, and we know that they tend to leave their tracking device on all night.
Let’s dig deeper and try to find out where they spent their New Year break. First I geofence the entire city of Dalian and switch my scale back to daily:
Notice that volunteer #4 isn’t the only person to visit Dalian (more on tracking multiple people next). But as I hoped, #4 left their tracking device switched on overnight during their vacation, so let’s zoom in to see where they were at 2am:
Sure enough, the majority of volunteer #4’s data points in the early hours of the morning in this region are centered on a residential building in the northwest of the city. It’s a good assumption that this is where they stayed on their New Year break – perhaps a family home?
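The same question can be asked of the data directly. Here is a hedged Python sketch that takes one volunteer's points inside the geofence, keeps the ones recorded in the small hours, and returns the busiest grid cell. The 1am-5am window, the 0.001 degree cell size (roughly 100 m north to south) and the UTC+8 offset are all assumptions of mine.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

BEIJING = timezone(timedelta(hours=8))  # assumption: local time is UTC+8


def likely_overnight_spot(points, night_hours=range(1, 5), cell_deg=0.001):
    """Return the centre of the grid cell holding the most night-time points.

    `points` is an iterable of (aware_datetime, lat, lng) for one volunteer,
    already restricted to the geofenced area. Returns None if the volunteer
    has no night-time points there.
    """
    cells = Counter()
    for time, lat, lng in points:
        if time.astimezone(BEIJING).hour in night_hours:
            # Snap each point to a grid cell so nearby fixes are counted together.
            cells[(round(lat / cell_deg), round(lng / cell_deg))] += 1
    if not cells:
        return None
    (row, col), _count = cells.most_common(1)[0]
    return (row * cell_deg, col * cell_deg)
```

A grid is a crude stand-in for proper clustering, but it is usually enough to point at a single building when the device stays in one place all night.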
Co-location and convoy analysis
A common use case in law enforcement and intelligence is to apply this kind of activity-based intelligence methodology to problems of co-location. In other words, can we identify when two or more individuals were in the same place at the same time? Or were two or more individuals moving together over a period of time? The latter is known as convoy analysis.
Since we have all the GPS coordinates and timestamps, spotting co-locations is straightforward. And using KronoGraph’s support for long-duration events which connect multiple entities together, it’s easy to highlight these situations both on the map and on the timeline.
Here, I can see that volunteers #82 and #84 were in the same area of Beijing at the same time on April 4th. At this point I might choose to load in the higher resolution raw dataset for this date, so I can get 1-second samples and confirm whether these individuals were close enough for long enough for this to be considered a significant meeting.
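As a rough illustration of how co-locations could be found from coordinates and timestamps alone, here is a small Python sketch. It flags pairs of samples from different people that fall within a distance and time threshold of each other. The 200 m and 5 minute thresholds are arbitrary assumptions, and this is not necessarily the approach used for the view above.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two points."""
    earth_radius_m = 6_371_000
    dlat, dlng = radians(lat2 - lat1), radians(lng2 - lng1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlng / 2) ** 2)
    return 2 * earth_radius_m * asin(sqrt(a))


def co_locations(events, max_metres=200, max_gap=timedelta(minutes=5)):
    """Find pairs of samples from different people close in space and time.

    `events` is an iterable of (person, time, lat, lng). Returns a list of
    (person_a, person_b, time_a, time_b) tuples.
    """
    ordered = sorted(events, key=lambda e: e[1])
    hits = []
    for i, (person_a, time_a, lat_a, lng_a) in enumerate(ordered):
        for j in range(i + 1, len(ordered)):
            person_b, time_b, lat_b, lng_b = ordered[j]
            if time_b - time_a > max_gap:
                break  # every later sample is even further away in time
            if (person_a != person_b
                    and haversine_m(lat_a, lng_a, lat_b, lng_b) <= max_metres):
                hits.append((person_a, person_b, time_a, time_b))
    return hits
```

Sorting by time first means each sample is only compared with the few that follow it closely, which keeps the search manageable on downsampled data. Consecutive hits for the same pair could then be merged into a single long-duration event linking both people.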
Visualize GPS data with KronoGraph timeline analysis
The Geolife dataset is perfect for illustrating the power of applying visual timeline analytics to geospatial datasets. With just a few quick explorations, we could visualize GPS data, identify where an individual lived and worked, where they stayed on vacation, which train route they took across the country, and when and where two people met up.
Could KronoGraph timelines help your behavioral and activity-based intelligence applications? Sign up for a free trial.