Link Analysis for Fraud Detection

19th July, 2016

Insurance fraud detection is a matter of understanding connections.

To uncover scams, investigators look for anomalous links between people, events, locations and times. They scour huge, noisy, complex and often incomplete datasets to understand which connections are genuine and which could indicate fraud.

Network visualization (or ‘link analysis’ as it is more commonly called by fraud teams) has long been a vital part of the fraud investigator’s arsenal. Powerful and well-designed network visualization is the ideal tool to overcome data challenges and investigate fraud in an interactive and intuitive way.

Let’s see an example!

Below is an illustration of how the KeyLines network visualization toolkit has been used to investigate insurance claims.

Reviewing insurance claims with KeyLines

Most insurance fraud detection systems work in a similar way. Data is collated on a huge scale, scored against a set of rules, and typically sorted into three categories: fraud, not fraud and unsure.
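As a rough illustration of that rule-scoring step, here is a minimal Python sketch. The rules, point values and thresholds are all hypothetical and chosen only to show the three-way split; a real system would use its own rules and tuning.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    points: int
    applies: Callable[[dict], bool]


# Hypothetical rules and weights, for illustration only.
RULES = [
    Rule("claim soon after policy start", 40,
         lambda c: c["days_since_policy_start"] < 30),
    Rule("address seen on an earlier claim", 30,
         lambda c: c["address_seen_before"]),
    Rule("vehicle seen on an earlier claim", 30,
         lambda c: c["vehicle_seen_before"]),
]


def score(claim: dict) -> int:
    """Add up the points of every rule that applies to the claim."""
    return sum(rule.points for rule in RULES if rule.applies(claim))


def triage(claim: dict, fraud_at: int = 70, clear_below: int = 30) -> str:
    """Sort a claim into 'fraud', 'not fraud' or 'unsure' (assumed thresholds)."""
    total = score(claim)
    if total >= fraud_at:
        return "fraud"
    if total < clear_below:
        return "not fraud"
    return "unsure"


clean = {"days_since_policy_start": 400,
         "address_seen_before": False, "vehicle_seen_before": False}
borderline = {"days_since_policy_start": 400,
              "address_seen_before": True, "vehicle_seen_before": False}
risky = {"days_since_policy_start": 10,
         "address_seen_before": True, "vehicle_seen_before": True}

for claim in (clean, borderline, risky):
    print(score(claim), triage(claim))
```

Only the claims that land in the middle band would be passed to analysts for manual review.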

A team of analysts then manually reviews the ‘unsures’ – a careful balancing act between keeping genuine customers happy with fast, accurate decisions and preventing real frauds from getting through.

KeyLines is a great way of visually presenting these complex scenarios in a simple format, directly embedded in the investigation workflow.

The fraud detection visual data model

In a real-world deployment of a fraud visualization system, an investigator would be interested in a large number of data points.

Their data model will depend on their business processes, structures and operations. Different approaches will require different data points, but some typical elements will include:

One possible visual model for a fraud detection visualization

This model places claims and policies at the top of a hierarchy, with third party and policy holder information on the next level.

For the purposes of this example, however, we have greatly simplified the visual data model:

A simpler visual model
  • Claim – being investigated
  • Vehicle – involved in the claim
  • Claimant – associated with the vehicle (and claim)
  • Address – at which the claimant lives
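One way to picture this simplified model in code is as typed nodes joined by links. The Python sketch below is an assumption about how such data might be held in memory, not the format KeyLines itself expects; the node ids, the labels other than Stephen Porter, and the set of allowed link types are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str
    kind: str   # "claim", "vehicle", "claimant" or "address"
    label: str


@dataclass(frozen=True)
class Link:
    source: str
    target: str


# Link types implied by the simplified model (an assumption of this sketch).
ALLOWED = {
    ("claim", "vehicle"),
    ("claim", "claimant"),
    ("vehicle", "claimant"),
    ("claimant", "address"),
}


def invalid_links(nodes, links):
    """Return the links whose endpoint kinds do not fit the model."""
    kinds = {node.id: node.kind for node in nodes}
    return [link for link in links
            if (kinds[link.source], kinds[link.target]) not in ALLOWED]


nodes = [
    Node("c1", "claim", "Claim under review"),
    Node("v1", "vehicle", "Vehicle A"),        # placeholder label
    Node("p1", "claimant", "Stephen Porter"),
    Node("a1", "address", "Address 1"),        # placeholder label
]
links = [Link("c1", "v1"), Link("v1", "p1"), Link("p1", "a1")]

print(invalid_links(nodes, links))  # no invalid links in this example
```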

Step 1: Load a claim

This claim folder involves two vehicles and three claimants, associated with three separate addresses.

Our first step is to load our disputed claim, using the hierarchy layout to simplify the view.

In this example, we have two people (Stephen Porter and Julia Rodriguez) claiming for damage to their vehicles. An additional person, Everett Page, is included in the claim as a witness.
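To give a feel for what a hierarchy view does with this claim, the Python sketch below assigns each item a level by its distance from the claim. It is not KeyLines’ layout algorithm, only a breadth-first approximation of the idea. Which person is linked to which vehicle, and the vehicle and address labels, are assumptions made for the example.

```python
from collections import deque


def hierarchy_levels(links, top):
    """Breadth-first search from the top node; level = distance from it."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    levels = {top: 0}
    queue = deque([top])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in levels:
                levels[nxt] = levels[node] + 1
                queue.append(nxt)
    return levels


# Assumed links: the source does not say who owns which vehicle.
links = [
    ("claim", "vehicle 1"),
    ("claim", "vehicle 2"),
    ("claim", "Everett Page"),          # witness, linked to the claim directly
    ("vehicle 1", "Stephen Porter"),
    ("vehicle 2", "Julia Rodriguez"),
    ("Stephen Porter", "address 1"),
    ("Julia Rodriguez", "address 2"),
    ("Everett Page", "address 3"),
]

levels = hierarchy_levels(links, top="claim")
for name, level in sorted(levels.items(), key=lambda item: item[1]):
    print(level, name)
```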

Step 2: Find matches

After loading a claim, the user is offered the ability to ‘find matches’. This runs a query back to the database (in this example we used DataStax DSE Graph and the Gremlin query language under the hood) to find all other claims that share any attributes with the one under review:

Here we can see claims with shared attributes side-by-side

Doing so returns two other claims – 2015-06-07 and 2015-03-16 – which show matches on the vehicle and address of a claimant in the original case being investigated.
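The matching logic can be sketched without a database. The Python below is an in-memory stand-in for the graph query, not the Gremlin used in the example: each claim is reduced to a set of attributes, and any other claim sharing at least one attribute counts as a match. The registration values, the "current" claim id and the assignment of each match to a particular claim are assumptions.

```python
def find_matches(claim_id, claims):
    """Return {other_claim_id: shared attributes} for claims that overlap."""
    target = claims[claim_id]
    matches = {}
    for other_id, attributes in claims.items():
        if other_id == claim_id:
            continue
        shared = target & attributes
        if shared:
            matches[other_id] = shared
    return matches


# Each claim is a set of (kind, value) attributes. Values are illustrative.
claims = {
    "current": {("vehicle", "REG-A"), ("vehicle", "REG-B"),
                ("address", "Colnbrook Street")},
    "2015-06-07": {("vehicle", "REG-A")},
    "2015-03-16": {("address", "Colnbrook Street")},
    "unrelated": {("vehicle", "REG-Z")},
}

matches = find_matches("current", claims)
for other_id, shared in matches.items():
    print(other_id, sorted(shared))
```

Exact set intersection only finds identical values. Real data usually needs normalization first (for example, consistent address formatting) before matches like these will surface.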

Step 3: Combine matches

To emphasize unusual connections, we can use KeyLines’ combine feature to merge nodes that represent the same entity, such as a vehicle or address that appears in more than one claim.

This adjusts the layout of our network so we can more easily see unusual connections.


Here we can see our original claimant’s address in Colnbrook Street is associated with an earlier claim. Given they share a surname, however, it is not necessarily suspicious.
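The effect of merging identical nodes can be shown with a small Python sketch. This is not the KeyLines combine API, just an illustration of the underlying idea: nodes with the same kind and value collapse into one, and links are rewired to the surviving node. The node ids are made up, and claims are linked straight to addresses here to keep the example short.

```python
def combine_identical(nodes, links):
    """Merge nodes that share the same (kind, value) and rewire their links.

    nodes: {node_id: (kind, value)}
    links: list of (node_id, node_id) pairs
    """
    canonical = {}   # (kind, value) -> id of the first node seen with it
    remap = {}       # every node id -> id of the node it is merged into
    for node_id, key in nodes.items():
        remap[node_id] = canonical.setdefault(key, node_id)

    merged_nodes = {node_id: key for key, node_id in canonical.items()}
    # A set removes duplicate links created by the merge.
    merged_links = sorted({(remap[a], remap[b]) for a, b in links})
    return merged_nodes, merged_links


nodes = {
    "c1": ("claim", "current"),
    "c2": ("claim", "2015-03-16"),
    "a1": ("address", "Colnbrook Street"),
    "a2": ("address", "Colnbrook Street"),   # same address, loaded twice
}
links = [("c1", "a1"), ("c2", "a2")]

merged_nodes, merged_links = combine_identical(nodes, links)
print(merged_nodes)
print(merged_links)
```

After the merge, both claims point at a single address node, which is what makes the shared connection stand out in the chart.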

Step 4: Escalate or accept

This example is a more suspicious match.


Our witness, Everett Page, shares an address with a man named Walter Stewart, who has previously made a claim relating to one of the vehicles involved in this incident.
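The difference between the two matches can be expressed as a simple rule of thumb. In the Python sketch below, a shared address alone is marked for review, while a shared address with a different surname and a link to a vehicle in the current claim is marked for escalation. This rule is an assumption drawn from the two cases described here, not a KeyLines feature. The earlier claimant "J. Porter", the registration values, "Address 3" and the choice of Stephen Porter as the Colnbrook Street resident are hypothetical placeholders.

```python
def surname(name):
    return name.split()[-1]


def flag_shared_addresses(current_people, current_vehicles, earlier_claimants):
    """Compare people in the current claim with claimants on earlier claims."""
    flags = []
    for person in current_people:
        for earlier in earlier_claimants:
            if person["address"] != earlier["address"]:
                continue
            same_surname = surname(person["name"]) == surname(earlier["name"])
            same_vehicle = earlier["vehicle"] in current_vehicles
            # Assumed rule: unrelated names plus a shared vehicle is the red flag.
            level = "escalate" if same_vehicle and not same_surname else "review"
            flags.append((person["name"], earlier["name"], level))
    return flags


current_people = [
    {"name": "Stephen Porter", "address": "Colnbrook Street"},
    {"name": "Julia Rodriguez", "address": "Address 2"},
    {"name": "Everett Page", "address": "Address 3"},
]
current_vehicles = {"REG-A", "REG-B"}
earlier_claimants = [
    {"name": "J. Porter", "address": "Colnbrook Street", "vehicle": "REG-Q"},
    {"name": "Walter Stewart", "address": "Address 3", "vehicle": "REG-A"},
]

flags = flag_shared_addresses(current_people, current_vehicles, earlier_claimants)
for flag in flags:
    print(flag)
```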

At this point an investigator would probably want to escalate this case for special investigation. We can use a context menu to let the analyst either accept the claim or send it for investigation, moving it on through the workflow.

Representing data as a network offers an engaging way for analysts to rapidly understand events. By incorporating KeyLines into an existing claims management workflow, we have made the process simple and intuitive.

Find fraud in your own data

This is a simple example, using synthesized data, of how KeyLines can make a complex and high-risk exercise simpler and more intuitive.

With the help of a number of KeyLines features, including layouts and combos, we have brought simplified data from multiple policies into a single chart and identified a potential instance of fraud.

If you want to try KeyLines on your own data, just request a trial account.

Download the white paper

To learn more about KeyLines, download our updated white paper. It contains three more in-depth examples of using graph visualization for fraud detection.
