Link Analysis for Fraud Detection

19th July, 2016

Insurance fraud detection is a matter of understanding connections.

To uncover scams, investigators look for anomalous links between people, events, locations and times. They scour huge, noisy, complex and often incomplete datasets to understand which connections are genuine and which could indicate fraud.

Network visualization, or ‘link analysis’ as fraud teams more commonly call it, has long been a vital part of the fraud investigator’s arsenal. A powerful, well-designed link analysis tool is the ideal way to overcome data challenges and investigate fraud interactively and intuitively.

Looking for a way to visualize your fraud data?
Start a free trial of our link analysis toolkits

Below is an illustration of how our link analysis toolkits are used to investigate insurance claims.

Reviewing insurance claims with link analysis

Most insurance fraud detection systems work in a similar way. Data is collated on a huge scale, scored against a set of rules and typically sorted into three categories: fraud, not fraud and unsure.

A team of analysts then manually reviews the ‘unsures’ – a careful balancing act between keeping genuine customers happy with fast, accurate decisions and preventing real frauds from getting through.
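The triage step described above can be sketched in a few lines of Python. This is a minimal illustration rather than a description of any particular scoring engine: the 0-100 score range, the two thresholds and the names `ScoredClaim`, `triage` and `review_queue` are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical thresholds; a real system would tune these to its own rules.
FRAUD_THRESHOLD = 80
CLEAR_THRESHOLD = 20


@dataclass
class ScoredClaim:
    claim_id: str
    score: int  # combined rule score, assumed here to fall between 0 and 100


def triage(claim: ScoredClaim) -> str:
    """Sort a scored claim into 'fraud', 'not fraud' or 'unsure'."""
    if claim.score >= FRAUD_THRESHOLD:
        return "fraud"
    if claim.score <= CLEAR_THRESHOLD:
        return "not fraud"
    return "unsure"


def review_queue(claims):
    """Return only the claims that need a manual review by an analyst."""
    return [claim for claim in claims if triage(claim) == "unsure"]
```

Everything that lands in the review queue is a candidate for the kind of visual investigation shown in the rest of this post.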

Link analysis, powered by our toolkits, is a great way of visually presenting these complex scenarios in a simple format, directly embedded in the investigation workflow.

The fraud detection visual data model

In a real-world deployment of a fraud link analysis system, an investigator would be interested in a large number of data points.

Their data model will depend on their business processes, structures and operations. Different approaches will require different data points, but the model below shows some typical elements:

A possible visual model for a fraud detection link analysis

This model places claims and policies at the top of a hierarchy, with third party and policyholder information on the next level.

For the purposes of this example, however, we have greatly simplified the visual data model:

A simpler visual model
  • Claim – being investigated
  • Vehicle – involved in the claim
  • Claimant – associated with the vehicle (and claim)
  • Address – at which the claimant lives
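One way to picture the simplified model is as a small set of typed nodes and the links between them. The sketch below is an assumption-laden illustration, not the toolkit's own data format: the `Node` and `Link` classes, the shape of the claim record, the claim id, the vehicle plate and the address are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str
    kind: str   # "claim", "vehicle", "claimant" or "address"
    label: str


@dataclass(frozen=True)
class Link:
    source: str
    target: str


def build_claim_graph(claim: dict):
    """Turn one claim record into nodes and links: claim -> vehicle -> claimant -> address."""
    claim_id = f"claim:{claim['id']}"
    nodes = [Node(claim_id, "claim", claim["id"])]
    links = []
    for vehicle in claim["vehicles"]:
        vehicle_id = f"vehicle:{vehicle['plate']}"
        nodes.append(Node(vehicle_id, "vehicle", vehicle["plate"]))
        links.append(Link(claim_id, vehicle_id))
        for person in vehicle["claimants"]:
            person_id = f"claimant:{person['name']}"
            address_id = f"address:{person['address']}"
            nodes.append(Node(person_id, "claimant", person["name"]))
            nodes.append(Node(address_id, "address", person["address"]))
            links.append(Link(vehicle_id, person_id))
            links.append(Link(person_id, address_id))
    return nodes, links


# Hypothetical record: the id, plate and address are placeholders.
example_claim = {
    "id": "EXAMPLE-CLAIM",
    "vehicles": [
        {
            "plate": "VEHICLE-1",
            "claimants": [{"name": "Stephen Porter", "address": "1 Example Road"}],
        }
    ],
}

nodes, links = build_claim_graph(example_claim)
```

Keeping the node `kind` on every node makes it straightforward to style or arrange each type differently once the data reaches a chart.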

Step 1: Load a claim

This claim folder involves two vehicles and three claimants, associated with three separate addresses.

Our first step is to load our disputed claim, using the hierarchy layout to simplify the view.

In this example, we have two people (Stephen Porter and Julia Rodriguez) claiming for damage to their vehicles. An additional person, Everett Page, is included in the claim as a witness.

Step 2: Find matches

After loading a claim, the user can choose to ‘find matches’. This runs a query against the database (in this example we used DataStax DSE Graph and the Gremlin query language under the hood) to find all other claims that share any attributes with it:

Here we can see claims with shared attributes side-by-side

Doing so returns two other claims – 2015-06-07 and 2015-03-16 – which show matches on the vehicle and address of a claimant in the original case being investigated.
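The logic behind ‘find matches’ can be illustrated without a graph database. The sketch below is a plain-Python stand-in for the idea, not the Gremlin query used in our example: it assumes each claim has already been reduced to a set of vehicles and a set of addresses, and the claim ids and values are placeholders.

```python
def find_matches(claim_id, claims):
    """Return other claims that share a vehicle or an address with claim_id.

    claims maps a claim id to {"vehicles": set, "addresses": set}.
    The result maps each matching claim id to the attributes it shares.
    """
    target = claims[claim_id]
    matches = {}
    for other_id, other in claims.items():
        if other_id == claim_id:
            continue
        shared = {
            "vehicles": target["vehicles"] & other["vehicles"],
            "addresses": target["addresses"] & other["addresses"],
        }
        if shared["vehicles"] or shared["addresses"]:
            matches[other_id] = shared
    return matches


# Placeholder data, invented for illustration only.
sample_claims = {
    "claim-A": {"vehicles": {"VEHICLE-1", "VEHICLE-2"}, "addresses": {"ADDRESS-1", "ADDRESS-2"}},
    "claim-B": {"vehicles": {"VEHICLE-1"}, "addresses": {"ADDRESS-9"}},
    "claim-C": {"vehicles": {"VEHICLE-7"}, "addresses": {"ADDRESS-2"}},
    "claim-D": {"vehicles": {"VEHICLE-8"}, "addresses": {"ADDRESS-8"}},
}

matches = find_matches("claim-A", sample_claims)
```

With this sample data, `claim-B` matches on a vehicle and `claim-C` on an address, while `claim-D` shares nothing and is left out.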

Step 3: Combine matches

To emphasize unusual connections, we can use our toolkits’ combine feature to merge identical nodes:

Combining identical / duplicate nodes will remove chart clutter and make anomalies easier to detect

This adjusts the layout of our network so we can more easily see unusual connections.

The hierarchy layout reveals relationships between nodes, highlighting unusual connections

Here we can see that our original claimant’s address in Colnbrook Street is associated with an earlier claim. Because the two claimants share a surname, however, this is not necessarily suspicious.
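The idea of merging identical nodes can be sketched as follows. This is not the toolkit's combine feature, just a plain-Python illustration of the principle: it assumes each node carries a `kind` and a `value`, treats two nodes as identical when both agree, and uses placeholder ids and values.

```python
def combine_identical(nodes, links):
    """Merge nodes with the same (kind, value) and re-point links to the survivor.

    nodes is a list of dicts with "id", "kind" and "value" keys;
    links is a list of (source_id, target_id) pairs.
    """
    canonical = {}  # (kind, value) -> id of the first node seen with that key
    remap = {}      # every original id -> id of the node that survives
    merged_nodes = []
    for node in nodes:
        key = (node["kind"], node["value"])
        if key not in canonical:
            canonical[key] = node["id"]
            merged_nodes.append(node)
        remap[node["id"]] = canonical[key]
    # A set drops links that become duplicates once their ends are merged.
    merged_links = sorted({(remap[source], remap[target]) for source, target in links})
    return merged_nodes, merged_links


# Placeholder data: the same address appears once under each of two claims.
sample_nodes = [
    {"id": "a1", "kind": "claimant", "value": "PERSON-1"},
    {"id": "a2", "kind": "address", "value": "ADDRESS-1"},
    {"id": "b1", "kind": "claimant", "value": "PERSON-2"},
    {"id": "b2", "kind": "address", "value": "ADDRESS-1"},
]
sample_links = [("a1", "a2"), ("b1", "b2")]

merged_nodes, merged_links = combine_identical(sample_nodes, sample_links)
```

After the merge, both claimants link to a single address node, which is exactly the kind of shared connection that stands out in the combined chart.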

Step 4: Escalate or accept

The next match is more suspicious.

A suspicious connection – why does Everett Page (witness) share an address with Walter Stewart (claimant)?

Our witness, Everett Page, shares an address with a man named Walter Stewart, who has previously made a claim relating to one of the vehicles involved in this incident.
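A connection like this is essentially a short path through the network. As a rough sketch, a breadth-first search over the links finds the shortest chain between two people. The link list below is a simplified, hand-built stand-in for the chart (the node labels other than the two names are placeholders), and `shortest_path` is a hypothetical helper rather than part of any toolkit.

```python
from collections import deque


def shortest_path(links, start, goal):
    """Breadth-first search for the shortest chain of links between two nodes."""
    adjacency = {}
    for a, b in links:
        # Links are treated as undirected, so record both directions.
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in sorted(adjacency.get(path[-1], ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # the two nodes are not connected


# Simplified, illustrative links based on the scenario described above.
scenario_links = [
    ("current claim", "vehicle 1"),
    ("current claim", "Everett Page"),
    ("Everett Page", "shared address"),
    ("Walter Stewart", "shared address"),
    ("Walter Stewart", "vehicle 1"),
    ("earlier claim", "vehicle 1"),
]

path = shortest_path(scenario_links, "Everett Page", "Walter Stewart")
```

Here the shortest route between the witness and the earlier claimant runs through the shared address, which is the link an investigator would want to question.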

At this point, an investigator would probably want to escalate this case for special investigation. We can use a context menu to help our analyst progress this further through the workflow:

Link analysis tools should be embedded directly into the fraud analyst’s workflow

Representing data as a network offers an engaging way for analysts to rapidly understand events. By incorporating KeyLines into an existing claims management workflow, we have made the process simple and intuitive.

Find fraud in your own data

This is a simple example, using synthesized data, of how KeyLines can make a complex and high-risk exercise simpler and more intuitive.

With the help of a number of KeyLines features, including layouts and combos, we have brought simplified data from multiple policies into a single chart and identified a potential instance of fraud.

If you want to try our toolkits on your own data, just request a trial account.

Download the white paper

To learn more about link analysis and fraud detection, download our updated white paper. It contains more in-depth examples and best practice advice.
