Our toolkits are for visualizing networks (or graphs) in data. Transactions fit the graph model perfectly: each account is a node, and each transaction is a link between two accounts.
Real transaction data is difficult to obtain and share, for obvious reasons. So instead we decided to take a look at Bitcoin – the peer-to-peer electronic currency system.
A common misconception about Bitcoin is that it is anonymous. Anonymity isn’t actually built into the Bitcoin model; it is a by-product of the complex public/private key system used to facilitate payments and deter fraud and theft.
With a lot of processing and investigation, individual Bitcoin accounts can be identified.
In a 2011 paper, the Clique Research Cluster (CRC) at University College Dublin demonstrated how public keys could be identified from private transaction keys – effectively allowing Bitcoin transactions to be tracked and mapped as a graph.
During the project, the Dublin team was able to identify certain key accounts – including the victim of a theft. We’ll use our technology to visualize it.
In the node file, each line is a user followed by their transaction keys:
17BptPvonJVA3pLDVjgzLEq7Aujgb1LjPS 1Mp3qWVVjBLCsJhmH65EjvAosViTF13aY8 1BorkLa6yrk1TRwqELhkzi4nCWm8BhXWzL 1AMNhMZC7hCyb8rMVda9E8bEf7FB1RpDAF 13EJ9b8qLH7TARcssSZnZVmyW864ar8J3i 1DnsBgY9KkWWp2xw9pL1Xv1QT145UR5TWp
The link file contains transaction value and timestamp data:
905914 20572 0.01 2011-06-23-19-10-01
905914 622803 220.07592886 2011-06-23-19-10-01
823336 118969 2.12 2011-05-16-01-58-01
823336 330686 0.56210609 2011-05-16-01-58-01
We simply parse this data into a JSON format our toolkits can understand.
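As a rough illustration of that parsing step, here is a minimal Python sketch. It assumes each link record holds four whitespace-separated fields in the order source user, target user, value in btc and timestamp, which is inferred from the sample above rather than from any format specification. The output shape (a dictionary of "nodes" and "links") and the function name parse_links are hypothetical and are not the actual KeyLines or ReGraph data format.

```python
import json
from datetime import datetime

def parse_links(text):
    """Turn whitespace-separated link records into a nodes/links dict.

    Assumed field order (inferred from the sample, not confirmed):
    source user, target user, value in btc, timestamp.
    """
    nodes, links = {}, []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        source, target, value, stamp = line.split()
        when = datetime.strptime(stamp, "%Y-%m-%d-%H-%M-%S")
        # Register each user once, keyed by its id.
        for user in (source, target):
            nodes.setdefault(user, {"id": user})
        links.append({
            "id": f"{source}-{target}-{len(links)}",
            "source": source,
            "target": target,
            "value": float(value),
            "time": when.isoformat(),
        })
    return {"nodes": list(nodes.values()), "links": links}

sample = """\
905914 20572 0.01 2011-06-23-19-10-01
905914 622803 220.07592886 2011-06-23-19-10-01
823336 118969 2.12 2011-05-16-01-58-01
823336 330686 0.56210609 2011-05-16-01-58-01
"""

graph = parse_links(sample)
print(json.dumps(graph["links"][0], indent=2))
```

Keeping nodes in a dictionary keyed by user id means a user that appears in many transactions is only emitted once.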
Once we’d loaded the data, we did a search for the transactions linked to a single account. At first the result is somewhat overwhelming:
The account we searched for is present here as a central node, with many thousands of inbound transactions placed around it. A yellow link indicates a transaction below 10k btc. Red is used for transactions above 10k btc.
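The colour rule described above could be sketched as follows. This is only an illustration: the function name link_colour and the link dictionaries are hypothetical, and because the text does not say how a transaction of exactly 10k btc is treated, the sketch arbitrarily colours it yellow.

```python
THRESHOLD_BTC = 10_000  # the 10k btc cut-off described in the text

def link_colour(value_btc, threshold=THRESHOLD_BTC):
    """Return 'red' for transactions above the threshold, else 'yellow'.

    Assumption: a value exactly equal to the threshold is coloured yellow.
    """
    return "red" if value_btc > threshold else "yellow"

# Hypothetical link values, invented for illustration only.
links = [{"value": 0.01}, {"value": 220.07592886}, {"value": 25_000.0}]
colours = [link_colour(link["value"]) for link in links]
print(colours)
```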
Immediately it stands out that there is a second prominent node with its own large orbit of inbound transactions. Our random naming method labeled this account as ‘Mr U’.
If we filter out all the transactions below 10k btc, a more discernible pattern emerges:
The central node is completely isolated – it did not participate in any 10k+ btc transactions during the data collection period. It also does not appear to pay money out – it only collects it in small increments. This could indicate that it is an online store or service, receiving payment in btc. Or it could indicate the beginnings of a money laundering operation.
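The filtering and the two observations about the central node can be expressed in a few lines. The sketch below is an assumption-laden illustration: the graph structure, the node names and every value in the toy data are invented, and the helper filter_high_value is hypothetical rather than part of any toolkit API.

```python
def filter_high_value(graph, threshold=10_000):
    """Keep links above the threshold and list nodes left with no links."""
    links = [l for l in graph["links"] if l["value"] > threshold]
    connected = {l["source"] for l in links} | {l["target"] for l in links}
    isolated = [n["id"] for n in graph["nodes"] if n["id"] not in connected]
    return links, isolated

# Toy data, invented for illustration; not taken from the real data set.
toy = {
    "nodes": [{"id": n} for n in ("central", "u1", "u2", "A", "D")],
    "links": [
        {"source": "u1", "target": "central", "value": 5.0},
        {"source": "u2", "target": "central", "value": 12.5},
        {"source": "A", "target": "D", "value": 25_000.0},
    ],
}

high_value, isolated = filter_high_value(toy)

# A node that only receives never appears as the source of a link.
pays_out = any(l["source"] == "central" for l in toy["links"])

print(isolated)
print(pays_out)
```

In this toy graph the central node drops out once the small transactions are removed and it never appears as a sender, which mirrors the pattern described above.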
Although Mr U still stands out as a key node in the graph, perhaps more interesting is the chain of very high-value transactions above him, emanating from Mr A.
This is actually a theft. On 13 June 2011, ‘Mr A’s’ slush fund was compromised and the payout address was changed to ‘Mr D’s’ account.
Here is the same network, with this chain isolated and expanded back to our suspected money launderer (just two hops from the ‘thief’).
Visualizing transactions in this way makes it easy to find anomalies and outliers in vast quantities of data. An investigator could also use these charts as a case management tool, adding notes and comments to nodes during the investigation.
This example demonstrates how transaction visualization using tools like KeyLines and ReGraph can rapidly identify interesting areas of activity for more detailed investigation, and improve understanding of events that would be harder to analyze with other techniques.