Where graph databases fit in your fraud detection plans

Graph databases are revolutionizing fraud detection in real-time transactions.

January 31, 2024 | 4 min read
George Demarest
Director of Product Marketing

Software-driven fraud detection has been a constant since the dawn of the computer age. But with the globalization of commerce, culture, and access to technology comes the growing proliferation and sophistication of bad actors and the tools that enable them. Therefore, the tools to detect and defeat fraud have had to keep pace with the evolving threat landscape.

To financial institutions, e-commerce companies, and service providers of all types, fraud is a top-line and bottom-line issue. A FICO survey shows that 1 in 6 customers will switch banks if they’re unhappy with how their bank handles scams.

PayPal’s bespoke graph solution

The progression of fraud detection from advanced analytics to rules engines, behavioral analytics, and machine learning is well captured in Gartner’s chart below. It was used by PayPal when they described their graph database journey in their “Graph on Aerospike” presentation at the Aerospike Summit. In their presentation, PayPal argued that the most common tools of fraud detection – rules engines, advanced analytics on discrete, static data – while sophisticated in and of themselves, are not predictive and don’t surface newer and more insidious types of fraud.

Figure 1: Gartner fraud prevention capability levels (annotated)

To move up and to the right in the diagram, PayPal needed to analyze behaviors, understand subtle relationships, and, in particular, use graph analytics to identify and mitigate fraud in rapidly evolving market conditions.

Using graph database technology to combat fraud

As one of the pioneers of online payments, PayPal has, by necessity, become a pioneer of online fraud detection and mitigation. Their embrace of graph technology for their fraud program predates our release of Aerospike Graph by several years. They chose Aerospike as the data store, with Apache TinkerPop as the graph compute engine and Gremlin as the query language. So, while Aerospike Graph is not based directly on the PayPal solution, one can say that the PayPal solution inspired Aerospike Graph.

PayPal’s graph journey is described in a blog post called “How PayPal leverages real-time graph capabilities in fraud detection” by Aerospike’s Subhashish Bose. In it, Bose (as we call him) describes the long history of PayPal’s use of Aerospike as a key-value store, starting in 2015. One of the key lessons from the PayPal example is the value of a temporal graph for understanding the context of activities and the behavior of fraudsters. Events such as new user registrations, logins, profile changes, purchases, and wallet transactions can signal high fraud risk depending on when, and in what sequence, they occur. Finding anomalies in the timing and order of these activities turned out to be a natural fit for graph database operations.
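To make the idea of timing and order more concrete, here is a minimal Python sketch. It is not PayPal’s implementation: the event names, the sample data, the pattern, and the time threshold are all assumptions chosen for illustration. It links each event to the next one in time and checks whether a given sequence happens within a short window.

```python
from datetime import datetime, timedelta

# Hypothetical event log for one account: (timestamp, event_type).
# Event names echo the activities mentioned in the text; the data is made up.
events = [
    (datetime(2024, 1, 31, 10, 0, 0), "registration"),
    (datetime(2024, 1, 31, 10, 0, 40), "login"),
    (datetime(2024, 1, 31, 10, 1, 10), "profile_change"),
    (datetime(2024, 1, 31, 10, 1, 50), "purchase"),
]


def temporal_edges(events):
    """Link each event to the next one in time, keeping the gap between them."""
    ordered = sorted(events)
    return [
        (a_type, b_type, b_time - a_time)
        for (a_time, a_type), (b_time, b_type) in zip(ordered, ordered[1:])
    ]


def is_suspicious(events, pattern, max_span):
    """Return True if `pattern` occurs as consecutive events within `max_span`.

    Both the pattern and the threshold are illustrative assumptions.
    """
    ordered = sorted(events)
    types = [event_type for _, event_type in ordered]
    n = len(pattern)
    for i in range(len(types) - n + 1):
        if types[i:i + n] == list(pattern):
            span = ordered[i + n - 1][0] - ordered[i][0]
            if span <= max_span:
                return True
    return False


pattern = ("registration", "login", "profile_change", "purchase")
print(temporal_edges(events))
print(is_suspicious(events, pattern, timedelta(minutes=5)))
```

In a graph database, the same pairs would typically be stored as edges between event vertices, so the check becomes a traversal rather than a scan of a list.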

But to make this type of fraud detection practical (i.e., real-time), PayPal needed to devise a graph query service that could retrieve query results involving multiple hops within a few milliseconds. The results of these queries are then used in AI models along with the other data features to check the likelihood of fraud.
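As a rough illustration of a multi-hop query feeding a model, the Python sketch below runs a bounded breadth-first search over a small in-memory graph and turns the result into numeric features. Everything here is hypothetical: the vertex names, the `known_fraud` set, and the feature names are assumptions, and a production system would run the traversal inside the graph database rather than in application code.

```python
from collections import deque

# Hypothetical undirected graph stored as an adjacency map. All names are made up.
graph = {
    "acct:alice": {"device:d1", "card:c1"},
    "device:d1": {"acct:alice", "acct:bob"},
    "card:c1": {"acct:alice"},
    "acct:bob": {"device:d1", "device:d2"},
    "device:d2": {"acct:bob", "acct:mallory"},
    "acct:mallory": {"device:d2"},
}
known_fraud = {"acct:mallory"}  # assumed labels from earlier investigations


def within_hops(graph, start, max_hops):
    """Breadth-first search returning {vertex: hop distance} up to max_hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if dist[vertex] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(vertex, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[vertex] + 1
                queue.append(neighbor)
    return dist


def graph_features(graph, start, max_hops, known_fraud):
    """Summarize the neighborhood of `start` as features for a downstream model."""
    reach = within_hops(graph, start, max_hops)
    accounts = [v for v in reach if v.startswith("acct:") and v != start]
    fraud_hits = [v for v in accounts if v in known_fraud]
    return {
        "linked_accounts": len(accounts),
        "linked_fraud_accounts": len(fraud_hits),
        "min_hops_to_fraud": min((reach[v] for v in fraud_hits), default=None),
    }


print(graph_features(graph, "acct:alice", 4, known_fraud))
```

The hop limit is what keeps latency predictable: each extra hop can multiply the number of vertices visited, which is why the text stresses millisecond response times for multi-hop queries.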

Bose has authored another relevant blog post called “Leveraging graph databases in real-time transactional fraud detection.” In it, he explains the benefit of using graph technology over traditional behavioral profiling-based approaches and looks into the different data components of a graph-based fraud detection system.

Figure 2: Visualized graph data model for a fraud scenario

In his blog, Bose states, “Graph technology can add context about the transaction using the concept of knowledge graphs — what else do we know for each of the data points?” Graph data supports understanding relationships by coordinating questions like:

  • Has the customer used the device (endpoint) previously?

  • Are there transactions from other customers on this endpoint?

  • Are there any linkages between the customers?

  • Are all the transactions in this network genuine?
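The four questions above can be sketched as small functions over a transaction history. This Python example is a simplified illustration, not the data model from Bose’s blog: the tuple layout, the customer and endpoint names, and the `flagged_txns` set are all assumptions.

```python
# Hypothetical transactions: (txn_id, customer, endpoint). All values are made up.
history = [
    ("t1", "alice", "dev-1"),
    ("t2", "bob", "dev-1"),
    ("t3", "bob", "dev-2"),
    ("t4", "carol", "dev-3"),
]
flagged_txns = {"t3"}  # assumed to come from an earlier review


def used_before(history, customer, endpoint):
    """Has the customer used this endpoint previously?"""
    return any(c == customer and e == endpoint for _, c, e in history)


def other_customers_on(history, customer, endpoint):
    """Which other customers have transacted on this endpoint?"""
    return {c for _, c, e in history if e == endpoint and c != customer}


def linked_customers(history, customer):
    """Customers connected to `customer` through any chain of shared endpoints."""
    seen, frontier = {customer}, [customer]
    while frontier:
        current = frontier.pop()
        endpoints = {e for _, c, e in history if c == current}
        for _, c, e in history:
            if e in endpoints and c not in seen:
                seen.add(c)
                frontier.append(c)
    return seen - {customer}


def network_is_clean(history, customer, flagged):
    """Are all transactions in the customer's linked network unflagged?"""
    network = linked_customers(history, customer) | {customer}
    return not any(t in flagged for t, c, _ in history if c in network)


print(used_before(history, "alice", "dev-1"))
print(other_customers_on(history, "alice", "dev-1"))
print(linked_customers(history, "alice"))
print(network_is_clean(history, "alice", flagged_txns))
```

Here alice looks fine on her own, but she shares an endpoint with bob, who has a flagged transaction. That kind of indirect link is what a profile of a single customer would miss.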

Keep fraudsters at bay

As the threat landscape evolves, so must our tools to identify and prevent this new generation of fraud. Any successful fraud prevention program must now consider adding a graph database such as Aerospike Graph to its data pipelines. Need drove a payment industry pioneer like PayPal to develop its own solution. Now, years later, you don’t have to.