Hands-on workshop: Real-time fraud detection with graph databases
Fraudsters are moving faster than ever, and traditional methods can’t keep up. But PayPal, Barclays, and Experian are staying ahead by using Aerospike for real-time fraud detection. Now, it’s your turn. In this hands-on workshop, you’ll use Aerospike Graph to build a real-time fraud detection pipeline that spots suspicious patterns as they happen. You’ll leave with the knowledge and a working model to apply graphs to your own high-scale fraud detection challenges.
In this on-demand demo, you’ll learn:
How legacy rule-based and graph systems fail in modern fraud scenarios
The architecture behind Aerospike Graph Service: Combining in-memory and persistent storage
A live demo: Uncover hidden fraud rings through multi-hop relationship queries
How to scale to billions of transactions without sacrificing speed or cost efficiency
Why it matters now
Fraud schemes span multiple accounts and transactions, making connections the key to detection. The ability to see connections across accounts, devices, and transactions is what separates detection from disruption. With Aerospike’s approach, you can:
Surface sophisticated fraud networks in real time
Maintain transaction throughput even with advanced detection logic
Reduce false negatives by leveraging relational insights
Speaker
Shekhar Suman (00:05):
Hey guys, I'm Shekhar and I'm a senior solutions architect at Aerospike, and I'll be talking to you today about graph databases and some of the things that we can do with them. So graph databases have been in the market for a while now. There have been multiple graph products that people have been using, and they've been pretty good at what they do, but we have still not seen a large number of use cases moving to graph, probably due to the fact that they are, well, expensive by nature in terms of compute or in terms of how they're designed. And moreover, there is a lack of understanding of how graph databases work. So today I have set up this small demo to give you an example, or a touch, of what graph databases might look like and what value they have.
(00:48):
So as you can see on your screen, I have a small little demo opened up here which says fraud detection. Now fraud is something which has been, well, researched pretty heavily in the market in the last few years. As there have been more and more different ways of defrauding people, people have been building various fraud detection mechanisms. At the moment, in all of these financial systems and financial services, there are fairly advanced rule-based systems. There are some machine learning models, and there are some complex mechanisms with fraud detection built in, but mostly they are triggered by some event. For example, if there is a fund transfer that happens from outside the country, someone might call you to validate whether that transaction is actually happening and whether you are the one doing it. Or if there is a fund transfer happening from a certain location which is known for a lot of fraud, you might also get a fraud validation kind of call.
(01:58):
But all of these mechanisms are very difficult to scale, and that's what we have been seeing in the last couple of years. And with the advent of AI, with the way we can generate images which are so real, fraud has become even more prevalent, and we have to develop more robust mechanisms for finding fraud and stopping it. Now, it's pretty difficult to do that if you look at it from the top. I'll tell you a little story of something that happened. A few days back, my father got a call from a person who was actually wearing a policeman's uniform. And I believe lots of you here in India might have seen that call. He was told by that person that his name had come up in some issue in New Delhi and that he had to travel to New Delhi to solve that case.
(02:55):
And if he wants to avoid that, he'll have to pay some money. Now, this is the general premise of how it usually works. There are usually a lot more people than just one person trying to drive a fraud. There are whole fraud rings and organizations running to make it work, but there's always one thing which is common in all of these different mechanisms of fraud. They could be asking you for money for anything. It could be due to some event, it could be that they are appealing to your humanity, or whatever. So it could be anything. But at the base of things, there are always relationships between data. So there is one person who actually runs the fraud, and there are many such people who keep on calling people. They collect information about you so that they can make the story better and better.
(03:46):
They can send you AI-generated images, they can send you AI-generated audio, and so on and so forth. And then all of these people send the collected money back to a nodal account or accounts. Now, not necessarily; sometimes there could be even more complex mechanisms to transfer the funds around, but if you look deeper into the data, you'll always find some relationship between such people, and that's what we will be trying to do with graph databases today. So in front of you, as you can see, is a small application that I've built, which could be seen as an application that a bank employee or a fraud analyst would use to validate if there's possible fraud in a certain transaction. So I have taken the liberty of uploading lots of users into this, and some of these users have been marked as fraud.
(04:40):
Okay, just a second, let me pull up the data. Okay, let's say for example, I look up this person. This is a user called and she is a fairly straightforward user. She's 24 years old and she has a savings account with some money in it. And as of now, this demo does not have any transactions, so you'll not... oh, let me clear all these transactions first. Yeah. Alright, so if I open this user again, I should not see any transactions. Here we go. So there are no transactions as such, and she has some devices with which she logs in: a couple of mobile phones, a tablet, a desktop, and so on. Okay? Similarly, another user that I have picked up is... where is this guy? Mr. Atar. And if you look at this guy, he's also a very regular user.
(05:47):
It's a new account, very regular user. He has a couple of accounts, no transactions as such, and a few devices with which he has logged in. There's one more user I would like to show you. This is Mr. Weda, and this person is, say, the head of the fraud ring. Okay, let's take an example of him being the head of the fraud ring. Now he will be collecting funds from different users, utilizing them in whatever way he wants, and then rotating the funds around to clean them. Now, again, this account also looks fine, but if you look deeper, there is a certain device with which he has logged in to this account, which we have previously marked as fraud. Now, this is how we identify fraud, and what graph databases allow us to do is find these deep relationships between data and do that at runtime.
(06:38):
And there's no advantage in doing this if we only find it at the end of the day. It has to be done at runtime. And for this to happen at runtime, the query response time should be very good so that you do not block actual normal transactions. So your graph database should be able to handle a large volume of data, because you'll be keeping a lot of transactions in this database so that you can keep on looking for those relationships, and it should be fast enough that it does not cause a tangible difference in the whole transaction processing time. Alright, so let's take an example and let's pick up some of the users that we had talked about. There's another user called Yash that I picked up. He's one of the members of this fraud ring. So if you look at Yash's account, you'll see he has multiple accounts, multiple transactions, and he has no devices which were previously marked as fraud.
(07:30):
So Yash calls up Al and tells a story to Al and asks her to pay him $5,000. And for some reason Al says, yeah, let me do that. So I have also put up this manual transaction section just to showcase what a transaction would look like. So let's say Al, whose account number is this, sends some money to Yash, whose account number is this. Let's say she was duped out of 5,000 bucks. Okay? This transaction is created and the fraud detection query was called. And as you can see, this transaction has been marked as clean as of now. If you go into the details, this is the source account, this is the destination account, and the fraud analysis says that there is no fraud; no trigger has been raised as of now, which is fair enough. Now, this person Yash keeps on doing his thing and keeps on trying to defraud multiple people during the day.
(08:28):
And at the end of the day, he collects all the funds that he has gathered during the day in this certain account and transfers them to, let's say, another guy called that I showed you had a flagged account. So what I will do is another transaction now, from Yash to Weda, okay? The transaction is, let's say, of a hundred thousand dollars, okay? Or let's say $10,000. Okay? So the transaction is done and you will see that this transaction is flagged. Why is this flagged? If you look at the fraud analysis now, you'll see a risk score associated with it, because there is a device linked to the receiver which has been previously marked as fraud. Now, this will trigger an event, and at this moment a fraud analyst from our bank or any financial institution might call this person and say that this looks like a fraudulent transaction.
(09:24):
Would you please confirm that this transaction is valid? Okay? Now that kind of rule depends upon the risk scoring mechanism. Right now, the risk scoring is fairly straightforward. I've just counted the number of accounts which are flagged and accordingly come up with this risk score. Now we can create multiple mechanisms to make this risk score better. Now, this transaction was fairly straightforward to identify because this person has a device which was previously marked as fraud. Now, coming back the next day, the same thing happens, and Mr. Yash dupes another guy, Mr. Aba, for let's say a thousand dollars. So Aba also sends this amount of money to Yash, a thousand dollars. Now let's understand this first. Yesterday Al was sending this $5,000 to Yash and that transaction went through fine because there was nothing wrong with Yash's account. But now Yash's account has a transaction to Weda, and this Weda guy has a device which has been marked as fraud.
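The simple risk scoring described above, counting the flagged accounts involved in a transaction, could be sketched roughly as follows. This is only an illustration in plain Python, not the demo's actual code: the account and device names, the in-memory dictionaries, and the 50-points-per-account weight are all assumptions made for the example.

```python
from typing import Dict, Iterable, Set

# Hypothetical weight: points added to the score per flagged account.
POINTS_PER_FLAGGED_ACCOUNT = 50


def flagged_accounts(
    accounts: Iterable[str],
    devices_by_account: Dict[str, Set[str]],
    flagged_devices: Set[str],
) -> Set[str]:
    """Return the accounts that use at least one device marked as fraud."""
    return {
        account
        for account in accounts
        if devices_by_account.get(account, set()) & flagged_devices
    }


def risk_score(
    accounts: Iterable[str],
    devices_by_account: Dict[str, Set[str]],
    flagged_devices: Set[str],
) -> int:
    """Score a transaction by counting flagged accounts, capped at 100."""
    count = len(flagged_accounts(accounts, devices_by_account, flagged_devices))
    return min(100, count * POINTS_PER_FLAGGED_ACCOUNT)


# Made-up data loosely mirroring the demo: only the ring head has a flagged device.
devices_by_account = {
    "acct_victim": {"phone_1", "tablet_1"},
    "acct_caller": {"phone_2"},
    "acct_ring_head": {"phone_3", "laptop_9"},
}
flagged_devices = {"laptop_9"}

clean = risk_score(["acct_victim", "acct_caller"], devices_by_account, flagged_devices)
flagged = risk_score(["acct_caller", "acct_ring_head"], devices_by_account, flagged_devices)
print(clean, flagged)
```

A better scoring mechanism would replace the single weight with more signals, but the shape stays the same: gather the connected entities, then turn what is flagged into a number.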
(10:25):
So if I do this transaction now, we will do a two-hop query, and we will first see if the sender or the receiver has any transaction with an account that has a device which is marked as fraud. So we are now looking deeper into relationships within the data, and that has to happen in real time, in milliseconds, so that we do not actually impact the transaction velocity. So if I go to the transactions now, you'll be able to see that this transaction has also been marked as fraud. If I open this transaction now, you'll see that it has been marked as fraud because there is a device connected to it which is marked as fraud. Was this the transaction? Yeah, this was the one. So because, on the graph, if we go deeper, we have found that there is an account which has a flagged device, this transaction is also marked as fraud.
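To make the two-hop idea concrete, here is a minimal sketch of the same check on a tiny in-memory graph in plain Python. It is not the query used in the demo and does not use Aerospike Graph; the `FraudGraph` class, its methods, and the account and device names are all hypothetical, chosen only to replay the three transactions from the walkthrough.

```python
from collections import defaultdict
from typing import Dict, Set


class FraudGraph:
    """Toy graph: accounts linked to devices and to transaction counterparties."""

    def __init__(self) -> None:
        self.devices: Dict[str, Set[str]] = defaultdict(set)
        self.counterparties: Dict[str, Set[str]] = defaultdict(set)
        self.flagged_devices: Set[str] = set()

    def add_device(self, account: str, device: str, flagged: bool = False) -> None:
        self.devices[account].add(device)
        if flagged:
            self.flagged_devices.add(device)

    def add_transaction(self, sender: str, receiver: str) -> None:
        # Record the edge in both directions so either party can be traversed.
        self.counterparties[sender].add(receiver)
        self.counterparties[receiver].add(sender)

    def has_flagged_device(self, account: str) -> bool:
        return bool(self.devices.get(account, set()) & self.flagged_devices)

    def check(self, sender: str, receiver: str) -> Dict[str, str]:
        """Map each flagged account in reach to 'direct' or 'indirect'."""
        hits: Dict[str, str] = {}
        parties = (sender, receiver)
        # Direct check: does either party itself use a flagged device?
        for party in parties:
            if self.has_flagged_device(party):
                hits[party] = "direct"
        # Deeper check: has either party transacted with an account that does?
        for party in parties:
            for other in self.counterparties.get(party, set()):
                if other not in hits and self.has_flagged_device(other):
                    hits[other] = "indirect"
        return hits


graph = FraudGraph()
graph.add_device("victim_1", "phone_a")
graph.add_device("victim_2", "phone_c")
graph.add_device("caller", "phone_b")
graph.add_device("ring_head", "laptop_x", flagged=True)

day1 = graph.check("victim_1", "caller")    # nothing flagged in reach yet
graph.add_transaction("victim_1", "caller")
evening = graph.check("caller", "ring_head")  # receiver has a flagged device
graph.add_transaction("caller", "ring_head")
day2 = graph.check("victim_2", "caller")    # flagged device found one hop further out
print(day1, evening, day2)
```

In a real deployment this traversal would run as a graph query against the database rather than over Python dictionaries, and the latency requirement mentioned above applies to that query.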
(11:20):
Now, this is the power of graph databases: they allow us to look deeper into our data and identify relationships between data, which might be very important to see while a fraudulent transaction is going through. Now, the most important thing to take away from here is that, again, fraud detection systems and graph databases have both been in the industry for a while now, and there are pretty comprehensive rule-based systems out there to do fraud detection. But the first thing to note here is that the volume of fraud is staggeringly high now, and the volume of online transactions is even higher. And with the tools that are coming up, like AI tools to generate videos, audio, images, and so on, it has become more and more difficult for people to identify fraud by themselves. So we should have even more robust systems which allow us to detect fraud.
(12:18):
The more important thing to understand is that, to do this, you have to run a graph query on the data, and there could be billions of transactions happening. Specifically, when we are talking about the scale of India, there is a large volume of data that you have to keep, and you have to support these queries at sub-millisecond latencies, or at least a few milliseconds. Now, that has been very difficult to do previously because most graph-based systems were in-memory systems. So all the data stays in memory, and it becomes prohibitively expensive when you load that volume of data into an in-memory graph system. But Aerospike HMA, which you all know and love, allows you to keep large volumes of data on disk along with the keys, or the graph, that sit in memory. How it works is that there are two components running as of now.
(13:08):
There is one Aerospike database which runs in the background, which is there to store the data, and then on top of it runs AGS, or Aerospike Graph Service, which allows us to run these queries. If you want to see it, I'll share a repository with you where you'll be able to find all of these queries and refer to them. But the whole idea is that now, with Aerospike Graph, you can handle very large volumes of data at very high throughput with very, very low latencies, and that's what enables you to build complex systems like this one. Thank you.