Leveraging graph databases for real-time fraud detection

George Demarest:

So thanks everyone for joining. This is the Aerospike webinar, Leveraging Graph Databases for Real-Time Fraud Detection. And I have with me three experts from Aerospike. Stuart Tarmy is the head of fintech solutions at Aerospike. He has written extensively and talks a lot about fraud detection, especially in real-time systems. We have Ishaan Biswas, who is the director of product management at Aerospike, responsible for Aerospike Graph, among other things. And we have Subhashish Bose, we call him Bose, and he is head of financial services industry vertical at Aerospike. So got a lot of good experience related to the topic. So let us begin by talking about what's on the agenda for today.

So first, Stuart is going to talk about the state of fraud and fraud prevention. It is a very high tech business these days, and Stuart has pretty much seen it all, so he will describe the landscape. Then Bose has actually dug a bit deeper into the technology, and he's going to talk about applying graph data model, graph technology to fraud in some specific ways. And then we're going to drill down even deeper and talk about what a graph database is and what are the requirements for a graph database for real-time fraud prevention. And finally, we'll give you a brief introduction to Aerospike Graph and then we'll finish.

So with that, I want to introduce Stuart. Stuart Tarmy, as I said, head of fintech solutions at Aerospike. You can see some of his blogs. He writes extensively about financial services topics, especially fraud and especially real-time systems, payment systems, and risk, and all of those kind of cool things. Thank you for joining, Stuart. Can you please give us a high level overview of what's going on with online fraud?

Stuart Tarmy:

Sure. And George, thank you for that introduction. So it's great to be here. So let me share some thoughts at the high level for fraud of some of the trends and it'll be a setup for the later speakers.

So at a high, high level, there's two main things going on with fraud. The first is, are you who you say you are? And George, if you could hit the button to make things go down. Keep going, keep going. That's good. So the first thing is, again, are you who you say you are? So there's things in this category like identity fraud, synthetic identity. Synthetic identity is when people are actually making up fake identities. They're putting in fake Social Security numbers and trying to make fake driver's license, things like that. There's something called deep fakes, a little more sophisticated, where people are using Adobe Photoshop and AI type tools to embed someone else's photograph into someone else's driver's license. So a lot of fraud here on are you who you say you are, and fraud systems have to figure that out.

The next part of this is even if you are who you say you are, you can still do bad things. And some examples here are theft. You can do all sorts of purchasing of things, knowing you may go into default or bankruptcy. These are all very, very bad things for the financial institutions and merchants who were in the payment ecosystem. So these are two of the major things that you're trying to prevent with fraud.

So let's go to the next slide. Real quickly, fraud has a lot of cost to a business. So George, if you can just go down upon these things. Keep going. So there's losses, there's chargebacks, which is when something's returned, the merchant has to return things. There's interchange fees. That's the three percent that gets paid to the MasterCard, Visas, Discover cards, investigation costs, reconciliation, false positives, false negatives. A false positive, for one who may not know, is when your credit card is declined and it shouldn't have been, and you go ahead and pull out someone else's credit card. It's a very bad thing for the bank. So there's a lot of costs to the business.

But one of the things I like to share with people is that if you can get a better handle around your fraud risks, you can use this to drive your business. And George, if you can go down here, better analyzing fraud risk profiles at the customer level, it will allow you to better personalize things. And this will allow you to grow revenue, increase profitability, because you're now marketing to a customer of one.

So an easy example, obvious example is if I know that somebody's fraud propensity is very low, I'm willing to offer them better terms and interest rates and fees on a credit card to win their business. So even though we're trying to prevent fraud, a lot of those analytics can be used on your marketing side. Let's go to the next slide.

You can reduce churn and differentiate your business. So go ahead and go forward, George. Okay, so just a quick thing here. So fraud is increasing. It's expected, it's more than tripled since 2011. And you can see in charts, it's going to grow. It's interesting that fraud is growing because the smart card, the chip on people's credit cards, has almost eliminated fraud on credit cards when it's present. And what's happened in the industry is the fraud has moved online and over telephone. So even though the chip has reduced fraud quite a bit, the fraud has migrated.

The right-hand slide here just is an interesting slide. What it shows is that credit card fraud, you can see in the orange, has been the most dominant type of fraud out there, but you can see during COVID it skyrocketed, when people weren't going out, and this went into food stamp and benefits fraud. So it's fraud in a lot of different ways and in large numbers. Next slide.

So three key requirements for a best of breed AI system to combat fraud. So the first thing you want to be able to do is develop the best AI algorithms you can, and to do that you need to get some of the best programmers. I've listed on the left-hand side some very common, I think it's about a dozen AI algorithms. In reality, I think DataRobot has over a hundred. Firms are using these. And the most sophisticated companies, like a PayPal as an example, are using neural nets and deep learning to get very sophisticated and really reduce their fraud.

The other thing you need is to process large volumes of data. You want to be able to do it in terabytes, petabytes, sometimes even into the exabytes. What people say is the more data you have, the better the systems can be. That always bothers me because there are diminishing returns at a certain point, but you want to be able to process large volumes of data.

And the last thing is extremely important for a couple reasons, which is real-time processing in milliseconds type performance. And what you're trying to do here is not only do it fast to give a good customer experience, but the trick is if you have a system like an Aerospike that can do it extremely, extremely fast, you can use the most sophisticated algorithms. Unless you can use that huge amount of data very fast, you will never get to neural nets. And that's why you need to have a system like Aerospike to bring this all together. So let's see, next slide.

So just some quick use cases for Aerospike. We're a very dominant player for fraud and AI type systems for fraud. We do work for Barclays, where they have been able to take all their fraud systems that exist in different silos. It's their banking system, their mortgage system, their car system, auto loans, and consolidate it into a single system and share the fraud metrics using Aerospike in real time. You can see in the right-hand side Early Warning. So Early Warning is the owner of Zelle, which is the big payment system. It's very large. It processes roughly the same amount of transactions as PayPal. And there we process their fraud. We beat out some competitors, you can see there [inaudible 00:08:34], so they could have the scale they need and help them meet their SLAs 99.99% of the time to process their algorithms.

On the bottom left real quick is LexisNexis. They work to help companies look at digital identities. The idea here is that if I had a website that sold sweaters, let's say, and somebody came to my website and I want to know if they're a fraudster, what I could do is contract with LexisNexis, ThreatMetrix, put a little code on my site. Someone comes to my site, it'll send an API request back to ThreatMetrix and ask, is there identity fraud going on? And you can see here, with Aerospike they power about 40,000 websites at 130 million transactions daily and do it in real-time.

And the last one is a little less fraud, but it's payments and it shows our scale. We power the TIPS, or TARGET Instant Payment System, for the European Central Bank, which is the interbank real-time payment system across the EU. And you can see just here the huge volume of tens of millions of transactions daily 24/7/365. The key requirement here using Aerospike was to not only do it in real-time, but an even bigger requirement, quite frankly, was just 24/7, never go down. And that's what Aerospike provides to them. So just a little snapshot there. So I think George, I think that's the last slide. And back to you.

George Demarest:

Thanks, Stuart. Yeah, this slide always, it gets across the dimension of scale and throughput and low latency that is a hallmark of Aerospike. We've kind of been a closely guarded secret for some of the toughest workloads out there. So fraud being of course a huge compute challenge, and getting the most out of your infrastructure is really both a top line and a bottom line challenge.

So to dig a little bit deeper on the technology, I mentioned Bose has been delving into this and wrote two recent very good blogs on the subject, first on leveraging graph databases in fraud detection and real-time fraud detection. And the second is how one of Aerospike's customers, PayPal, has deployed graph technology using Aerospike in their fraud detection capabilities.

Bose, welcome. Thank you for joining. This first diagram I've included is a little bit dated, but it gets across the point that there's fraud and then there's fraud. There's some you can do with just good old rules, engines, and static analytics, but when you get into the tougher and more sophisticated types of fraud, you really need better tools.

Subhashish Bose:

Right, George, and I would like to begin by saying that I've spent a lot of time in fraud detection, fraud prevention areas, and definitely graph, it's not a new concept, it's been around for about 10 or 12 years. And the very first time, I remember way back in 2012 or 2013 when we started looking at graph to better detect identity theft, in those days we didn't have graph databases and definitely not real-time graph database capabilities. And we used to leverage the old school relational databases, extract data, and from there populate all these visualizations that would give a clue about what's going on.

And this chart came in, I think as I mentioned, George, in about 2018. And this is a good chart in the sense that this has been with most of the fraud strategists around the world as their guiding point. And the way to look at this is that as you progress up this ladder, all these steps, as you move to the next step, you're not necessarily doing away with the previous step. So that's something you need to keep. It's an important part of that. And you need to constantly improve upon that.

The second thing, you'd notice that as you go up in this chain and as you go towards that continuous risk assessment and continuous improvement, if you will, you have moved already from a very discrete way of analyzing your data in different silos to a more connected way of analyzing that data. And if you look at the graph, entity relation graph, which is right at the middle of this chart, so graph is the capability that enables this continuous connected data analysis. So that's very critical to note here. Of course these days graph is being used very widely in a real-time fashion. But as I was mentioning, I mean, over the years this has been possible due to advancements in technology. And we'll delve a bit deeper into that, kind of the area towards the end of the talk today. Over to the next slide, George, if you will.

All right, so before I go into some specific fraud scenarios, I would like to quickly touch upon what are these broad applications of graph in the area of fraud and compliance. And notice that I've added compliance here, even though the webinar is strictly about fraud. I mean, the reason is obvious, fraud is the crime while money laundering is the means of hiding and making that money look clean. So from that perspective, they are very related.

So the way I like to think about this is by looking at this in a two-by-two matrix, fraud prevention and compliance, these two functions in the horizontal, and the customer lifecycle, from the perspective of acquiring a new customer to ongoing monitoring, in the vertical axis. So the first block on the top left, which is that of application fraud, I think this is one area where graph technology is almost a mandatory requirement in a sense. The reason being a lot of these types, and we touched upon these earlier: ID theft, which is basically someone actually stealing your genuine, legitimate identity and doing a third party fraud with it, while a first party fraud is where a genuine person is making a request for a loan with the intention of not paying.

And this particular one is very hard to detect because obviously it's a genuine identity and often it ends up in the bad debt of the organization. There are various estimates that as much as 35% of the bad debt could actually be prevented if you had stronger first party fraud controls.

And lastly, synthetic identity fraud, which sits somewhere in the middle, is where someone's actual identity is actually used, the actual identity, the Social Security number or a national identity. But everything else, the credentials, the address, the names, et cetera, all of those are synthetically made up. So that presents, and often AI is being used in this area to make up this kind of credentials, a credential stuffing, if you will. And that is then used together with the true identity to request for fraud.

But all of these situations, what really happens is with graph, you're able to find these linkages to any known past offenders or a particular identity, a particular phone number, a particular email address, which has been also used in the past for other loans. Maybe at that time you didn't have the capability to detect they were fraud, but over time they resulted in a bad debt. So therefore by association, a lot of these times graph basically is association and that is the first sign of guilt in a way.
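To make the "guilt by association" idea concrete, here is a minimal Python sketch. It is an illustration only and was not shown in the webinar; the application IDs, the identifier strings, and the function name linked_to_known_bad are all hypothetical, and a real system would run this as a graph query rather than in memory.

```python
from collections import defaultdict

def linked_to_known_bad(applications, known_bad_ids):
    """Return {application_id: shared identifiers} for applications that
    reuse a phone or email already seen on a known-bad application."""
    # Index every identifier to the applications that used it.
    by_identifier = defaultdict(set)
    for app_id, identifiers in applications.items():
        for ident in identifiers:
            by_identifier[ident].add(app_id)

    hits = defaultdict(set)
    for ident, app_ids in by_identifier.items():
        if app_ids & known_bad_ids:                # identifier touched a bad application
            for app_id in app_ids - known_bad_ids:
                hits[app_id].add(ident)            # record the link for review
    return dict(hits)

# Hypothetical data: app-1 ended up as bad debt, app-2 reuses its phone number.
applications = {
    "app-1": {"phone:555-0100", "email:a@example.com"},
    "app-2": {"phone:555-0100", "email:b@example.com"},
    "app-3": {"phone:555-0199", "email:c@example.com"},
}
flagged = linked_to_known_bad(applications, {"app-1"})
print(flagged)  # {'app-2': {'phone:555-0100'}}
```

A shared identifier is only a signal for further review, which matches the point above that association is the first sign, not proof.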

So that is what is used to detect an identity fraud. Similarly, on the transaction side, once the customer is on board, of course the account take over, but scams, right? So let me pause there for a minute. So scams are a very big problem right now and definitely post the pandemic world. We have seen around the world scams have grown almost double to triple of the volumes they were pre-pandemic levels, and of course COVID triggered a lot of that in the sense that all these government benefits which were then exploited by the fraudsters.

And one specific problem about scams is that a majority of them are what we call authorized push payment frauds, which means that the customer is actually authorizing that payment, authenticating himself, genuinely believing that he's helping someone who he's just met online on a dating kind of forum or making a genuine investment. But in reality it is not. So from that perspective, it's very, very hard, and traditional models would fail here. The reason being that as soon as you call the customer to confirm if it's fraud, he would deny, he would say that yes, it's a genuine transaction. So you need to handle it differently. And for that you need to be able to predict that there's a high likelihood that this transaction could be a scam.

And then lastly, internal fraud. So internal fraud is, I mean, we all know it's employee fraud, and ACFE, the Association of Certified Fraud Examiners, they have estimated that as much as five percent of global revenues of large corporations is lost to internal fraud. And this particular area definitely, I mean if you think about it, often employees collude. So employees would collude with an external fraudster to kind of siphon off funds from the banks or any organization's accounts. Or you take another example, a loan underwriter from a bank, he approves the fraudulent loan request from a friend or his spouse with the intention of defrauding this money. So there's a lot of linkages that you need to uncover. Are there any common households, addresses, email addresses, even IP addresses, devices, et cetera that are being used in these transactions?

If you go to the bottom right, sorry, the bottom left, the first box over there, the know your customer, coming in the compliance area, and AML, and you'll often hear this, AML, KYC and graph is like a marriage made in heaven. So if you look at know your customer, so for example, name screening, right here what you're trying to do is to find any links with any kind of known politically exposed persons, or for that matter, any sanctions lists and so on and so forth. And often you need to connect various data, not just the single name, but various entities. And then find if there are any linkages.

Similarly, a shell company investigation or online detection kind of a capability would require you to go through tons of data and find out whether these are fake companies. So I don't know if you remember Mossack Fonseca and the Panama papers, that particular data, and if you remember, 200,000 odd entities were involved in that leak, and ICIJ, they actually used graph technology to scan through all of that data and they were able to find these tax evasions, wrongdoings on the part of rich global elite worldwide.

And then finally the ultimate beneficial ownership, which is basically who is the actual owner of an organization. And this becomes tricky, as there are multiple layers. For example, if you have a single set of layer [inaudible 00:21:27] directors for a particular company it's pretty easy, but then what happens is that they would be structuring it in a way so that a vast majority of the ownership is with another company, and that company's owned by another company and so on and so forth. So that becomes very tricky. And with graph, this particular problem of finding out who the UBO is, it's a single graph query. So that's a very effective way of determining and a very fast way of determining who the ultimate beneficial owner is.
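As a rough illustration of why layered ownership is naturally a graph traversal, the following Python sketch walks a chain of holding companies and multiplies the ownership fractions along each path. It is not from the webinar: the company names, the fractions, the 25% reporting threshold, and the function name ultimate_owners are assumptions made for the example.

```python
def ultimate_owners(ownership, company, threshold=0.25):
    """ownership maps an owned entity to {owner: fraction held}.
    Owners that never appear as keys are treated as natural persons."""
    totals = {}

    def walk(entity, fraction, path):
        for owner, share in ownership.get(entity, {}).items():
            if owner in path:              # guard against circular ownership
                continue
            effective = fraction * share   # stake carried through this layer
            if owner in ownership:         # another company: keep climbing
                walk(owner, effective, path | {owner})
            else:                          # a person: accumulate the stake
                totals[owner] = totals.get(owner, 0.0) + effective

    walk(company, 1.0, {company})
    return {person: f for person, f in totals.items() if f >= threshold}

# Hypothetical three-layer structure.
ownership = {
    "TargetCo": {"HoldCo A": 0.8, "Alice": 0.2},
    "HoldCo A": {"HoldCo B": 0.5, "Bob": 0.5},
    "HoldCo B": {"Carol": 1.0},
}
owners = ultimate_owners(ownership, "TargetCo")
print(sorted(owners))  # ['Bob', 'Carol']
```

Here Carol never appears on TargetCo's own register, yet she holds an effective 40% through two holding companies, which is the kind of link a multi-hop query surfaces.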

And then lastly, in the anti-money laundering space, when it comes to ongoing monitoring, transaction monitoring, one problem is that of money mules, finding all these accounts which have been set up with the intention of layering, with the intention of hiding the money trail. So you often look for ways in which money is moving into money mules and coming out of it quickly. And there are multiple such mules typically in a crime syndicate, smurfing or structuring, which involves breaking up a transaction into multiple smaller ones to not get caught in those alert thresholds that require a suspicious activity report filing, and then round tripping, which is basically moving the money through various entities, accounts, businesses, back to your own accounts.
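Round tripping, as described above, amounts to finding a cycle in the transfer graph that starts and ends at the same account. The sketch below shows one simple way to look for such cycles in Python; the account names, the hop limit, and the function name round_trips are hypothetical, and amounts and timestamps are left out to keep it short.

```python
def round_trips(transfers, origin, max_hops=4):
    """Return account paths that leave origin and come back to it
    within max_hops transfers."""
    graph = {}
    for src, dst in transfers:
        graph.setdefault(src, []).append(dst)

    found = []

    def dfs(node, path):
        for nxt in graph.get(node, []):
            if nxt == origin and len(path) >= 2:
                found.append(path + [origin])      # money is back where it started
            elif nxt not in path and len(path) < max_hops:
                dfs(nxt, path + [nxt])             # keep following the money

    dfs(origin, [origin])
    return found

# Hypothetical transfers: funds leave acct-A, pass two shells, and return.
transfers = [
    ("acct-A", "shell-1"),
    ("shell-1", "shell-2"),
    ("shell-2", "acct-A"),
    ("acct-A", "merchant-9"),
]
cycles = round_trips(transfers, "acct-A")
print(cycles)  # [['acct-A', 'shell-1', 'shell-2', 'acct-A']]
```

The hop limit keeps the search bounded, which matters when the same check has to run inside a latency budget.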

So these are some of the ways in which graph can help. And a core feature of that, I would like to emphasize a bit on that, is the syndicate monitoring. So this is, if you will, a typical level three kind of a defense, where various functions are collating the data and that is being used to monitor continuously things like fraud rings and things like criminal syndicates. And when you monitor this, and later on we'll see a live example of how it might look like, you then are building on more intelligence, you are then flagging a particular syndicate, a particular fraud ring, and then trying to eliminate that from the next possible attempt on your organization. So next slide, George.

All right, so let's take a particular example and focus a bit deeper into the fraud detection, the modus operandi, if you will. So in this particular case, this is a sample taken and what you're seeing over here is that one particular day a guest checkout is happening on a customer, Bob Mule, his PayPal account. And if you see on the right, so what you're seeing is that these are linkages of the usual behavior of the customer in terms of what are the merchants where this person is shopping or what are those merchant category codes? Because often, we're all creatures of habit. So in the sense that we typically want to do certain behaviors, over time it's very easy to glean that behavior out from the data.

So in this particular case, what we are seeing is that he lives in a particular address, purchases items from two well-known online electronic shops which are belonging to the same merchant category code. On one particular day, there is a checkout using this particular person's PayPal account and through an unverified device, but using the same PayPal handle. So by itself this may not be such a risky event. But then if we dig deeper, and this is where graph comes in, we are seeing that this particular account, this particular device, has been used in the past to make a purchase from a merchant which has been identified as fraudulent and blacklisted. So those things, when you start putting them together, then you realize that this particular transaction should be stopped.

So that is where broadly graph helps. So what you're trying to do is you're trying to discover links between customers and various entities. These entities could be IP addresses, device IDs, phone numbers, emails, addresses, merchants, any other entities present in the network. And then you're trying to address broadly three kinds of questions. First of all, is a relationship already present? If a relationship is present, that means it's good, it's regular behavior. If a relationship does not exist, are there relationships with other customers? For example, is this merchant a popular merchant which has been used by other customers? And if yes, that again means that yes, there's a little bit less of a risk. And then lastly, are all these transactions genuine in the network that we have in the first, second, third degrees of connections? Are we seeing any red flags?

So this basically defines at a broad level how graph data models look like. And at any point of time, if you identify these questions with a no, you'll see that the risk generally increases while a yes typically means the risk is decreasing. So it's helping both in the fraud detection as well as lowering of the false positives, which is also very, very critical. And then of course you also use these graphs from an investigative perspective to detect fraud rings and being able to identify these blacklisted entities a bit proactively so that you can stop these frauds in real-time as and when they happen. Next slide, George.
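The three questions above can be sketched as simple checks over an in-memory graph. This Python example is only an illustration under stated assumptions: the node naming scheme ("customer:", "merchant:", "device:"), the sample history, and the function names are hypothetical, and it does not represent how PayPal or Aerospike Graph implement the checks.

```python
from collections import deque

def build_graph(edges):
    """Undirected adjacency sets built from (entity, entity) links."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def red_flags_within(graph, start, blacklist, max_hops):
    """Breadth-first search for blacklisted entities within max_hops of start."""
    seen, queue, flags = {start}, deque([(start, 0)]), set()
    while queue:
        node, hops = queue.popleft()
        if node in blacklist:
            flags.add(node)
        if hops == max_hops:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return flags

def assess(graph, blacklist, customer, merchant, device, max_hops=3):
    neighbors = graph.get(merchant, set())
    return {
        # Q1: has this customer dealt with this merchant before?
        "known_relationship": customer in neighbors,
        # Q2: do other customers use this merchant?
        "merchant_used_by_others": any(
            n.startswith("customer:") and n != customer for n in neighbors
        ),
        # Q3: any blacklisted entity within a few hops of the device used?
        "red_flags": red_flags_within(graph, device, blacklist, max_hops),
    }

history = [
    ("customer:bob", "merchant:electro-1"),
    ("customer:bob", "merchant:electro-2"),
    ("customer:carol", "merchant:electro-1"),
    ("device:unverified-7", "merchant:shady-shop"),
]
result = assess(build_graph(history), {"merchant:shady-shop"},
                "customer:bob", "merchant:electro-1", "device:unverified-7")
print(result["red_flags"])  # {'merchant:shady-shop'}
```

In this made-up case the first two answers are reassuring, but the device's link to a blacklisted merchant raises the risk, mirroring the guest checkout example.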

So I would like to summarize from a graph perspective this particular case study from PayPal. So PayPal has been using Aerospike for fraud detection for a number of years as a database. And over the years what they built was that initially the journey was more limited to just using it as a feature store, kind of developing more and more features to power their models. But then over time, they also looked at graph. And one of the things that they did was that they looked at the data from two perspectives, an online mode and a post online mode. So I'll explain a bit more about that.

So imagine when a person makes an actual financial transaction, makes a purchase on an eCommerce website and stuff like that. That particular financial transaction, if you see the top arrow, it's being decisioned by a real-time fraud decision service. These are, again, machine learning models that they have embedded into the ecosystem using popular languages like Python, Scala, and Java. And these fraud decision services in turn invoke a graph query service which is built on Gremlin and that again dips into Aerospike, where currently they have about roughly nine petabytes of data within various Aerospike clusters. That's a huge amount of data altogether, one trillion vertices and edges.

And that data is looked up to find any kind of graph queries. So in terms of any linkages with known fraudsters, et cetera, which we talked about some examples in the previous slide. So using that, all these graph queries then feed graph features or graph machine learning, which is then embedded inside their fraud decision service. So this real-time leg allows them to use graph querying to use it in their real-time fraud detection service.

Secondly, often in the area of fraud detection, you'll hear this, that non-financial events or non-monetary events, if you will, are also very important to look at because often they give you certain clues about the risks. So for example, has there been a new registration or a change in the phone number? Any other changes in the profile, changes in password, these are all indicators of specific account takeover activity by fraudsters. So all of that data, the non-financial data, together with the financial data, it also triggers a post-event in parallel flow. If you see using [inaudible 00:30:07], it comes into Aerospike. That again builds all these graph computations and populates the graph database with the vertices and the indexes and the links between the various nodes. So that continuously feeds in building the graph over time. And then there's an offline batch process, as well, which serves as the backup and disaster recovery.
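The post-event flow described above, where financial and non-financial events keep adding vertices and links to the graph, can be pictured with a small Python sketch. Everything here is hypothetical: the event fields, the vertex naming, and the apply_event helper are invented for illustration and say nothing about PayPal's actual pipeline.

```python
def apply_event(graph, event):
    """Add the account, any identifiers on the event, and the links between them."""
    account = "account:" + event["account_id"]
    graph["vertices"].add(account)
    for key in ("device_id", "phone", "email", "ip"):
        value = event.get(key)
        if value is None:
            continue
        vertex = key + ":" + value
        graph["vertices"].add(vertex)
        # Edge labeled with the event type, e.g. purchase or phone_change.
        graph["edges"].add((account, event["type"], vertex))
    return graph

def accounts_sharing(graph, vertex):
    """Accounts that have any edge to the given identifier vertex."""
    return sorted({src for src, _, dst in graph["edges"] if dst == vertex})

graph = {"vertices": set(), "edges": set()}
events = [
    {"type": "purchase", "account_id": "bob", "device_id": "d-1", "ip": "203.0.113.7"},
    {"type": "phone_change", "account_id": "bob", "phone": "555-0100"},
    {"type": "login", "account_id": "eve", "device_id": "d-1"},
]
for event in events:
    apply_event(graph, event)

print(accounts_sharing(graph, "device_id:d-1"))  # ['account:bob', 'account:eve']
```

Because sets ignore duplicates, replaying the same event leaves the graph unchanged, which is a convenient property for a stream that may deliver events more than once.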

On top of all this, there is a graph viewer, which is used, as I mentioned earlier, for community detection. So using this, they can then monitor for fraud rings and then proactively also blacklist certain entities which are investigated to be fraudulent. So with this setup, over time PayPal was able to see a very good return on investment. If you look at their fraud metrics, the fraud detection rate jumped to 98%, post applying the graph capabilities. Their SLAs in terms of false positives was also overachieved in the sense that they were able to reduce their false positives by 30 times, 30X reduction in false positives. So this was a very good success metric for them. Fraud dollars-wise, they were able to save close to a billion dollars a year with this capability.

So I'll end there and I'll hand it back to you, George, and take it from there.

George Demarest:

Thanks, Bose. One of the most obvious things if you do a little bit of digging around graph databases and the graph data model is it just doesn't look like rows and columns. It's a specialized way to view and consider data, and that means the technology that powers it and reads and writes and so forth needs a specific type of technology. And one of those things is the graph database, and that's of course one of our topics today.

So that's why I want to bring in Ishaan, who has more than anyone at Aerospike been involved in bringing our Aerospike graph database to market. And he has been studying this for quite a while. He has a lot of content that I recommend you reading, including just an introduction to Aerospike Graph and the tools we have. So welcome, Ishaan. This is a different animal.

Ishaan Biswas:

Yeah, and thanks for having me here, George. It's a hard act to follow after Stuart and Bose, who are experts in this area, but I'll try my best. So Stuart mentioned a couple of things, specifically around when you're building a real-time intelligent fraud detection system, you apply a large amount of data and use that in real-time. And from what Bose mentioned, if that data, instead of being discrete entities, if they're actually interconnected data sets and it actually forms the graph data model, then if you can apply that in real-time, that unlocks a lot of superpowers for your system.

Over the years, we've seen several customers of ours build systems like this using different technologies. The real potential and what graphs unlock for you is instead of building custom implementations, which are kind of static with respect to your knowledge of fraud at a time, you're limited to the number of hops you can do. So you can maybe go from a user to their household to another user to a device or something like that. With graphs, you're not really limited by that, especially if you are using a very high performance graph database system. You can be limited only by your imagination within a certain SLA with latency, but it unlocks a lot of paths for where you can detect fraud. So graphs basically allow you to investigate data from a network perspective.

And with that, data scientists can find features of frauds or fraud signals in your network and then apply that. You can apply that in your graph data model. And really the holy grail is for this to be a continuous real-time learning system where either it detects these fraud signals on its own or there's manual intervention where you find these fraud signals and feed it back, and that takes effect in a matter of minutes or hours.

So graphs are not new. Graph databases are not new. They've been around for quite a while. In order for you to do all of these things, especially when it comes to fraud where you're using a lot of different signals and identifiers of people, it ends up being a lot of data. Like an effective system, it needs to have terabytes of data usually.

And existing solutions out there in the market today, they're overly reliant on DRAM. So they usually require you to cache most of your data, if not the entire data in the DRAM, which makes it really hard to scale beyond the memory footprint of your machine. And so naturally your performance is also unpredictable as your data volumes grow because the moment you have a cache miss in your DRAM, you're seeing disk speeds which you can't really afford when you're trying to save money and detect fraud.

Of course you can build, and we have several existing customers who build custom implementations of some sort of a graph solution using Aerospike, but could be any other technology. There are challenges with that, too. You might be able to get around the limitations of most commercial graph databases today, but it's very costly and complex to develop and maintain those systems. You typically, there's a lot of hidden human capital costs there because you need a team of three to 10 really skilled developers to be able to build and maintain these systems. And naturally, it creates technical debt as any software does, but more importantly, it limits your agility and adaptability to dynamic market conditions.

Like Stuart and Bose both mentioned, the frauds, the way people do fraud and the kinds of fraud that people do rapidly changes as the macroeconomic environment changes. So you need to be able to absorb those changes and learn from that and make your system adaptable to use those new pieces of information that you get. So these were mainly the reasons that we heard from a lot of our customers, which led us on this path of Aerospike Graph, building this product. If we can move to the next slide.

And we talked to a lot of customers and tried to figure out what these customers really need. At the core of it, you need infrastructure and a database really that provides scale, predictability, and affordability. Graph databases tend to run up your bill pretty quickly because, like I mentioned, of the reliance on DRAM. So your license costs and your infrastructure cost just keeps going up and up as your data volumes increase and you don't necessarily get the performance that you require, especially P99s.

So really the requirements boil down to you should have a system that's highly scalable. As your data volumes grow, your system should continue to be as responsive as it is today, with maybe a smaller set of data. It should provide you predictable low latency at any scale. So it could be single digit millisecond latency or tens of milliseconds or hundreds, depending on how complex your queries are, but it should be predictable. It shouldn't be that at a hundred gig, you get a certain P99 and when you're at 500 gig, your P99 just goes through the roof.
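To make the P99 point concrete, here is a minimal Python sketch of how one might check latency predictability across data sizes. The sample latencies, the data-size labels, and the `max_ratio` threshold are all made-up assumptions for illustration, not measurements from any product.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct (1-100).

    Returns the smallest sample such that at least pct% of all
    samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 gives the 1-based rank.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def p99_is_predictable(p99_by_size, max_ratio=2.0):
    """True if the worst P99 stays within max_ratio of the best P99.

    max_ratio is a hypothetical tolerance chosen for this sketch.
    """
    values = list(p99_by_size.values())
    return max(values) <= max_ratio * min(values)


# Hypothetical query latencies in milliseconds at two data sizes.
latencies_100gb = [2.0] * 97 + [8.0] * 3
latencies_500gb = [2.0] * 95 + [250.0] * 5  # cache misses hitting disk

p99_by_size = {
    "100GB": percentile(latencies_100gb, 99),
    "500GB": percentile(latencies_500gb, 99),
}
print(p99_by_size, p99_is_predictable(p99_by_size))
```

In this made-up data the median is identical at both sizes, which is why the tail percentile, not the average, is the number to watch as data grows.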

And the other thing is, especially if you're a successful business and there are a lot of fraudsters in your system, you need to be able to do this at high throughput, which means that there might be a lot of fraud signals that you're getting. So you need to be able to detect fraud at a high concurrency and throughput level.

Those are just performance metrics. Other than that, you need a system that can handle heavy read and write queries because you need to be able to capture new information and serve your fraud service APIs. So that means that it should be able to handle both OLTP and OLAP queries, so both transactional queries that are capturing event streams and a lot of high throughput data, but also be able to do analytics on that same data set.

Needless to say, the system also needs to have really strong reliability. This system that you're going to build, the fraud detection system, it's going to definitely improve your bottom line. So you don't want there to be any downtime in your systems. You need like [inaudible 00:39:56] uptime. And of course you need all of this with an affordable TCO model. You can't build a solution that you take to your CFO and then the project gets kicked down because it's just not affordable and it's not scalable or sustainable to run. So what do you do? If you can go to the next slide.

These are all the high level requirements that we had in mind when we built Aerospike Graph. So if you've heard of Aerospike, or are existing users, you already know this, that the Aerospike database is massively scalable, and with Aerospike Graph, those same benefits of being able to get to a data item really, really fast, in a matter of milliseconds, apply here as well. So you can scale to billions and trillions of vertices and edges.

And the interesting thing that we've done as well, as an aside, is we have an independently scalable compute and storage model. So as your workload throughput requirements change or your application throughput requirements change, you can just add more nodes to capture a spike in your throughput just for the compute portion while the database cluster remains the same. And conversely, as your data volumes grow, you can slowly increase your database footprint as well without affecting your throughput.

The other thing that Aerospike is known for is real-time latencies. So real-time OLTP performance is another thing that we put a lot of time and energy in building with Aerospike Graph. So you can build and query a graph in real time, which unlocks rapid insights and decisioning. And you get all of this with a native graph query language. So you use a model native query language. The language we support today is Gremlin.

So you write your business logic directly in Gremlin. What this allows you to do is as a developer, you can have a very fluent conversation with other stakeholders in your organization. So if there's a business stakeholder who maybe understands, or a fraud expert in your organization who understands the fraud parts better but is not conversant in database terminology, you can have a very fluent conversation with them because the business process looks very close or your eventual Gremlin queries will translate really well to your business logic.
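As a rough illustration of how a fraud question maps onto a traversal, here is a small self-contained Python sketch that answers "which other accounts have used the same device as this account?" on a tiny in-memory graph. The `TinyGraph` class, the `uses` edge label, and the account and device IDs are all hypothetical and exist only for this example; the Gremlin in the comment is one plausible way to phrase the same question, assuming a schema with `account` vertices and `uses` edges.

```python
from collections import defaultdict


class TinyGraph:
    """Toy in-memory graph, only meant to mimic out()/in() style steps."""

    def __init__(self):
        self.out_edges = defaultdict(set)  # (source, label) -> targets
        self.in_edges = defaultdict(set)   # (target, label) -> sources

    def add_edge(self, src, label, dst):
        self.out_edges[(src, label)].add(dst)
        self.in_edges[(dst, label)].add(src)

    def out(self, vertices, label):
        """Follow outgoing edges with the given label."""
        return {d for v in vertices for d in self.out_edges.get((v, label), ())}

    def in_(self, vertices, label):
        """Follow incoming edges with the given label."""
        return {s for v in vertices for s in self.in_edges.get((v, label), ())}


def accounts_sharing_device(graph, account):
    # Roughly the same question in Gremlin, under the assumed schema:
    #   g.V().has('account', 'accountId', 'acct-1').as('a')
    #     .out('uses').in('uses').where(neq('a')).dedup()
    devices = graph.out({account}, "uses")
    return graph.in_(devices, "uses") - {account}


graph = TinyGraph()
graph.add_edge("acct-1", "uses", "dev-A")
graph.add_edge("acct-2", "uses", "dev-A")
graph.add_edge("acct-3", "uses", "dev-B")

print(accounts_sharing_device(graph, "acct-1"))
```

The steps read almost like the sentence a fraud analyst would say out loud: start at the account, go out to its devices, come back in to every account on those devices.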

So with your whiteboard work, not only does the data model lend itself really well to whiteboarding, because what you discuss and decide on at the whiteboard is how the data is loaded and stored in the database, but it's also queried in a very intuitive way. And that brings me to bulk loading. So we also have tools where you can bulk load a lot of data. So to really enable the OLTP and OLAP cycle or the online and offline cycle, where you can have a different system perhaps, or even Aerospike where you're really building the graph and finding features in your graph that you want to add as properties or edges and whatnot to find fraud, and bulk load that data into the database and then do that as a regular occurrence, like whether it's a daily job, or weekly, monthly, whatever your cycle is. But we allow all of that.
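The offline half of that cycle can be sketched in a few lines of Python: a batch job derives a feature from the graph's edges and writes it out as a file for a later bulk load. The feature (number of distinct accounts per device), the column names, and the CSV layout are assumptions made up for this sketch and are not the input format of any particular bulk loader.

```python
import csv
import io
from collections import defaultdict


def device_share_counts(uses_edges):
    """Count the distinct accounts seen on each device.

    uses_edges is an iterable of (account_id, device_id) pairs.
    """
    accounts_by_device = defaultdict(set)
    for account, device in uses_edges:
        accounts_by_device[device].add(account)
    return {device: len(accounts) for device, accounts in accounts_by_device.items()}


def to_csv(counts):
    """Serialize the feature as CSV text (hypothetical column layout)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["device_id", "shared_account_count"])
    for device in sorted(counts):
        writer.writerow([device, counts[device]])
    return buffer.getvalue()


# Hypothetical edge export from the daily, weekly, or monthly job.
edges = [
    ("acct-1", "dev-A"),
    ("acct-2", "dev-A"),
    ("acct-3", "dev-B"),
    ("acct-1", "dev-A"),  # a repeated event does not inflate the count
]

counts = device_share_counts(edges)
print(to_csv(counts))
```

The resulting file would then be loaded back as vertex properties on whatever schedule the job runs, so that online queries can read the precomputed feature instead of recomputing it per request.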

And no one really wants another database that is special purpose built for just one application. So having a multi-model database that also unlocks graph along with key value and documents and vector, all of these things in the same database and you're talking to the same vendor, it's very appealing to a lot of our customers because just onboarding a new vendor, knowing the caveats and tips and tricks for every vendor and managing that, it's another pain. So another thing we've focused on really hard is how do we make it a really nice experience for users, irrespective of the data model, to have a really consistent experience across the board. So along with that, you also get the really good TCO advantage that the Aerospike platform provides.

So that's a lot, but all in all that covers really the key requirements that are [inaudible 00:44:23] to really good infrastructure systems or databases that power a lot of the fraud detection systems that Bose and Stuart articulated so well. And we're looking forward to having more people try it out and see how this unlocks more opportunities for your business.

George Demarest:

Wonderful. And I would say this has been really quite a successful launch of a new product from Aerospike. We've seen a lot of interest in it, so kudos to the team. We didn't want to over pile on all the product stuff, so I just wanted to give people a little bit of an idea of where they can go from here if they want to get more information about the product or the challenge.

So the first thing you could do, you can learn more. There's a great product page on Aerospike Graph. There is a demo that you can see. You can read the documentation, if you are in learn mode. You can also try it yourself with a 60 day free trial license, so you can download it and get started right now. Or if you are more advanced in your pursuit of Aerospike Graph or graph databases for fraud, we recommend you get in touch with us and you can test it at scale. The ability of the Aerospike database to scale is a bit legendary, so give it a try with your data, explore proof of concept with the team, and contact us.

So with that, I'm going to just go back to Bose and to Stuart one more time with a question for each before we end. And first for Bose, I'm going to go, and just from what you have learned about Aerospike Graph and talking to customers, what can it mean to have a product like Aerospike Graph on the market to people that are day-to-day dealing with fraud in their business?

Subhashish Bose:

Sure, George. So what I would say is that a lot of these conversations, the way we are having it with existing fraud strategy and fraud technologists within the organizations, is that they've all been looking into this. So graph is something, which is a very fascinating idea, but then they've been struggling. They've been struggling because they can't really have a real-time graph database actually power their real-time fraud detection.

Historically, the competitors and the vendors out there who have these graph products, they're more akin to a near real-time or a batch mode offline investigative use case. So having this capability come in your real-time authorization leg, that is something which is very exciting. And with that, we are having a number of conversations, not just in financial services, but in other industries like telecom and eCommerce as well.

George Demarest:

Wonderful. Well, I want to leave the last word to Stuart. Stuart, can you talk about what this means industry-wide? What does the future hold for those who are dealing with online fraud?

Stuart Tarmy:

Sure. Well, what's happening, and we talked before, I talked about some and Bose did also, about how fraud, and Ishaan also, about how fraud is growing. The fraudsters are getting more and more sophisticated and getting more bold. And I think what you see in the fraud world is that the firms that are using the best technology, the best algorithms, are winning for two reasons. One reason is because their fraud defense is better. And so it is blocking people better. The other reason is that if you were a fraudster, you're going to go to the most vulnerable place to get into, the easiest places. And so those firms that are well-protected, the fraudsters learn that and they go elsewhere.

And so what I would say, what are the long-term trends? I think fraud will increase. I think the firms that do it better will end up winning, will do much, much better in the marketplace competitively because they block it more and the fraudsters get frustrated and go elsewhere. And so if you are not adopting the best [inaudible 00:49:28] technologies for fraud, you're going to quickly fall behind those who do. And we're seeing that in the market, that the firms that employ technology better are disproportionately gaining market share and profitability. So I would encourage those who are lagging to think carefully about what the implications are if they continue to lag versus the more progressive firms.

George Demarest:

Wonderful. Well, that does bring us to the end of our webinar. I want to thank everyone for joining us. This has been Leveraging Graph Databases for Real-Time Fraud Detection, and I want to thank our speakers, Stuart, Ishaan, and Bose for their participation, and we're going to end there. Thanks very much.

About this webinar

Any financial institution, e-commerce company, or service provider with a serious risk of fraud has invested in a portfolio of countermeasures, including transaction monitoring, risk profiles, rules engines, and machine learning. But organizations with the most to lose must counter increasingly sophisticated fraud attempts with increasingly sophisticated technology, most recently in the form of graph databases.

Join this interactive panel discussion on the relevance and practical application of graph data, graph databases, and associated techniques for battling advanced fraud.

Speakers

Stuart Tarmy
Global Director, Financial Services Industry Solutions
Ishaan Biswas
Director of Product Management
Subhashish Bose
Director, Financial Services Industry, Asia Pacific