How real-time data and vector can unlock AI/ML apps
Steve Tuohy:
Welcome to today's webinar. My name is Steve Tuohy, I'm the Director of Product Marketing at Aerospike. I'm excited to serve as moderator for today's discussion: How real-time data and vector can unlock AI and machine learning apps.
Before we get started, the usual housekeeping items. Today is going to be mostly discussion, so this PowerPoint slide, maybe one other, but mostly it's going to be a conversation, not a pushed presentation. We will incorporate your questions as we go, or at least we'll attempt to, so use the question tool in Zoom, which should hopefully be self-explanatory; I'm sure most of you are on Zoom a lot of the day. Go ahead and submit your questions, and anything we don't get to, we'll push to the end or try to follow up on afterwards if possible. You are muted and will remain muted, so the question tool is the main way to communicate with our speakers.
And so onto those speakers. We're really thrilled and honored to have Forrester's Mike Gualtieri here as our guest speaker today. I've tracked Mike's career, and it's really impressive. Mike's research these days focuses on artificial intelligence technologies, platforms, and practices that make software faster, smarter, and transformative for global organizations. He advises leaders around the world on the intersection of business strategy, AI, and digital transformation. And his background is as a practitioner, from writing code to managing development teams to architecting complex systems. One item I read was developing an AI-based autonomous robot arm simulator for NASA's Jet Propulsion Laboratory, so we'll see if that comes up in today's discussion. So thanks, Mike.
And then on the Aerospike side, Lenley Hensarling is our chief product officer. Lenley also brings great experience to this conversation: over 30 years across engineering management, product management, and operational management, at startups and at large, successful software companies, in both enterprise applications and infrastructure software. These days, he focuses on real-time data and the applications it enables.
So we are going to dive in and tackle the latest and greatest in artificial intelligence and machine learning. Mike, I'm going to kick it off with you. Thanks for joining us. No one has avoided the news on ChatGPT and generative AI; it's an amazing amount of innovation, and we're a little over a year into the gen AI revolution, if you will. The focus today is on enterprise adoption of AI, and I know in your role, and at Forrester in general, you have a front-row seat to this. So I'll just be open-ended and let you give a brief intro of your recent research and the background you bring to this discussion.
Mike Gualtieri:
Well, I've been covering AI for well over 10 years at Forrester: writing research on it, researching best practices, researching the technologies being used. And if you had asked me this before ChatGPT, I would've said, "Oh, AI is hot. Enterprises are adopting it, they have it in their strategy." And then boom, gen AI comes, and now it's like, "I don't know, is it super hot? It's very hot. Now it's hotter than it was hot." So the concept of AI wasn't new to a lot of enterprises; I think they had seen enough successes and enough use cases. We call that type of AI predictive AI. And generative AI, technically it's predicting a sequence of tokens, words, sentences, but the generative AI era really ignited imaginations, especially in business and in the world in general, as you mentioned. So a lot of conversations about gen AI. In my research, I get asked about use cases, platforms, technologies, complementary technologies, and architectures to build AI solutions.
Steve Tuohy:
Fantastic. You caught me lowering my seat, so apologies for that. Awesome. And Lenley, you are talking to customers all the time as well, and fortunately Aerospike has many of these enterprises that are tackling this evolution as well. So how are you seeing that from a high level in terms of recent AI adoption and interest?
Lenley Hensarling:
Yeah, I'll say much the same thing Mike has said. What we call classic AI is essentially predictive AI; we've coined the term classic AI to separate it from the transformer- and attention-based neural technologies behind ChatGPT, right? A lot of our customers have been using it for a long time: in data pipelines, moving the data around, getting it in as near real-time as possible, and really focusing on applying the features that are generated out of ML at the edge for real-time inferencing and real-time decisioning, both in terms of managing customers, fraud detection, recommendation engines, but also in decisioning systems, which are even machine-to-machine. So it's not that it's new, but-
Steve Tuohy:
No, I think there's-
Lenley Hensarling:
Like you said, it's molten. The whole gen AI thing has captured the imagination of the world right now.
Steve Tuohy:
Yeah, yeah. I think there's a sentiment with both of you that, "Hey, AI was hot. We've been talking about this. So yeah, it's super hot now." I've been telling my parents what I do for years, and six months ago my mom asked, "What's going on with AI? What is that?" I'm like, "Haven't you been listening to what we've been working on?" But yeah, now it's front and center. So we want to do a little 101 on some of the new concepts, like large language models, and get some definitions out there. A lot of our audience will have familiarity with this, but to level set: vectors in particular, and embeddings, are something we're going to delve into in this conversation. So let me throw this over to you, Mike, to give your accessible definition of a vector and what it's going to mean here.
Mike Gualtieri:
Well, when you think about generative AI, I think most people understand, "Okay, these models have to be trained on some data set." Normally it's an enormous data set. And that data set is usually written language. It doesn't have to be, it can be code as well, but let's just go with written language. So let's put Gulliver's Travels in there, let's put AC/DC lyrics in there and everything else that we can, everything on the web, and that's all in the form of words and sentences. But these models are, fundamentally, math. So everyone's throwing around the term: "Well, it's vectors, vectors, vectors."
A vector is just a prerequisite to training these models: you take that text and essentially convert it to numbers, and you organize those vectors in such a way that similar things are near each other. So I mentioned AC/DC lyrics; those would be closer to the lyrics of Def Leppard than AC/DC lyrics would be to, say, Charles Dickens. So vectors are a mathematical way of storing that information in a computer.
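To make Mike's point concrete, here is a minimal sketch, assuming only NumPy. The three-dimensional vectors are invented for illustration; real embedding models produce hundreds or thousands of dimensions.

```python
# Texts become vectors, and "nearness" is measured mathematically.
# These tiny 3-D vectors are made up for illustration; real embedding
# models produce hundreds or thousands of dimensions.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity of two vectors: near 1.0 = very similar, near 0.0 = unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

acdc        = np.array([0.9, 0.8, 0.1])  # hypothetical embedding of AC/DC lyrics
def_leppard = np.array([0.8, 0.9, 0.2])  # hypothetical embedding of Def Leppard lyrics
dickens     = np.array([0.1, 0.2, 0.9])  # hypothetical embedding of a Dickens passage

print(cosine_similarity(acdc, def_leppard))  # high: the lyrics sit close together
print(cosine_similarity(acdc, dickens))      # low: dissimilar texts sit far apart
```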
Steve Tuohy:
Well, we all need to search on Dickens.
Mike Gualtieri:
It's a unique data structure; you need a particular data structure to do that, and it's very different from predictive AI, where that's not really the data structure you use.
Steve Tuohy:
Thank you. So when we think about the database then, and we'll be hitting the notion of a vector database, how does that fit in the old world of databases, the RDBMS, the NoSQL database?
Mike Gualtieri:
Well, now with generative AI, everyone's throwing around, "Oh, we need a vector database," yet another specialized database. So yes, you do need a vector database, or a vector database capability. But one of the trends that we've been following at Forrester for many years, maybe as long as I've been covering AI, is the whole notion of a multi-model database. It's like, "Oh, yes, I've got tables, I've got documents, I've got XML, I've got blobs, I've got all these different stores." And so what happens? "Oh, now I have 17 different databases that introduce latency," et cetera, et cetera, right? So in the context of vectors, the multi-model notion means you would look for a database solution that can store vectors. Some vendors have that, a lot of vendors are looking at it, and some strategically don't plan to do it. There are a lot of specialists in this space, and some open source as well.
Steve Tuohy:
Great, thank you. A quick question here I think we can insert. You talked about multi-model; there's a related notion of multi-modal, and sometimes they get used interchangeably. So the question is: you're talking LLMs, large language models, about gen AI from a language context. What about images? Do vectors help with images as well?
Mike Gualtieri:
Yes. I mean, it's all data that needs to be processed and where you need to find similarity, so yep.
Steve Tuohy:
Yeah, same notion, great. And Lenley, I'm going to push it over to you. We're going to talk a little bit about RAGs in a second, but before we go there, did you want to add anything onto Mike's comment about multi-model databases?
Lenley Hensarling:
Yeah, I think it's important to point out that it's not just storing vectors, which are essentially multi-dimensional arrays of data points that represent language or images or whatever. It's the similarity search capability and the indexing for it where the magic comes in, if you will, and where the differentiation between vendors is going to show. Because what we are looking at is really the challenge of doing this at high throughput, so that you could have millions of searches a second coming in against the repository you might use for retrieval augmented generation in a session. We'll get into that a little later, but it's the high-throughput nature and the ability to handle scale-out, if you will, as everybody on the internet comes to your door. I love what somebody once said, one of our board members who has been a lead architect at a number of very large companies, about the non-linearity of the internet, meaning you don't know how many people are going to come to your door at any given minute, right? And so you have to be able to cope with that. That's one of the things we really tend to focus on: the ability to maintain performance at scale.
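As a baseline for what Lenley describes, here is a brute-force similarity search sketch, assuming only NumPy and a randomly generated corpus. It is not any vendor's implementation; production vector databases replace this linear scan with approximate-nearest-neighbor indexes such as HNSW to sustain high query rates.

```python
# Brute-force nearest-neighbor search: fine for a toy corpus, but it scans
# every stored vector on every query. The index structures Lenley alludes to
# (e.g., HNSW graphs) exist precisely to avoid this linear scan at scale.
import numpy as np

rng = np.random.default_rng(42)
corpus = rng.normal(size=(10_000, 128))                  # 10k stored embeddings
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)  # normalize once up front

def top_k(query: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k stored vectors most similar to the query."""
    q = query / np.linalg.norm(query)
    scores = corpus @ q  # cosine similarity via dot product on unit vectors
    return np.argsort(scores)[-k:][::-1]

print(top_k(rng.normal(size=128)))
```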
Steve Tuohy:
Great, thank you. There's a couple more questions that have come in. I think we're going to hit them, so keep the questions coming, thank you everyone. But I'm going to push onto that notion of the RAG, Lenley. We've talked about it, there's lots of communication about this retrieval augmented generation. So give us the overview on that and how it intersects these topics.
Lenley Hensarling:
Yeah, I like to think of this in terms of contextualization. When you talk about retrieval augmented generation, it means you have data specific to your company, to your enterprise, or to the business problem being addressed, and you have to contextualize things, because the LLMs, the foundational models if you will, wind up being trained on a vast amount of general data. The analogy I use is calling in an expert consultant to your company from McKinsey or Deloitte or Booz Allen. They know a ton. They've seen many, many different companies, but they may not have been into the details of yours, right? So they can tell you in general what might be a good path, but they don't have the specific information.
And when you apply these foundational models, it's like that. The model knows a vast amount and can respond to whatever you ask it, but to contextualize it, you have to supply some of your own data. That means taking embeddings that you've generated, which, as we said, map visual or textual information, the pages of a document, into a mathematical representation that can be stored, and feeding that back into the model you're using to hone it to the context of your business or the question at hand. The application of that is what they call retrieval augmented generation: you have a vector database, and you have the ability to create the embeddings. So that contextualization is really what we're talking about with RAG. I don't know, Mike, do you want to add anything to that?
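Sketched in code, the flow Lenley describes looks roughly like this. The embed, vector_db, and llm objects are hypothetical stand-ins for whatever embedding model, vector database, and LLM you actually deploy, not any specific product's API.

```python
# A schematic RAG flow. `embed`, `vector_db`, and `llm` are hypothetical
# placeholders, not a particular vendor's API.

def answer_with_rag(question: str, embed, vector_db, llm, k: int = 3) -> str:
    query_vector = embed(question)                  # 1. embed the user's question
    hits = vector_db.search(query_vector, top_k=k)  # 2. similarity-search your own documents
    context = "\n\n".join(hit.text for hit in hits)
    prompt = (                                      # 3. put the retrieved context in the prompt
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm.generate(prompt)                     # 4. the foundation model answers, grounded
```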
Mike Gualtieri:
Well, yeah.
Lenley Hensarling:
You've had a lot of inquiries on this, I'm sure.
Mike Gualtieri:
Yeah, because I think there's the real-time nature of that too. Take an e-commerce example, I'm making this up, but say you have someone's shopping cart, right? And they press order. Well, you could put the cart in as context and say, "Okay, now send the order confirmation to the customer and describe the cart." "Oh, it looks like you're working on a plumbing project and some random knitting project," or something like that. So I think that's another consideration too, because in that case, not only do you need fresh, real-time information, you may also need it at scale, because this whole shopping experience was motivated by some promotion, so now all of a sudden you've got hundreds of thousands of concurrent people hitting this and needing to do this.
Lenley Hensarling:
Exactly, which leads to some of the questions about when you're using RAG, you want to have what people are now calling semantic caching. So if you were doing a promotion, you would try to cache a lot of the vectors you might need to describe what's on sale, what the promotion is, and even the context that you posit you might be selling into, as you pointed out. All of those things have to be able to be applied at wide scale, and I think that's what everybody's struggling with now.
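A minimal sketch of semantic caching, under the same hypothetical embed and llm interfaces as the RAG sketch above: before paying for a foundation-model call, check whether a sufficiently similar question has already been answered. The in-memory list and the 0.95 threshold are illustrative choices, not recommendations.

```python
# Semantic caching: reuse an earlier answer when a new question is
# close enough in embedding space. `embed` and `llm` are hypothetical.
import numpy as np

cache: list[tuple[np.ndarray, str]] = []  # (unit query embedding, cached answer)

def cached_answer(question: str, embed, llm, threshold: float = 0.95) -> str:
    q = embed(question)
    q = q / np.linalg.norm(q)
    for vec, answer in cache:
        if float(vec @ q) >= threshold:  # similar enough: skip the model call
            return answer
    answer = llm.generate(question)      # cache miss: pay for one model call
    cache.append((q, answer))
    return answer
```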
Mike Gualtieri:
And I think a lot more people need to think about RAG at scale, because I've talked to a lot of companies over the last six months, insurance companies, financial services, and they're downloading some stuff from Hugging Face and messing around with a little RAG project: "Oh wow, this is cool. I got Llama 2, I'm doing a RAG project, look what I can do." But that's a little experiment, right? If you then start to think, "What if I was going to do this at scale for my entire organization," even with 50,000 employees internally, or tens of thousands or millions of customers, then you have to start thinking like an architect again: "Okay, where are all the bottlenecks? Where is latency introduced? And what happens when I start doing this at very high concurrency and need real-time RAG?" Then you have to start thinking about the components and how to build that architecture.
Lenley Hensarling:
Yeah, exactly. And I'm glad you said that, Mike, because we haven't released our vector product yet, but what we're focused on is exactly that, because that's what we've done for the application of classic features, if you will, at the edge. We've focused on the ability to scale out and be elastic with the search capability, and to meet those demands of hundreds of thousands to millions per second of these vector queries that are going to have to be applied.
Steve Tuohy:
Good stuff. I'm scribbling notes here and trying to incorporate some of the questions. So challenges, right? I haven't heard hallucinations, but I think we know about that. But context for one: adding on context through RAG so these advances can be more usable. Scale, which Mike brought up, and real-time. And then the notion of piecing together different parts: an off-the-shelf LLM, for instance, and your vector database. So let's think about whether there are other challenges, but I want to bring in an audience question here. Okay: there have been a lot of gen AI services, like Aerospike for dev teams, to be innovative with; ChatGPT was an 'aha' moment for the masses. In the IT space, what end-user tools do you see that are gen AI savvy? Take a random tool, a Tableau: do you see that being vector-query or RAG savvy? I'll let you interpret that. I'm hearing this a little as being about the stack and putting things together. What are your thoughts?
Mike Gualtieri:
Well, to me it sounds like the question is about where we're going to find gen AI, because you mentioned Tableau. Every software vendor is trying to figure out where gen AI works and helps within their product. A lot of companies rushed to figure out the technical details of how to do this and what sort of talent they need, but then they're quickly getting messages from their business software vendors saying, "Hey, we have something coming, it's going to make this easier." So it's not going to make sense to customize gen AI for every application. It's really wise to scan your business software vendor landscape to see, "Do I need to build some contraption here? Or is salesforce.com going to have a gen AI feature there?" So I think the way tech execs are thinking about this now is more the way they think about software generally: "I need to figure out what's in the market, what I have to build myself, what's differentiated, or what's so gnarly to integrate that I have to do it myself too."
Steve Tuohy:
Lenley, anything you want to add?
Lenley Hensarling:
Yeah, I was going to say that I think this is a pattern we've seen before, where at first there's just technology and some of the leading companies do things themselves and embrace it, but soon there will be packaged gen AI solutions from most of the application vendors. I spent years in the ERP business, and I can see where financials, and being able to ask questions about them, is going to be added, things like being able to take questions about the 10-K that you post for investors, right? Those will be incorporated into the financial systems, along with a way to handle RAG, or retrieval augmented generation, to make sure you're not veering off into hallucination space when you answer those questions, because there are going to be liabilities around those kinds of things. I think those will be handled in the applications to a great extent.
Steve Tuohy:
Awesome, thank you. Okay, so you get vendors like us and analysts out there talking about the promise of all this, and we assume all these customers are up in production with these different use cases and changing everything. So let's transition to some of those use cases, and in particular, since we've talked about challenges: where are customers hitting walls, and where are they reaching successes? What are the use cases that are ready for success? I'll let either of you take a first stab at that.
Mike Gualtieri:
Well, maybe we can go back and forth. The one that gets the most buzz is personal productivity. And there's coding, right? Because there's a lot of code assistance. It's not just human written or spoken language, it's computer language too. I've talked to a remarkable number of companies that are at least messing around with it, and there was a lot of controversy: "Oh, should we let our developers use this?" That was kind of like asking, "Should we let people use the internet or not?" Except this cut loose a lot faster. So that's productivity.
But then you have Microsoft's Copilot landing inside the productivity apps, and a lot of impressive stuff out of Adobe for their productivity tools, Adobe Firefly and everything they're building in. So I think that is the biggest use case. At Forrester, I cover more of the technical architecture side and the build-it-yourself side, but we have a number of analysts who cover workforce productivity who aren't AI analysts, and most of the questions they're fielding now are about how companies are using this for productivity. So that's one. I have more, but Lenley, do you want to add one?
Lenley Hensarling:
Yeah, one thing I'd say is that, in part through the encoding mechanisms behind vectors that came out of gen AI, people are starting to apply this in classic AI as well, as a sort of richer feature set. We have a lot of AdTech customers, and they're seeing ways they can have a richer feature that's generated by their ML, which they then need to apply very fast with a similarity search that can execute really quickly. So we're seeing things like that as well.
So I think there's going to be a lot of creativity in applying some of the components coming out of gen AI, a recombination almost, if you will. Because I think that generating patterns, or recognizing patterns, in more IoT-based data is going to happen as well. There's work going on to generate embeddings, the encodings as vectors, of more time-series kinds of data, and then to look at those patterns in a much richer way and do it a lot faster than they might have in the past.
Mike Gualtieri:
There's digital assistants for customer self-service. That's an example where chatbots for self-service were already a thing, but now companies say, "Whoa, this can be a lot more contextual, it's a lot more articulate." So they want to infuse that. Next-generation search, right? This whole thing we've been talking about with RAG. I think the remarkable thing about these models is how articulate they are. And that's part of the danger too, because many people confuse a well-spoken person, or a well-spoken model, with accuracy and truth. That's one of the challenges with this.
But for other use cases, I call it generative engineering. Many people have seen it used for chemical discovery, because there's a language to every engineering discipline: there's a language to chemistry, molecular structures, life, DNA, and even auto parts, artifacts from AutoCAD, parts lists and so forth. So a lot of people are thinking about this not just from a human language standpoint but from an engineering language standpoint, and how it can add to the creative and design process.
Lenley Hensarling:
And Mike, I think what's going to happen there is that the low-hanging fruit is personal assistants: being able to answer questions and scale that out in a way where you can't just line up people to take phone calls when the airline gets hammered, but you can spin up new digital assistants, if you will, to take people's calls, deal with their questions, and even reach back into the system and take actions, right? So I think that's the low-hanging fruit. But the real value is going to come when we start seeing this applied to supply chains, to product development, and things like that. And I think that means more specific models, so it's worth touching on here, and a question to you: are you seeing this movement toward what people are calling SLMs, or small language models? Though I like to say specific language models, if you will, right?
Mike Gualtieri:
Yeah, that's better. Actually, I hadn't heard "specific" before.
Lenley Hensarling:
I just made that up right here.
Mike Gualtieri:
Okay. But no, I think this was a myth from the start, largely pushed, or helped, by OpenAI and even Microsoft too. It's like, "Oh, there's no way you could ever build a model this big, because it costs us zillions of dollars to build this." And so everyone thought, "Oh, okay, well, I guess we'll just have to use that model." But you're right, there are going to be tens of thousands if not millions of models, some of them fine-tuned. Just look at the activity on Hugging Face, huggingface.co, which, if anyone's not familiar, is kind of an open-source repository of AI models like Llama 2 and many others. So yeah, there are going to be a lot of these models.
And I'm already hearing companies go from saying, "Well, I am never going to build my own model," to saying, "Well, we're looking at it, or we're going to fine-tune." RAG is a wonderful technique for many use cases, but like anything, there's a spectrum, a range of use cases. And if you look at Mosaic ML, which tries to optimize the training of models, they had a huge model, I forget which one, that used to cost $450,000 to train, and they have it down to, I don't know if it's $50,000 or $60,000, but orders of magnitude less. So yes, I think we're going to see a lot of these models.
I love the possibilities with AdTech too, because of the way AdTech works: they're constantly testing. So in some ways it's going to be the frontier, because you can start testing some crazy hypotheses about prompting and generative AI, use A/B testing, and see what hits. That could change the entire AdTech industry.
Lenley Hensarling:
Yeah, absolutely.
Steve Tuohy:
Yeah, the language piece is very tangible thanks to OpenAI and other LLMs, and some of the examples you've given build off that, and RAG obviously builds off that. You've touched on this a bit, but I'm combining my own question with one that's come in off the chat. Let's take fraud, for instance, right? And Mike, you talked about the real-time nature and the scale needs. This is not asking ChatGPT, "Is Steve a fraudster at the point of sale?" As fast as it is, we're talking about millions of people transacting, and capturing that information and acting upon it. So Mike, you distinguished language and predictive upfront.
Mike Gualtieri:
Predictive versus gen, yeah.
Steve Tuohy:
Yeah, gen versus predictive, acknowledging there's some intersection. So the question that came in, which you can build off or go beyond: isn't fraud already handled pretty well, they ask, with more traditional models? How would vector help? And my own addition is: do you need an LLM in that, or would an LLM even be additive?
Mike Gualtieri:
Well, people are definitely researching and figuring out how to use LLMs, because there are different types of fraud, right? And sometimes an LLM would be perfect at fishing out a phishing attack, for example. So companies and vendors of fraud detection software are definitely incorporating that, depending on the different attacks. But you know what? You can also use a gen AI model for classification, just as predictive AI does, right? Predictive AI takes a data payload, a transaction and maybe some enhanced data, and it says the likelihood of fraud is low, medium, or high. But some of the gen AI models can actually do that as well, because it may not be just transactional data; there may be some sort of interaction data, like the chat information.
And the other thing to think about is that just as we want to detect fraud, and the world's gotten, I don't know, better at it, let's not say good, because there's still a lot of fraud, but it gets better at it, those same techniques people are looking at for what are called guard models. So for the output that comes out of a gen AI model: is there anything we can do to look at that output to perhaps detect a hallucination? You can think of a hallucination as fraud, the model being fraudulent, and so you can use some of those fraud detection techniques to govern the model.
Steve Tuohy:
Lenley, your thoughts? I know some of our customers, you've talked about the classic models and an enhanced-feature approach, and a lot of our customers are doing fraud detection.
Lenley Hensarling:
Yeah, I would say the discussions we've been having are that, because there are patterns in language that indicate fraud, the way a conversation progresses and such, and companies can save some of those conversations off when they say, "Some of this may be recorded," I think there are definitely going to be new applications for that. They'll be able to match those patterns and do similarity search against what's coming in, and be able to say, "That's one more signal. It's not the only signal, but it's one more signal of fraud." And I think you're going to see that. A lot of this is just refinement and refinement and refinement in that game against fraudsters.
Mike Gualtieri:
Yeah. And we should mention that this is also one of the most wonderful tools ever invented for committing fraud. So I have a prediction: cybersecurity companies will increase their revenue by three times next year as the criminals figure out how to use this. I mean, seriously, it's an incredibly scalable tool for committing fraud as well.
Steve Tuohy:
Very good. Okay, just looking at the clock, we're going to shift forward a bit to the infrastructure side. We've got a question that touches on data models. So: is there an ideal data structure for predictive AI, and any limitations? Anyone want to...?
Mike Gualtieri:
Well, for predictive versus gen AI, you've got to think of three workloads. When anyone says AI to me, I'm like, "Oh, there are three workloads." There's the data prep workload, there's the training workload, and there's the inferencing workload. Inferencing is sometimes known as scoring, but it's using the model, right? So first you need all the data needed to train the model, and then once you've trained the model, you need data to inference against that model.
For most predictive models, the training data is a table. The data prep takes all of this data, in whatever form it is, flattens it out, and makes columns. And I mean lots of columns in some cases, could be 2,000. It could take 30 columns from data structures and blow that out to 2,000. Why? They're doing feature creation: they're taking ratios, they're taking a date and splitting it up into month, day, year, quarter. So you can imagine how quickly you could blow out those columns. And that's the training workload.
Now, once the model is created, the model determines, "Well, I need these six variables on the input to predict the thing you're trying to predict." Those six variables could be in different locations and different formats. So now you've got your second problem, which is, during the inferencing stage, retrieving enough of that data at scale to call the model. People worry about the latency of the model, but a lot of times it's not the model; it's getting the reference data and the other data needed before you can even call the model. That's normally where your performance problems are.
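Mike's date-splitting and ratio point in miniature, as a sketch; the column names and values below are invented for illustration.

```python
# Feature creation during data prep: a few raw columns "blow out" into more.
# The column names and values are invented for illustration.
import pandas as pd

raw = pd.DataFrame({
    "order_date": pd.to_datetime(["2024-01-15", "2024-03-02"]),
    "amount": [120.0, 87.5],
    "prior_30d_spend": [300.0, 50.0],
})

features = raw.assign(
    month=raw["order_date"].dt.month,      # one date column becomes several
    day=raw["order_date"].dt.day,
    quarter=raw["order_date"].dt.quarter,
    spend_ratio=raw["amount"] / raw["prior_30d_spend"],  # ratios as new features
)
print(features)
```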
Lenley Hensarling:
And Mike, this reminds me of one of our star customers, LexisNexis, and their product ThreatMetrix, right? The CTO there, Mattias, one time I asked him, "Why do we matter to you?" And he said, "Because I used to be able to apply a stream of signals that were hours' worth against weeks' worth of data, and now I can take weeks' worth of input that's aggregated and match that against months of data, and I get a higher-fidelity result." And they actually charge people more for that in their fraud detection, because it's a higher-fidelity result, right? So it's just the application of more data all the time. And the same thing is what we're hearing from customers about how they want to use gen AI.
And I think the other thing that's happening is that, while we always say we can do that in a cost-effective way, there's also this notion of semantic caching. You're not going to have to go back to the foundational model for every question; you can check, "Have we already sent that back? Do we already know the answer in terms of vectors returned? And can we then drive things internally and disintermediate some of the cost, if you will?"
Mike Gualtieri:
Yep, that makes sense. Because even though it could be a fraction of a penny or a couple pennies a call, it adds up for some use cases, and so caching makes sense.
Steve Tuohy:
Great. Hey, last category for probably the last three or four minutes: advice. What's the path for getting started for organizations? What do you see people tackling first, and how should they be thinking about costs and so forth?
Mike Gualtieri:
Want me to go first?
Steve Tuohy:
I intentionally left it open, whoever wants to jump on. Sure, but yeah, if you're up for it, Mike.
Mike Gualtieri:
Yeah, so use cases are not too difficult to find, because you can look through any business process and locate the opportunities, anywhere you're generating content or something like that. And it's likely now, whatever your industry, that there are common use cases that have been done by others and vetted.
But once you find that use case, you have to think about how you're going to implement it at scale. I've largely already said this, but there are a lot of developers on this now, because one of the things about gen AI is that it doesn't particularly help to have a lot of statistical knowledge like a data scientist; all that is abstracted, kind of done for you. So it's a lot of developers messing around with this stuff when they see a use case, and it's very accessible to do a simple use case. But gen AI use cases are going to happen at scale, so you have to pause and think about how you're going to scale this, how you're going to architect it. Making an API call to ChatGPT is very simple; making an API call to anything is. But the architecture to get it to perform and to get all the data, that's the hard part. So alongside whoever is experimenting on the use case, just the inputs and outputs, there should be another team working on how it's going to fit into the architecture.
Lenley Hensarling:
Yeah. I'll add one other challenge that we've had with AI and ML all along, one we were making good progress on until the introduction of gen AI, I would say, and we continue to make progress on the predictive side: explainability of results, right? With gen AI it's a particular challenge, because the system doesn't lend itself to that. It's not a step-by-step kind of thing; it's doing this generation, finding the next thing, and the trail can be very long when you try to say, "Well, how did you come up with that?"
And I know there's work being done, but I think that's going to be a particular challenge in deciding how to apply it. There's work going on now to apply gen AI to diagnosis in medicine, and the question becomes: how are you going to answer "How did you come up with that diagnosis?" Because it's going to have to be checked, and there's going to have to be tracking of that kind of thing, just because of the nature of that sphere. There are many other areas that are going to be similar. I think that's going to be something that evolves over time.
Mike Gualtieri:
Yeah, and another issue that Lenley brings to mind is that you never quite solve it, because just when you solve it, you want to retrain with some newer data.
Lenley Hensarling:
That's right.
Mike Gualtieri:
And so you've got to have a testing process, because a lot of this is dealing in probabilities and you can't be 100% sure of some things. Some companies are applying A/B testing strategies to this, or what's called champion/challenger in financial services, right? These are processes that software development teams in areas like advertising are very familiar with, but for a lot of software it's like, "Okay, we have these new bits, we're going to test them completely, and when they work, we're going to push them out," right? But with some of the gen AI models and your prompting strategy, you can't be 100% sure. So some companies are saying, "Well, I can't wait until I'm 100% sure, because I'll be waiting forever, so I'm going to put this out to 10% of the traffic, 10% of the transactions, and I'm going to see." It's a risk mitigation strategy, but it lets you move forward with a degree of uncertainty. And I think it's counterintuitive, but businesses in highly regulated industries are actually pretty good at risk management and assessing it, because they've been dealing with compliance forever.
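The traffic-splitting Mike describes can be sketched in a few lines; the champion and challenger objects and their predict method are hypothetical placeholders, not any specific framework's API.

```python
# Champion/challenger routing: send a small random slice of traffic to the
# new model and compare outcomes before promoting it. The model objects and
# their predict() method are hypothetical placeholders.
import random

def route(request, champion, challenger, challenger_share: float = 0.10):
    """Route roughly 10% of requests to the challenger model."""
    if random.random() < challenger_share:
        return "challenger", challenger.predict(request)
    return "champion", champion.predict(request)
```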
Lenley Hensarling:
Mike, that brings to mind that many people are now talking about augmenting this not just with gen AI RAG, but with other types of search. So once you get a result, you go and test it against facts you simply look up, apply them, and see if they fit. If they do, that acts as risk mitigation. And there's the question of quality of the generated content, meaning accuracy and likelihood of hallucination and other things, which can be dealt with by search capabilities of a different kind.
Mike Gualtieri:
Yep.
Steve Tuohy:
Gentlemen, I regret to say we're about at time, so I'm going to pull up our thank you, goodbye slide. But that was a pleasure for me. I hope our audience gained some new insights, I'm sure they did. I did. Mike, on behalf of Aerospike, thank you for being our guest today.
Mike Gualtieri:
Thank you.
Steve Tuohy:
Really great perspective. So for those of you who want to dig deeper on Aerospike, here are just a few resources. I'll give the guys a second to gather final thoughts, but yeah, dive in, set up your database with Aerospike. Lenley mentioned what he calls classic AI; among that is using Aerospike as a feature store, so we've got some content you might be interested in on that, plus a little discussion on the use of vector databases that our product manager Adam Hevner put together, and a piece on real-time AI from Aerospike's Chief Scientist, Naren Narendran. Any final thoughts, Mike, Lenley?
Mike Gualtieri:
Well, we've had our fun thinking about how AI is going to end the world, but now we've got to get building scalable applications with it. We've got to get beyond the experimentation phase and start building this the way companies already know how to build real-time, scalable applications.
Lenley Hensarling:
And I love that that's the way Mike put it, because that's our focus and what we're building in terms of our vector database capability: the ability to scale out and do that while maintaining performance and low latency in responses, because people won't wait around just because it's gen AI.
Mike Gualtieri:
Right.
Steve Tuohy:
It won't be the hot new thing forever. All right, guys. Well, thanks everyone for joining. Good questions that came in, and we'll have a replay available. Have a great day.
About this webinar
Machine Learning (ML) has long been at the core of real-time decisioning for mission-critical applications like fraud prevention, customer 360, and recommendations. However, with the advent of Large Language Models (LLMs) powering Generative AI (GenAI) applications, the world is left to ponder how best to employ this generational technology.
Real-time, multi-model databases are the key to unlocking both current and pending technologies. To date, real-time databases have been able to stream in, store, and access data. Data modeling techniques allow developers to “bend the data” to their will, be it key-value for fast lookups, document for rich online applications, or graph for discovering connections. Vector is merely the newest entrant.
Learn how cutting-edge organizations are leveraging advanced databases from Mike Gualtieri, VP, Principal Analyst at Forrester, and Lenley Hensarling, Aerospike’s Chief Product Officer. We’ll look at how real-time application designs are helping customers navigate today’s volatile market pressures and more.