
New Frontiers in SQL: Using real-time data and document databases in a Data Mesh

You can view the recording at https://vimeo.com/844328121

George Demarest:

Welcome. Good afternoon, good evening, good morning, wherever you are. This is George Demarest from Aerospike, and thank you for joining us for a webinar entitled New Frontiers in SQL: Using Real-Time Data in a DataMesh. With me today are three esteemed speakers. One is Adrian Estala, VP and Field CDO at Starburst, who also has long experience in the customer base. We have Ishaan Biswas, director of product management at Aerospike; Ishaan is deep into our product, and he's on the product team, as am I. And then finally, we have a special guest, Mike Gualtieri, VP and principal analyst at Forrester Research.

Mike is going to give the perspective of what the market sees, what Forrester Research sees; he talks to a lot of customers and a lot of vendors. So, Mike will provide some important perspective on DataMesh, data products, and other topics today. So, let's get into the presentation. The first thing I want to say: I'm just hosting today, but I'm going to keep the conversation going. DataMesh is a big topic. There's theory, there's practice, there are behaviors, there's tooling. There are a lot of moving parts, and we can't possibly cover them in a single webinar.

So, we're going to focus specifically on the Starburst-Aerospike partnership and on data products in particular. But it is a deep topic, and someone who lives in the DataMesh world day in and day out, and thinks a lot about it, is Adrian Estala. Adrian, welcome, and thank you for joining us. Can you walk us through the DataMesh topic and your perspective on it?

Adrian Estala:

Yeah, absolutely. Thank you, George, for the introduction. So, before I jump into it, let me put the bigger picture into context. As you said, George, there's a lot of buzz, if you will, out there. A year ago, I think it was buzz; there was interest, but there was a lot of concern as to whether this was something real. A year later, we're starting to see a lot of large organizations not just build a DataMesh but mature a DataMesh. And you're starting to see really fantastic use cases that are working for specific functions and specific teams. So, I think we're past the point of wondering whether it works.

Now we're starting to understand: how do I make mine work? What do I need to do to get started? And so, when we think about this idea, I love that theory-to-practice framing. The focus is on practice, and so in what I'm going to present today, I'm going to try first to start from a vendor-neutral position. I think it's important to understand a foundation for DataMesh, the basics so to speak. If you've read the book, fantastic. If you don't have time to read the book, hopefully in the next 10 minutes I'll give you the basics you need to start understanding more of the detail.

But when we think about basics and talk about the concepts, the first thing I want everyone to understand is that a DataMesh solves problems on the left side of this page for the architect, but it is mostly designed to solve problems for the people on the right side of this page, the consumer. We often talk about the difference between a data fabric and a DataMesh, and to be fair, they're very similar. Think about the architectural issues we're trying to solve: the idea of a decentralized architecture connecting to all my data where it sits today, whether it's NoSQL or a lake or a warehouse or something else.

That decentralized approach, a fabric and a mesh are both intended to solve in the same way, for the most part. We can get into the details, but the objectives are the same for fabric or mesh on the left side of the page. Where it starts to differ is on the right side of the page, the issues that we have as consumers. These are data scientists and data analysts, but also think about the consumer from a business perspective: business consumers trying to solve big business problems that depend on data. They're struggling on the right side of this page to get the data they need to make decisions, or drive automation, or even pursue bigger ambitions like ML operations or AI.

And so that's the problem we're trying to solve when we start thinking about a broader DataMesh. You've heard people say it's more of an organization and process approach than a technology approach. That's true, but I think it requires all three. It requires an understanding of how we fix the architecture, an understanding of how we balance process, and then a deep understanding of the consumer, the people, if you will, we're trying to serve. Let me explain how that works. This is the 30,000-foot view, and so this does not replace the book, but again, I always like to give people a foundation, a place to get started. If you understand the foundation, it creates interest in the detail, and then you can go read the book and get everything behind the scenes.

So, I'm going to skip the details; here are the big basic steps. For every DataMesh, phase one, the first step, the big problem you're trying to address, is connectivity. You want the ability to connect to your data sources where they are. We often talk about avoiding migration, or we might even say something like, don't migrate. The idea here, from an architecture perspective, is that you don't have to migrate. If your first step in building a DataMesh is to migrate data, then you might want to rethink whether that's the right approach.

There are good reasons to migrate, don't get me wrong, there are plenty of them. What we're saying is that when you build a DataMesh, you'll never be able to catch up if you're trying to centralize everything. When you're building a DataMesh, principle number one is: let me try to access the data where it is, full stop. If there's another reason to migrate, let's do it, but that's not the rule, that's the exception. And here's what happens, and if you've already built a fabric you've realized this, and if you're starting to build a mesh you will: connecting to those data sources solves an incredible architecture problem.

And to be fair, people get promoted for it. I work with a lot of customers where that's the only step they've taken so far, and they're winning. They've achieved the objectives they set out to achieve just with that connectivity. But here's what'll happen: you are still going to struggle to deliver data to your consumers. Great, I've connected to all my data, but your consumers are still going to create demand. I need data for dashboards, I need data for analytics, I need data for automation, I need data for applications. They're going to keep asking you on the IT side to transform and deliver that data in different formats.

So, what a DataMesh does, and this is the key principle when we start thinking about the differences between a mesh and a fabric, is it says, "Hold on a second, let's create logical domains." I'm not moving data into a domain, I'm creating a logical domain for specific functions. How you define the domain is probably a topic for another conversation. There are a lot of different ways to do that, but in a simple sense, think about a domain as a small team that's trying to solve a business problem. You create a logical domain for that team and then you ask, "What data do you need? How do you use the data? What type of data are we dealing with?"

Confidential data, low-risk data: you understand the data you're working with and you understand what the consumer is trying to do with that data. And then you design that domain for that consumer, and you give them the self-service capabilities and the autonomy to manage their domain. You give them the keys. I always say, if you want the business to drive fast, give them the keys. You build a domain for a team, you let them run it autonomously, and you train them on how to use self-service tools so they can start to get their own data.

The jewel of a DataMesh, then, is the data product, because what will happen in this domain is they'll start to create data products. They will start to aggregate data sets from different sources. They're not going to worry anymore, and they shouldn't care anymore, about where the data comes from. What they know is: when I go to my domain, all the data I need is there. If it's a finance domain, my finance data is there. If it's a fraud risk management domain, my fraud data is there. If it's a customer 360 domain, my customer data is there. I go to my domain, my data is available, and I can start to put pieces together to create data products: a table from there, a table from here, a field from there, and I can aggregate these into data products.
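To make that concrete, here is a minimal sketch, in Trino-style SQL, of what assembling a data product inside a domain can look like. All catalog, schema, table, and column names here are hypothetical, invented for illustration.

```sql
-- A hypothetical customer 360 data product: a table from the warehouse,
-- a table from the lake, and a real-time field served out of Aerospike,
-- stitched together as one reusable view inside the domain.
CREATE VIEW customer360.products.customer_profile AS
SELECT c.customer_id,
       c.segment,              -- from the warehouse
       o.lifetime_orders,      -- from the lake
       r.last_seen_ts          -- real-time field from Aerospike
FROM warehouse.crm.customers    AS c
JOIN lake.sales.order_summary   AS o ON o.customer_id = c.customer_id
JOIN aerospike.activity.recency AS r ON r.customer_id = c.customer_id;
```

The point of the sketch is that the consumer queries one view in their domain; the federation across warehouse, lake, and NoSQL catalogs happens underneath.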

And here's the most important part about a data product. It isn't just giving somebody aggregated data; what makes a data product special is the metadata, the description, because you're going to describe that data product to somebody. You can imagine a consumer going to a catalog, searching through data products, finding something that looks like what they need, taking it off the shelf, if you will, and then reading about it. This data product came from these sources. Here's its purpose: this data product should be used for these types of exercises.

Here are the owners of the data product; here's a list of who else is using it. That data product should give you a description that tells you everything you need to know so that you can use it quickly. And you can make a decision as a consumer: is this what I want, or is it not? You can save consumers an incredible amount of time, not only because they don't have to wait for somebody else to build it, but because they can decide whether it's what they really need. Far too often as consumers, we ask for something, we get it, it's wrong, and then we wait another month, get something else, it's wrong, and wait another month.
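As a sketch of what that description might look like in practice, the view above could carry its provenance and purpose as catalog metadata; Trino, for example, supports comments on views. The wording below is hypothetical:

```sql
-- Attach the product description where consumers will find it.
COMMENT ON VIEW customer360.products.customer_profile IS
  'Customer 360 profile. Sources: CRM warehouse, sales lake, Aerospike recency.
   Use for segmentation and personalization analysis.
   Owners: customer-domain team. Consumers: marketing analytics, CX dashboards.';
```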

And so, without that, the ideation takes too long. When you can go to a catalog, quickly read about data products, and find what you're looking for, that creates incredible speed, not just because of the reuse, but because of the simplicity of finding it. Those are the basic principles when we think about how we build DataMeshes: we build them for the consumer. We try to abstract the complexity on the backend, if you will. And when we think about how that works with Aerospike, let's take a look at what that means. Still thinking about two views here: if you're an architect, you're looking at the left side of this page. This is a DataMesh ecosystem. There shouldn't be anybody out there, and certainly on the Starburst side we don't say this, who says,

"We are the only DataMesh provider." No, DataMesh is an ecosystem. Principle number one, as I said a second ago, for a DataMesh is to leverage the tools you have, where they are. And so, when you think about that ecosystem as an architect, you want to think about how you integrate all the data sources you have across the bottom of the left side of the page, and then about what tools you're going to use to integrate them. You've got lakes, you've got warehouses, and you've possibly got Starburst there helping you cut across all those different data sources.

And those data sources should be available from a consumer perspective. It's not about giving a consumer raw data; that's confusing. It's about organizing that raw data in domains that can be easily accessed via a catalog. So, you can imagine your consumers going to a catalog, as I just described; they don't care where the data products came from. What they care about is being able to easily find what they're looking for in a single catalog. And in a fully mature DataMesh ecosystem, you could have multiple data products coming from different sources, and that's okay. You want the optionality, the ability to optimize your data products in a way that makes the most sense for that domain. We'll come back to that in just a second.

But let me take you to the top, the consumer. Across the top, you've got various types of consumers using the data. You have people building dashboards, fantastic. They're going to be able to take these data products immediately and pull them into Tableau or Power BI or Looker or ThoughtSpot. Whatever they're using, they're able to bring those data products in so they can use them. You also have teams driving more advanced analytics, maybe using some of the same data products, maybe building their own. But again, they have that ability, that simplicity, to very easily find what they're looking for to drive that automation engine, to drive that ML ops, whatever they're pursuing.

And so, you're able to use data products not just for dashboards, but for even the more advanced analytics capabilities. That's the way they should work. I want to highlight, on the bottom right side of this page, Aerospike. Oftentimes when we build a DataMesh, we go after the easy data sets. A lot of our customers will go after some of the lakes and warehouses: let's get that going, let's get that easy win. But oftentimes our struggle is not with the easy data sets; our struggle is with those NoSQL databases that present a different kind of challenge, and those are important.

We want easy wins, but sometimes that first win requires you to connect to something a little more complicated. And that's where a tool like Aerospike comes in, because it makes it easier and faster; from a consumer perspective, you're able to bring that data forward much more easily. And so, when you marry what we can do in Starburst on the front end, in terms of making domains and data products, with a technology like Aerospike on the backend, with its ability to connect these NoSQL data sources, you've got a real winner, so to speak.

You've got the ability then to deliver those complex data sets very quickly to those consumers. On the right side of this page, now I'm talking about the consumer. That's what the consumer should be looking at. Frankly, they may not care whether you've got Aerospike or Starburst on the backend. What they care about is that when they go to their catalog, they're able to find things quickly. It's about simplicity. All your consumer needs to understand is: I go to my domain, I get my data products.

I can search other data products if I need to, and I can request access if I don't have permission. It's as simple as that for the consumer. You let Starburst and Aerospike do all the complex work on the backend so that we can make it easier for that consumer on the front end. Fantastic. George, hopefully I didn't take too much time, and hopefully that was a useful introduction to DataMesh. We can go a lot deeper; maybe we do that in a different session.

George Demarest:

Yeah. No, it was great, Adrian, the perfect level. And what I'd like to do is actually bring Mike into the discussion. Mike, what are you and Forrester hearing? What's the buzz, what's the reality, and how do you respond to what Adrian has just presented?

Mike Gualtieri:

Yeah. So, Adrian, I think you did a great job describing what the DataMesh looks like and the ecosystem, and this is among the most popular topics at Forrester as well, for all the reasons Adrian just described. A couple of thoughts here. One is the why behind this. Adrian stated how architects like this, but it's really for the consumers. Why do architects like it? Well, because if you think of a large global corporation, they have built up dozens, hundreds of applications that all generate potentially valuable data.

And increasingly there's a layer of modern applications that has to use the data generated by one or more of those systems, and increasingly needs it in real time. That creates an enormous amount of complexity. And there are a lot of these applications and systems that either you don't control or you have to live with. So, this is an entirely necessary technology. So why didn't we do it 10 years ago, or five, really?

Adrian Estala:

That's a great question.

Mike Gualtieri:

It is, because we didn't really have the technology. There's been a lot more technology since, and this is infrastructure, it's software. There are a lot of things that have come together, spurred on by the need for it. Adrian, do you have a comment on that?

Adrian Estala:

Oh, yeah, that's a great point, because I think that's something we talk about a lot. I've seen extremely mature DataMeshes in pharmaceuticals and in banking. And when I study, let's say, pharmaceuticals for a second, as to why they were able to move so quickly, I talk to my friends and ask, "What is it you guys did that allowed you to create these domains that are working so efficiently, when everybody else seems to be struggling with trying to figure out how to build domains?" Everybody else says, "Well, I've got to make all these organizational changes. It just seems like so much work." I think the advantage pharma in particular had was that their teams were already organized like domains.

So, they didn't have to make a lot of organizational change. They rolled out a DataMesh process, if you will, a concept that already fit within these domain teams. They were already working like domains; we just gave them the capability. What's interesting about that, the point I'll add, is that a lot of the things we're doing in a DataMesh are not new. For those of us with gray hair, probably all of us on this call, we go back 20 years to how we used to do projects. All the painful things we've done in our careers: you go back to Year 2000, or SOX, or initial cloud migrations, or an SAP integration.

The projects that cause you to wake up at night screaming, we learned a lot of lessons from those. But one thing that is different today than it was maybe 10 or 15 years ago is that everything's agile now. We don't roll out a DataMesh for a company as one big waterfall project. I used to break Microsoft Project because we had too many rows in my project plans. We don't do that anymore; everything is agile today. And so, when you think about implementing DataMesh, you build one domain for a team where it makes sense.

You don't roll out 20 domains at once, you build one. You go create value, you start small, you move fast with an agile approach. That's what I think is different about DataMesh. The reason it works today, versus what we've been trying to build in the past, is that we've taken all these great ideas, agile methodology, domain-centric design, product thinking, things that aren't new, they've been around for a while, and we've pulled them together into one DataMesh concept. We've taken proven concepts and pulled them together; that part isn't new. What is new is trying to change the mindset of that consumer.

Mike Gualtieri:

Yup, great add, great additional thoughts there. The other thought I had was about how data products are great, how you become dependent upon them, and what you then have to think about. Because now, if you're a solution architect or an enterprise architect, you have to think, "Oh wow, this is great. This data product is being used in my e-commerce system. We're totally reliant on it, it can't go down."

Adrian Estala:

Right.

Mike Gualtieri:

Because you can have data products that feed Tableau or something, right?

Adrian Estala:

Right, spot on.

Mike Gualtieri:

If the data feed for a dashboard is interrupted, well, we'll check back in 30 minutes. But if it's feeding some sort of real-time machine learning model, it cannot go down. So, if you're a buyer of the various products you need on this left side, you also have to figure out how they're not going to go down. And the answer that I don't like from vendors, because it doesn't make sense to me, is the uptime percentage. Like, "Oh, it's 99.999." Well, cloud vendors say that, but by and large that's only part of it. There are software bugs that can make things go down, and there are interconnects that can make things go down.

So, the better question, I think, if you're buying this technology, is: what is your fault tolerance strategy? Yup, it's great that you're five nines, but what if something does happen? How do you recover from that? We're talking at the high level of the data product, and there are lots of things necessary to make that happen, and we could drop right down into the infrastructure level and talk about fault tolerance strategies as well. So, it's a big and complex subject.

Adrian Estala:

Absolutely. Absolutely. We could talk more, and I love that, that's really good. You're bringing up these points where I go, ah, man, let's go deeper into that, and maybe we don't have enough time. But real quick on that, because it's a really important piece: how you build for something that requires that fault tolerance versus something that doesn't. We often see organizations build some domains where maybe all they're doing is dashboarding. The value of that is that you can apply a lower level of governance. Maybe they're not doing all the testing, the release management is a lot lighter. All they're doing is certain kinds of dashboards, and they can move really, really fast.

And then you have a different domain where maybe they're accessing data sets on the backend that have some challenges built in, like the NoSQL we're going to talk about in a second. There you have to build a certain level of testing, you may have to use different tools, there's a more formal release management process, and maybe there are SLAs built into it. For that domain, you raise the governance, because you want to make sure you're designing to deliver that reliability or that security. But that's the beauty of a DataMesh and domain design: you don't have to build it one way for everybody. You design the domain in a way that will optimize the results for that specific consumer.

George Demarest:

It also, Mike, reminds me a little of the discussions of digital transformation, which you start and never stop. It doesn't sound like you ever get to the end of the DataMesh either. It's a process that evolves and iterates and gets better and better, hopefully.

Mike Gualtieri:

Absolutely.

George Demarest:

Very good. So, at this point, I want to move to the Aerospike piece. We've been talking about the entire DataMesh subject, but as Mike and Adrian have mentioned, data products are a really interesting dimension of this. So, let's talk about that, and I want to pose a rhetorical question: what is your most valuable data? Is it your ERP data? Is it your customer 360 data? Is it your master data? I think the answer, as Adrian presented, is that it depends on the domain; each domain has important, valuable data.

And for us, where Aerospike lives is in that area of real-time data at extreme scale. We have a very efficient and scalable kernel in our database that enables sub-millisecond response times in the hundred-million-transactions-per-second range at some of our customers. Of course, we can work on gigabytes and terabytes of data, but we deal with a growing amount of real-time data: IDC projects that by 2025, 30% of all data will be real-time data, or data that's used in real-time data products and use cases.

So, just a very quick summary. There are other presentations and other webinars where you can learn about the real-time data platform, but this is where we live. You don't see us very often; we are definitely in the plumbing, delivering very, very fast data. The types of data products that we see our customers inspired to create are things like payment fraud prevention, and bidding and trading systems, especially in ad tech and media companies. Personalized recommendations are becoming more and more of a real-time problem: people want their systems to understand them second by second, minute by minute, and not have to wait for anything, as Adrian suggested.

We have several customers doing digital payments, so super mission-critical stuff. Hyper-personalized customer 360 is another common one. And risk analysis; actually, when I introduce Ishaan in a moment, he'll talk about one of the key products that we've just worked on with Starburst. But I want to talk about a couple of quick customers. One of them is called-

Mike Gualtieri:

George?

George Demarest:

Yes.

Mike Gualtieri:

Can I just comment on these real time products here?

George Demarest:

Yeah, please.

Mike Gualtieri:

So, before you go into that, I think these are perfect examples, because anyone can think about them and imagine how quickly this has to happen: when you swipe your credit card, when you're making a trade, or a personalized recommendation in an e-commerce system. But what I think about when I look at these is, "All right, these are perfect." And then I think, "Oh my word, they're getting even more complicated," when you're going from rules-based decision logic within these timeframes to now including one or more machine learning models. Because here's the thing: with those machine learning models, you may not have all the data in the payload, in the transactional information coming in.

The model may want six variables and you've got four coming in. What about the other two? Well, the other two are over there, and they're far over there. And that's going to introduce latency that is unacceptable in this environment, which is why you need a database that can accommodate serving those back to the models in real time. So again, there's this huge spectrum of latency requirements within real time. And in almost all of these use cases, you also have high concurrency. So, not only does it have to be super fast with increasing complexity, but it has to do it with very high concurrency.

So, this is why I think specialized databases exist, actually: to be able to accommodate not just these use cases, but the ways they're getting even more complicated.
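To sketch the pattern Mike describes, assume a hypothetical fraud model that wants six variables: four arrive on the transaction payload, and the two missing ones are served by key from a low-latency store. In Trino-style SQL, with all catalog, table, and column names invented for illustration, the enrichment looks like this:

```sql
-- Hypothetical enrichment for a real-time fraud model: four variables
-- arrive with the transaction; the other two are fetched by key from a
-- real-time store such as Aerospike, keeping lookup latency low.
SELECT t.txn_id,
       t.amount, t.merchant_id, t.channel, t.card_present,  -- on the payload
       p.avg_30d_spend, p.device_trust_score                -- fetched by key
FROM events.payments.transactions     AS t
JOIN aerospike.fraud.customer_profile AS p
  ON p.customer_id = t.customer_id;
```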

Adrian Estala:

Spot on.

George Demarest:

Yeah. So, just to further illustrate what our customers do, I want to talk about a couple really quickly. One is AppNexus. They are an internet technology company that does real-time sale and purchase of digital advertising. Now, I know that's not the greatest topic, but it's a great company. And look at their workload: the New York Stock Exchange does 4.2 million transactions per day, Nasdaq 11.2 million, and Visa, the company, 150 million transactions per day. AppNexus does 10 billion transactions per day, give or take 1.5 billion. So, an astonishing amount of processing, an astonishing number of database operations.

And they never close, ever. They have been running uninterrupted for years with our platform. Another one is The Trade Desk, another ad tech company, doing between 2 million and 4 million transactions, and I'm sorry, that's per second, not per day. So, 4 million transactions per second, all day, every day. And they're a platform that also cannot close. So, I just wanted to give people a notion of how you achieve low latency at extreme scale, and that's where we live.

And with that, I want to kick it over to Ishaan. Ishaan is one of the rising stars in our product group and knows this product and this area very well. Ishaan, what else can you say about these use cases and what our customers are doing?

Ishaan Biswas:

Yeah. Hey, thanks, George. So, just to tie in some of the concepts that Adrian and Mike talked about and what George mentioned earlier: there are these real-time applications that our customers are running, and the data that serves those applications lives in Aerospike. These are highly strategic, high-value data items, but they're locked up in Aerospike, accessible only to the applications, because that's where the data is transacted. Now, how do you democratize that, and how do you give access to more people who maybe don't speak the same language?

How do you give access to data analysts who want to look at trends over time, or at what's happening with that data, without moving all of it away from Aerospike into a data warehouse? This ties into some of the concepts Adrian talked about with the DataMesh: not having to ETL data, just processing the data in place. There's a lot of work we have done in collaboration with Starburst, and I'll talk about just one use case, a very canonical one that ties a lot of these concepts together and brings it home.

This is one of the big global brokerages that's our customer. On the front side, they use us for intraday trading operations, and they store all that data in Aerospike. A lot of those real-time requirements are driven by the customer, by the user-facing applications. So, when you're doing a transaction, you're making a trade, figuring out its risk profile, detecting whether it's fraud or not, getting more information about your trading profile. Those put very high requirements on the latency with which you need to get the data.

And also, this is a big global brokerage; they have millions of customers, so you need high concurrency as well. So, that's the front side, the right side of this picture, where the operations are happening. Then that data gets fed into the system of record on the mainframe, where they're reconciling all of that information for historical reasons. But in the middle, where the data actually sits, there are other constituents in the organization who want to look at that data. Think of the idea Adrian talked about of building a data product.

This is real-time data sitting in Aerospike, and it's not easily accessible without a tool like Starburst on top of it. So, this is one of the reasons we built the product called Aerospike SQL powered by Starburst: to provide SQL access. SQL is like the language of the streets, so to speak, for data analysts, and it's very popular. So, we're giving that access and democratizing the information so that more people can extract value from the data captured in Aerospike. Let's move to the next slide, and I'll describe a little bit about what the product is and what we do with it.

This essentially ties into the data product and DataMesh ideas that Adrian talked about. You have data sitting in Aerospike, real-time data for high-transaction use cases. Now, how do you report on that data without migrating it, without having to ETL it into a data warehouse? You process the data in place, and that's why we use the Trino query engine, which is supported by Starburst. We had a Trino connector that we built some time ago, and we found the perfect home for it in this product, which bundles the Starburst Enterprise platform, Aerospike, and the Trino connector.

That makes it easy for customers to deploy, so they're able to access the information, report on it, and analyze the real-time data that's in Aerospike. Without going into too much detail on the technical side: with this product, you're able to use the Trino CLI, use tools like Tableau and Looker, and build Jupyter notebooks to create more interesting views of that data, based on the perspectives of people other than the application developers.
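For instance, here is a minimal sketch of the analyst's side, with hypothetical catalog, schema, and column names: a trend query over intraday trades stored in Aerospike, run in place through the Trino connector from the Trino CLI or a BI tool.

```sql
-- Trades in the last hour by symbol, queried where the data lives;
-- no ETL into a warehouse, just SQL over the Aerospike catalog.
SELECT symbol,
       count(*)        AS trades,
       avg(risk_score) AS avg_risk,
       max(trade_ts)   AS latest_trade
FROM aerospike.trading.intraday_trades
WHERE trade_ts > current_timestamp - INTERVAL '1' HOUR
GROUP BY symbol
ORDER BY trades DESC
LIMIT 10;
```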

So, we've seen a lot of traction in the market, and with our customers specifically, in having this capability, and it ties perfectly into the data products concept that Adrian talked about. We are giving access to the data where it resides. Let me move to the next slide. This just ties the whole concept together with what you talked about earlier, George: Aerospike being able to provide predictable real-time performance at any scale, a hundred gigabytes or a petabyte or more. You get the same level of performance in terms of throughput and latency.

We're talking single-digit millisecond latencies here at any scale, and we support key-value and also JSON. So, it's a very powerful database for applications to use, but because of this marriage made in heaven with Starburst, we're able to extract more value from that data for other constituents in the organization. Adrian, if you can talk a little bit about Starburst and how that ties into Aerospike, that'd be great.

George Demarest:

Muted.

Adrian Estala:

Sorry, there was some noise in the background. I said, you've got me so excited about Aerospike, I'm not sure I'm ready to talk about Starburst. I watched a couple of videos last week just to make sure I understood your technology, and it's super impressive. I know you've only gone through a couple of slides here, but what you do is not easy. It is complex, and if we had a video trying to describe what you do and how you do it, most consumers would shut down in five minutes, because it's too much. But the point I'm trying to make is that you make access to complex data sets easy for the consumer. And for us on the other side, we're like, "Yeah, I don't know how they do it, but they do it."

Here's what I think is most interesting. There are data sets in the areas you currently access and make available that consumers have not had access to, at least not in this way. In the old way, they had to get at that data through certain kinds of applications, or they only got the results of work that somebody else had done. You're opening the door now and saying, "Hey, we can make this data available to you." These data sets that were impossible to get to with speed and reliability, as Mike said, we can reach now. And then with Starburst, we make it easy.

So, maybe Aerospike won't get all the credit. People are going to say, "Oh wow, Starburst made this part easy." But the reality is, as you said, it's that marriage that makes it work. And that's the real goal, going back to the consumers: you're opening the door to data sets they've never been able to use before, and we're giving them the ability to start to ideate and discover. I think the competitive advantage is going to come when these teams start to really access that data in new ways and use it in new ways. And so that's what's exciting.

Mike Gualtieri:

Yeah. And this is Mike; from our perspective here at Forrester too, Adrian, I completely agree with what you said near the beginning, when you showed the ecosystem slide with all these different vendors. You said, "Well, it's not just one vendor, it's not just one competing vendor." There's a necessary cooperation among vendors here, and partnerships, because of the incredible diversity of the types of data products and the architectural requirements: latency, amount of data. You can imagine use cases where, oh, well, we're transferring sound. It's sound files, audio files, or computer-

Adrian Estala:

Video.

Mike Gualtieri:

There are all kinds of different types. Or you can imagine even a complex data product that includes video along with some data from a database. So, when we look at this market, we look at the entire ecosystem, and we look at the partnerships and the cooperation among vendors, because we think that's key to implementing a successful DataMesh.

Ishaan Biswas:

Just to add to that, and one last point from me. From a technical perspective, there's this whole notion of separating compute and storage. This product allows you to do that, because Aerospike is a massively distributed database, and that's your storage engine, and with Starburst, and with Trino specifically, we bring in a massively parallel SQL engine, and both can scale independently. So, that gives you enormous power not only to access that data, but to access it in a really, really high-performance way.

Adrian Estala:

Absolutely.

George Demarest:

Wonderful. Well, that gets us toward the end of our presentation; this has been hugely interesting. I just want to summarize the takeaways of what we've been talking about: the mission of Starburst, delivering SQL insights at unheard-of scale, now with real-time data supplied by your Aerospike databases, is a key facet of a successful DataMesh and data product strategy. Starburst has been promoting this DataMesh vision and doing a lot with their products and their company, and in that way, they have really brought Aerospike along with them.

We should mention, I think we did mention, that Aerospike SQL powered by Starburst was actually the idea of that big customer Ishaan talked about; they wanted an integrated solution with integrated support. And as Mike suggested, that's what vendors can and need to do to make DataMesh easier, less painful, less expensive, more efficient, and so forth. And finally, DataMesh provides a vision, a set of behaviors, and a bunch of ideas about tooling and technology. It's a big topic, and lots of people are really very excited about it, as Mike noted.

So, at this point I want to get some final thoughts on anything we've discussed today. I'm going to turn first to Adrian. Adrian, what do you think?

Adrian Estala:

It was a great conversation. It went by too fast; we need more time. A year ago, I was a customer, on the other side, if you will. It always frustrated me whenever I had to integrate two different vendors or two different systems, and it happened all the time. It's like, "Hey, it'd be great if you could work with these guys." And they would say, "Yeah, we agree, you do it," and then I'm paying somebody to do the integration. What I love about this conversation is that you've made that easy for one customer, and you've made it easy for every other customer.

We've done the work on the backend to make our products work really well together to solve a really important use case, a use case that's going to create more valuable DataMeshes. We've made it easy for the customer, and I want to call that out because, as a customer, I always appreciated it when vendors did that. So, nice job here. Thank you.

George Demarest:

Thanks. Next Ishaan, any final thoughts about the topic today?

Ishaan Biswas:

Yeah. It's really hard to follow Adrian; he speaks very eloquently about the DataMesh and [inaudible 00:41:16]. No, this has been a very exciting conversation for me, to understand the DataMesh and how beautifully this product fits into that overall vision. So, I'm really excited for customers to extract value, and to see the value that some existing customers have already seen.

George Demarest:

Thanks. So, Mike, as our guest speaker, I'm going to give you the final word on the topic. Any final thoughts?

Mike Gualtieri:

Yeah. So, I go right back to the title of this webinar, which includes the words real time. I think when people are thinking about a DataMesh and formulating their strategy, there is low-hanging fruit, maybe in a data lake or a data warehouse, that you can extract from. And it's low-hanging fruit for a reason: it doesn't have these low-latency requirements, this immediacy. So, people can't think of DataMesh as having one uniform requirement for creating the products. They have to think about that spectrum, the latency spectrum. I think that's very important.

At Forrester, we've developed a framework we call perishable insights. The idea is that some of the insights you get from the data are perishable, right? That's like fruit rotting or something, except that in the world of data and applications, the timescale tends to be hours, seconds, milliseconds, and maybe even lower. So, companies need to identify the data products that are going to provide those perishable insights, and then apply the right technologies that can accommodate that within the DataMesh.

So, it's not going to be one universal product that solves that. It's going to require an ecosystem of the right underlying products to support that full range of latencies.

George Demarest:

Very good. Well, I want to end by first thanking my fellow speakers: Mike Gualtieri, our guest speaker from Forrester Research; Adrian Estala, VP and Field CDO at Starburst; and Ishaan Biswas, director of product management at Aerospike. And I'm George Demarest, from the Aerospike product team as well. I just want to give you some opportunities to learn more: there is another webinar about Starburst, Aerospike, and the product, and you can also check out anything on the Aerospike real-time data platform and Aerospike Database 6.

Also, Starburst has a great deal of good information on DataMesh, a lot of it authored by Adrian himself; the URL is starburst.io/data-mesh. And the Aerospike SQL product is at aerospike.com/sql. With that, I'm going to end the presentation. I want to thank everyone for joining us. I hope you're well wherever you are, and enjoy the rest of your day or night. Thanks, everyone.

Mike Gualtieri:

Thank you.

Ishaan Biswas:

Thanks.

About this webinar

As organizations continue to emphasize digital transformation, we are seeing a new generation of data products that involve real-time data, SQL, and JSON document stores. In this webinar, featuring Forrester Research, learn how Aerospike and Starburst work together to deliver exceptional real-time data products.

ANSI SQL has remained stable and consistent for decades even as data types, use cases, and analytics approaches have evolved aggressively during the big data era. With the success of the Starburst platform, Trino, and the Data Mesh concept, SQL is once again at the frontier of a new generation of data products that align with an organization’s structure and priorities. These data products are powering digital transformation activities across all industries and often involve real-time data and JSON document data stores residing in the Aerospike real-time data platform.

This webinar will explore how an integrated Aerospike and Starburst environment delivers on the promise of Data Mesh and the data products it inspires. Among the topics covered are:

– An executive overview of Data Mesh
– Discussion of real-time data products
– Best practices and how to get started

This webinar will be presented by Adrian Estala, Field Chief Data Officer at Starburst, Ishaan Biswas, Director of Product Management at Aerospike, and guest speaker Mike Gualtieri, VP and Principal Analyst at Forrester Research.

Speakers

Mike Gualtieri
VP, Principal Analyst
Ishaan Biswas
Director of Product Management
Adrian Estala
Field CDO @ Starburst