Achieving cache-level performance without storing data in RAM
**Sebastian:** Yes, hello. Here I am, Sebastian. I am responsible for marketing, and, uh, welcome everyone, um, for this webinar. Barett is going to talk for forty to forty-five minutes about, um, achieving cache-level performance without storing data in random access memory. But, uh, before we do a proper introduction, um, we are going to give it a few more seconds because I see people are dropping in.
**Barett:** Yeah, many, many people. Yeah, hi, everybody. I also see some familiar names, um, some ex-colleagues, uh, so welcome. I am very happy that you are here. Uh, so we are going to wait a little bit longer, I guess two more minutes for people to join before we start.
**Sebastian:** So, just out of interest, um, you guys can see the chat box. Maybe you can post where you are located. This is always interesting to see, um, where you guys are all spread all over. By the way, I am in Barcelona, but I am not Spanish; I am German. We are going to have a nice meet-up here tonight.
**Barett:** If you did not say you are German, nobody would have guessed.
**Sebastian:** Yeah, but I realize, um, walking through, um, Barcelona, everybody speaks Spanish to me, so at least I do not look like a tourist, which I am not, sadly. Okay, still people are coming in. That is very good.
**Sebastian:** So, yeah, so welcome, everyone. Uh, my name is Sebastian Eming. I am responsible for marketing at Aerospike in the European region, and, um, thank you very much for attending this webinar. We are going to record it, yeah, so if you miss something or if you want to rewatch or share it with your colleagues, um, I will send around the recording afterwards, so no worries. Um, you can also see a chat box up, and, um, please post any questions or comments, and we will discuss that when Barett is finished with the slides at the end of the webinar. Um, yeah, this is about it. Um, I leave it up to Barett to introduce himself and, uh, to get started. Yeah, looking forward.
**Barett:** Thank you, Sebastian, thank you very much. So, uh, again, hi everybody. My name is Barett Bab. I am a Principal Solutions Architect at Aerospike. My background is I have been a developer for most of my career. Some of the people in the audience, you know, we worked together—Pete, hi! So I worked in financial services for a number of years. Before that, I was in Internet of Things and industrial automation fields, but in the past seven years, I have been working for different NoSQL databases, and Aerospike is another NoSQL database company. We are going to talk a little bit about it. It is not necessarily only about Aerospike. We are going to talk about this concept that may sound a little bit strange, but before I want to start talking about the concept, I just want to tell you a quick memory. I remember when I was a kid, when I wanted to watch television—the television, you know, if you liked the program and you wanted to watch that program, you had to be in front of the television at the exact time. Possibly there were a couple of repeats for that program as well, but if you missed that, there was no other way to watch that program you wanted to watch.
I remember, at the time I was just a little boy, I went to my brother, who is older than me, and said, "Well, it would be great if you could watch these programs on demand." And I remember my brother laughed at me. I remember it exactly because it is, you know, one of those memories that you never forget. He laughed at me and said that it was impossible for television to show you something on demand: the antenna that was sending the signals would have to send a different signal to every television set in the country, and that is almost impossible—you cannot have television on demand.
But thirty to thirty-five years later, everything that we watch now is almost on demand. I mean, everything is possible. So, you know, the technology landscape is moving very, very quickly, and things that we think are impossible are becoming possible and are also becoming the norm. So, the topic that I am going to talk about is kind of like that. I am going to talk about how you can achieve cache-level performance without storing data in random access memory. You know, we usually store data in random access memory to improve performance, but…
**Sebastian:** Barett, can you mute yourself, please?
**Barett:** Thank you. So we usually store the data in random access memory to achieve the great performance, but, uh, you know, we want to talk about whether it is possible to store the data on something cheaper, like a disk, and achieve the same level of performance. So this talk is going to be about that. If you are ready, let us fire up the engines and basically get on this journey to learn how we can do that. But, by the way, I spent around one hour to create that animation that you just saw, so I just want to show this to you one more time because, yeah, it is one hour of my time creating this. Just look at it, it is glorious.
**Sebastian:** Okay, still people are joining, but I think we are ready to get started. Please continue, Barett.
**Barett:** Okay, great. So, if you want to store data in a computer, you have two main storage places. One of them is memory; the other one is a disk. And these two storage components have some opposing characteristics, right? So, for example, the memory response time is very, very fast—accessing memory works in the order of nanoseconds—but accessing the disk is in the order of microseconds, so it is like a thousand times slower. And the misconception that, "Oh, if you need speed, you have to keep everything in memory, and the disk can never be fast," basically comes from the idea that nanoseconds are significantly faster than microseconds.
Another difference is that memory has a massive throughput, right? So, the modern modules can handle something around fifty to sixty gigabytes per second of throughput. When you go to disk, it is slower. You know, disks have been improving in the past twenty years; currently, there are disks on the market that can handle around fourteen gigabytes per second, but it is still around an order of magnitude slower than memory. The other thing is expense: memory is expensive, and the disk is relatively cheap. I actually found something on Amazon, and I just wanted to show you to get an idea. So, a one-hundred-and-twenty-eight-gigabyte memory module is around one thousand one hundred pounds at the moment, and you can receive it with next-day delivery from Amazon. So if you look at the price per gigabyte, it is something around nine to ten dollars per gigabyte for random access memory. But if you go to disk—this is one of the fastest disks you can buy at the moment, the fourteen-gigabytes-per-second kind that I mentioned—a four-terabyte drive is around six hundred and ten pounds, which is roughly fifteen pence per gigabyte.
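To make the gap concrete, here is a rough back-of-the-envelope calculation using the rounded prices and capacities quoted above (list prices vary, so treat the figures as illustrative only):

```python
# Rough price-per-gigabyte comparison, using the rounded figures quoted above.
ram_price_gbp, ram_capacity_gb = 1_100, 128      # 128 GB RAM module
ssd_price_gbp, ssd_capacity_gb = 610, 4_000      # 4 TB NVMe SSD

ram_per_gb = ram_price_gbp / ram_capacity_gb     # ~8.6 GBP per GB
ssd_per_gb = ssd_price_gbp / ssd_capacity_gb     # ~0.15 GBP per GB (about 15 pence)

print(f"RAM: {ram_per_gb:.2f} GBP/GB, SSD: {ssd_per_gb:.2f} GBP/GB, "
      f"ratio: {ram_per_gb / ssd_per_gb:.0f}x")  # RAM is roughly 55-60x more expensive per GB
```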
So, if we go back here, you can see that random access memory is significantly more expensive than disk. The other thing is that memory is limited: as you saw, the module was one hundred and twenty-eight gigabytes, whereas disk is significantly larger. We saw a four-terabyte drive there; there are eight-terabyte and sixteen-terabyte drives, and you can have multiple of them on a single machine as well. So, you usually have significantly more disk on a single machine in comparison to the amount of memory available to you. And lastly, disk is persistent, meaning that if you write something to the disk, turn off the machine, and turn it on again, the data will still be there. But memory is volatile, meaning that if you write something into memory and turn off the machine, when you turn it back on, the memory will be wiped clean. So basically, the data is going to be lost from memory.
When you are creating a database system, or really any system that needs to store data, you need to pick and choose different data structures and, given these characteristics of disk and memory, decide where to place them. So, let us have a look at what a database needs in terms of storage. The first thing is that you need to store the data somewhere. That is possibly the simplest one: the data needs to be persistent, so the data has to go to the disk. We keep the data store part of the database on the disk. The other thing is the primary index. You have a primary index in any database that you are familiar with; it enables the database to quickly locate the data for the queries you send it. Because the primary index is a data structure that gets updated on every insert and has to find data quickly, it is better kept somewhere with massive throughput and low access time. Therefore, we usually keep the primary index in memory. Another component that requires storage is the secondary index. Very similar to the primary index, the secondary index is usually stored in memory as well. And then the last component that is very common between databases is a cache.
**Barett:** So, basically, the cache says that if data was written or read recently, keep it in memory so that when a request comes in, you can answer it from memory, which is fast, instead of going to the disk, which is slower. So, usually, the cache is stored in memory as well. The idea of caching is to improve performance, but possibly not in the way you are thinking about it, and I want to talk a little bit more about that.
You know, in a computer, you have a bunch of memory and a bunch of disk, and usually the disk is significantly larger than the amount of memory. When you create a database on top of these storage devices, your database is going to receive a bunch of requests. Some of those requests will find their data in random access memory, and you can respond to them very quickly, but some of the requests will not find their data in random access memory, so you have to go to the disk, and it takes significantly longer to get the data from the disk than to respond from memory. So some of the requests are going to be significantly faster, and some of the requests are going to be significantly slower.
Because of this, usually people think that increasing the amount of memory that is dedicated to cache, either by allocating more of the memory to the cache or increasing the amount of random access memory on your machines, can improve the performance of the database. But this is not necessarily correct. So, for that, I need to explain something to you, and it is basically how we measure response time.
To explain response time, think about this: imagine I start logging the requests that I am sending to my database over a very long period of time—let us say one year. For all of the requests sent to the database, I log the time it took for them to be answered. Then I sort this list and plot it on a chart like this. So, the shortest line is my fastest response, and the longest one is my slowest response. It is always going to look like this. We do not have any scale here—the fastest might be one millisecond or one second, the slowest might be twenty milliseconds or twenty hours—it does not matter, but the shape is always going to follow this. Basically, it means that a very large portion of the requests are going to be fast, and they are not going to be very different from each other, right? This is the normal situation of the system—all of those requests are going to be roughly in that area.
But we are talking about a system that does not have that much memory to cache any data, so you always go to the disk. So, it is going to be something like this. And then there is a very small percentage of the requests, again, this small percentage for a system might be ten percent, for another one it might be one percent, for another one it might be, I do not know, 0.1 percent, but it is going to be like that. These are the situations, you know, these bad requests, these bad response times, happen in situations where, I do not know, there was a failure in the system. I do not know, your disk failed, you had to restart one of the nodes to upgrade the operating system, there was a network glitch. I do not know, it was nine o'clock in the evening, and you sent a request, and Java was taking the garbage out. Garbage collection does not work like that, but anyway, you know what I mean. The garbage collection was happening at the time, so your response was stuck, and you did not get the response back. So, you know, these are the bad situations that can happen in any system during a long period of time. So, those bad requests are always going to be there.
Now, if you put a cache in front of this database—or, I mean, it can be an external cache that you put in front of the database, or it could be a database that has a caching component built into it—if you do something like that, the response time is going to become like this. Basically, you make the fastest queries that you were running on the database faster, but you do not do much about the slow ones, right? So, this basically breaks here; this is your cache hit rate. You know, if your cache hit rate is fifty percent of the requests, fifty percent of the requests that were already quite fast are going to become significantly faster. But that fifty percent that were not fast are going to remain almost the same; they are not going to change significantly. So, this is the effect of caching, whether the cache is built-in or an external system in front of the database.
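As a rough illustration of the two curves described above, the following sketch simulates a large number of request latencies and plots them sorted, once without a cache and once with a fifty percent hit rate. The distributions, the hit rate, and the outlier fraction are made-up numbers chosen only to reproduce the shape, not measurements of any real system.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 100_000

def simulate(cache_hit_rate):
    """Latency in ms: cache hits ~0.2 ms, disk reads ~2 ms, plus ~0.5% slow outliers."""
    hits = rng.random(n) < cache_hit_rate
    latency = np.where(hits,
                       rng.normal(0.2, 0.05, n),   # served from memory
                       rng.normal(2.0, 0.5, n))    # served from disk
    outliers = rng.random(n) < 0.005               # GC pauses, restarts, network glitches...
    latency[outliers] *= rng.uniform(20, 200, outliers.sum())
    return np.sort(latency)

plt.plot(simulate(0.0), label="no cache")
plt.plot(simulate(0.5), label="50% cache hit rate")
plt.yscale("log")
plt.xlabel("requests, sorted fastest to slowest")
plt.ylabel("response time (ms)")
plt.legend()
plt.show()
```

The cached curve drops only on the left-hand side: the requests that were already fast get faster, while the slow tail on the right stays where it was.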
What would be the effect of something like this? It has many implications, but I am going to talk about the four that I think are the most important. The first one is that when you build an application on top of a database with these characteristics, it might work on your machine, but when you take it somewhere else, it will not. Let me explain why I am saying that. When you are testing your application against the database, that database has a bunch of random access memory—either you are running it on your local machine through Docker, or you have a virtualized machine somewhere else—wherever it is located, it is going to have a bunch of random access memory. Even if you have just one gigabyte of random access memory allocated to that virtual instance, it still has a few hundred megabytes of cache available. So, unless you write more data than those few hundred megabytes, whenever you test your application against this database, you are only testing the cache; you are not testing the database. So you see one level of performance, but when you take it somewhere else and have a larger amount of data, it is going to behave differently.
You might say, "Oh, but I can take it to pre-production; pre-production has more data, has more resources, and I can test it correctly." But again, pre-production does not necessarily have the same traffic as your actual application, so the cache is going to be populated in a very different way compared to the real-world scenario. Because the cache is going to be populated differently, the things that were written or the things that were read in the previous test case were different from the actual use case, so you are going to see different behavior in your application. So something might work in pre-production, and you take it to production, and it just fails to operate the way you wanted it to. It can be even worse. You can take it to production, everything is fine, and it is working, but then suddenly a pandemic happens, and people start buying toilet paper left and right, and your cache is going to be invalidated for reasons that you do not know. Then everything starts failing, and you would not know why things are starting to fail, because everything was working correctly, and your system was built to scale to the level that you wanted, but now because the cache is being populated in a different way, you might see your application failing in production.
But I would say that the worst one is that increasing the size of the cache is not going to improve the latency of your application. This might sound a little bit bizarre, but let me explain it. Yes, we saw that if your cache hit rate is fifty percent, you are improving fifty percent of your requests. Fifty percent of your requests are going to run, let us say, ten times faster. But the problem is that you are not looking at single, individual requests. If you want to improve the performance of your application, you have to improve a bundle of requests. Let us go back to this Amazon page for an example. How many database calls do you think have been made to render this page? I am pretty sure there is one call to get the information regarding this product, possibly one call to get the items in my basket, and there are some other calls to get the items that were bought together with this product, another call to get the four-star and above related items, and some similar items here. As you can see, you usually create multiple calls to separate systems. It is not necessarily one system—there are separate calls to different microservices, each of which might have a different database to fetch the data.
The time that it takes for this page to load depends on the slowest-running task. In this case, sub-operation number five is taking the longest—let us say 200 milliseconds. So, that page is going to take 200 milliseconds to load. It does not matter that the others are faster; the slowest one dictates the page load time. Now, let us say we put a cache in front of our database, or increase the amount of memory in our database so it can cache more data. What happens is that some of these requests—the sub-operations from one to n—are going to hit the cache, and they are going to run faster. So, this pink line is basically showing that they are running faster. But if not all of them are answered from the cache—if even one of them is answered from the disk—the response time is still going to remain roughly the same. It might become a little bit faster, but you are not going to experience that order-of-magnitude improvement that you expect from putting a cache in front of the database. That is true even if your cache hit rate is ninety percent. If you are doing, say, five operations, 0.9 to the power of five is about 0.59, so only around sixty percent of your page loads are served entirely from the cache, and the rest do not experience much difference. So, a cache does not really improve latency.
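A tiny calculation makes the point: if a page fans out into n independent database calls and each one hits the cache with probability p, the whole page is fast only when every call hits. The hit rate and fan-out below are the hypothetical numbers from the example above.

```python
# Probability that an entire page load is served from cache when it fans out
# into n independent database calls, each with cache hit rate p.
def page_hit_probability(p: float, n: int) -> float:
    return p ** n

print(page_hit_probability(0.9, 5))   # ~0.59: even at a 90% hit rate,
                                      # ~4 in 10 page loads still wait on a disk read
```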
Now the question becomes: if a cache does not really improve latency, why does every single database on the market have some kind of caching? The answer is fairly simple: a cache does improve performance, but performance is not only about latency. A cache improves throughput—it increases the throughput your database can handle rather than reducing its latency.
Let us go back to this picture that I showed you earlier. We said that the requests going to the disk are going to take significantly longer than the requests going to memory. But there is one other thing here: in this one, we assume that fifty percent of the requests hit the cache. In this situation, only fifty percent of the requests went to the disk, while another fifty percent that should have gone to the disk were answered from the memory. If you remember at the beginning, we said that the throughput of memory is higher than the throughput of the disk. So, by putting a cache in front of your database, you can improve the throughput that your application can handle. That is the main reason that a lot of technologies put a cache in front of their disks.
Latency and throughput are not necessarily related to each other, but if you put too much pressure on one component—for example, if you are trying to read from the disk at a higher rate than the throughput it can handle—then tasks will be queued, and if tasks are queued, the latency is going to increase as well. We are not talking about a situation where the system is overloaded—in a normal situation where the system is not overloaded, this holds. But putting a cache in front of the database reduces the traffic going to the disk, and therefore increases the throughput that your entire system can handle. That is the reason that databases, and lots of other technologies, have a cache in front of the slower storage system.
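A minimal sketch of the throughput argument, with made-up numbers: a cache does not change how long one disk read takes, but it lowers the read rate the disk must sustain, which raises the total request rate the system can absorb before the disk queue starts to build up.

```python
# Hypothetical figures, for illustration only.
disk_limit_mb_s = 200          # roughly what a disk of twenty years ago could sustain
read_size_kb = 4               # assumed average record read

max_disk_reads_per_s = disk_limit_mb_s * 1024 / read_size_kb

for hit_rate in (0.0, 0.5, 0.9):
    # Only the cache misses reach the disk, so the system as a whole can accept more reads.
    total_reads_per_s = max_disk_reads_per_s / (1 - hit_rate)
    print(f"hit rate {hit_rate:.0%}: ~{total_reads_per_s:,.0f} reads/s sustainable")
```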
As I said at the beginning of this conversation, things have changed. The disks that we have today are around seventy times faster in terms of throughput than the disks that we had twenty years ago when most of the database technologies that you work with were designed. Around twenty years ago, the fastest disk that you could get was able to handle around 200 megabytes per second, and as you saw, a commodity disk that you can buy ten of from Amazon and receive them tomorrow morning for a reasonable price—600 pounds—those disks can handle 14,000 megabytes per second. So, it is seventy times better. You might say that everything is getting faster nowadays, but no, not everything else is getting faster.
The CPUs that we have today run at roughly the same clock speed as the computers we had twenty years ago—around two and a half gigahertz. We can overclock things, but normal CPUs twenty years ago were around two and a half gigahertz, and current CPUs are still around two and a half gigahertz. So, clock speeds are not getting faster. With CPUs, you are getting more cores on a single chip, so you have more processing power, but the speed at which each core processes data is still roughly what it was twenty years ago. It is almost the same with memory as well: the speed has roughly doubled in twenty years, and we have significantly more RAM on our modern machines, but the speed of the memory has not changed much. Modern disks, however, are seventy times better at handling throughput, and that is crucial.
We said we put a cache in front of the database so the database can handle more throughput, but if you can take advantage of the extra throughput that modern disks give you, you might be able to do something better than that. That gives you an idea of what would happen if you create a database that does not have a caching layer at all. That is basically the idea of Aerospike.
Aerospike is a database that stores all of your data on disk. It does not cache any part of the data in memory, but it provides you with very, very interesting performance, which I am going to talk about in a second. So, let us go back to this picture again. We said that when we are creating a database, the data store is going to be on disk, our indices are going to be in memory, and we cache data in memory as well. If we drop the cache, we can dedicate 100 percent of the memory to our indices. So now, if we go back to this diagram that we had—a bunch of memory and a bunch of disk in our database—we can do something like this. We can create a massive data structure that has the location of every single data item in the database in random access memory, and then keep the data on the disk without any data structure.
When a request comes in, you find the exact location of that data—down to the exact sector on the disk. With one access to the disk, you read the data, and now you have it ready to return to the client. If we do something like this, the result is going to be the following. We said memory is very fast, and we said that the response time of accessing memory is in the order of nanoseconds. Even if you have billions of objects in your database, the index for all of them is in random access memory—I have drawn a tree here, but it does not matter—you have some kind of data structure that usually works in the order of log n or n log n, so it can find the data you are querying with a few accesses to memory. You find the location of the data on disk in a few nanoseconds.
Then, we said we know the exact location of the data on the disk. Modern disks, besides being very fast, do not have the random-access problem we had twenty years ago: you can access any part of the disk very quickly—it usually does not matter much whether you are reading sequentially or randomly—and they also provide parallelism. Modern disks can handle hundreds of parallel reads at the same time. So you can very quickly go to the disk and say, "Give me this many bytes from this sector," and read it, which is going to be in the order of microseconds. So, nanoseconds to find the location on the disk, and microseconds to read the data from the disk. Then you put it on the network to return it to the client, which is going to take hundreds of microseconds. Hundreds of microseconds for network latency, a few nanoseconds for memory latency, a few microseconds for disk latency—putting it all together, the response time of this request will be in the hundreds of microseconds.
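Here is a highly simplified sketch of that idea (not Aerospike's actual implementation): keep an in-memory map from each key to the exact offset and length of its record in a raw data file, so a read costs one dictionary lookup in RAM plus one positioned read from the device. All names and record layouts are made up for illustration.

```python
import os

class DiskStore:
    """Toy key-value store: index in memory, data on disk, one disk read per lookup."""

    def __init__(self, path: str):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT)
        self.index: dict[bytes, tuple[int, int]] = {}    # key -> (offset, length)
        self.write_offset = 0

    def put(self, key: bytes, value: bytes) -> None:
        os.pwrite(self.fd, value, self.write_offset)     # append the record to the file
        self.index[key] = (self.write_offset, len(value))
        self.write_offset += len(value)

    def get(self, key: bytes) -> bytes:
        offset, length = self.index[key]                 # nanoseconds: in-memory lookup
        return os.pread(self.fd, length, offset)         # microseconds: one disk read

store = DiskStore("data.bin")
store.put(b"user:42", b'{"name": "Ada"}')
print(store.get(b"user:42"))
```

A real system also needs index durability, defragmentation of the data file, and concurrency control, which is exactly the difficulty mentioned a little later in this talk.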
Basically, this number is in the hundreds of microseconds because your network is in the order of hundreds of microseconds; you would not notice the faster steps, because the network dictates how fast your response is going to be. So, the interesting thing here is that with something like Aerospike—a database with this cache-less mechanism, where no part of the data is cached in memory—if you instead configure Aerospike to store the data in random access memory, it basically works as a cache. You can configure Aerospike to do that. If you do, it becomes like this: you store your data in memory as well, and still, the response time will be in the order of hundreds of microseconds.
So, the performance that you get from this architecture, with all of the data and the index in memory, is still going to be in the order of hundreds of microseconds. Basically, you have two systems—one stores data on disk, and the other stores data in random access memory—and from an observer's point of view, both of them work with sub-millisecond latency. That is what I said at the beginning: you can achieve performance similar to a cache without storing data in random access memory. This is how you can do it.
I have to add a couple of caveats here. The first is that if you do not have this network latency—if it is a local process and you are deciding between reading from the disk and reading from memory—then reading from memory will be thousands of times faster. There is not much you can do about that. But you can have a database, or another system sitting across the network, that reads the data from disk and performs similarly to technologies that keep the data in random access memory.
One other thing is that this seems very simple, right? You might be wondering why nobody else is doing it. Creating such a system is actually quite complex. The idea is as simple as I explained—there is nothing more to it—but implementing it—keeping a very large index in memory while keeping its size small enough that you are not wasting memory, always reading the data from the disk, and efficiently relocating the data written to the disk—is a very, very difficult task. That is what we have done in Aerospike. But the idea of how Aerospike can achieve performance similar to a cache that stores data in random access memory is as simple as I said.
Then, the other question that you might have here is, "Well, if Aerospike is performing very similarly to a cache by storing the data on disk, why do we have a mode that allows us to store the data in random access memory?" The answer to that is going to be another talk, but I do have a few articles that explain why a cache might still be relevant to your use case, which I am going to share with you at the end of the presentation. But that is the idea: Aerospike can achieve cache-level performance without storing any data in memory.
What happens if you use something like Aerospike compared to other databases that you are familiar with, all of which have a caching layer? MongoDB, for example, has a caching layer, and the recommendation is to have fifty percent of your data in the cache. So, if you have a one-terabyte dataset, you need five hundred gigabytes of memory available for the cache. If you have a replication factor of three, that five hundred gigabytes becomes one and a half terabytes, and do not forget that MongoDB also has an index, which is also in memory, so all of that is going to be there too. With Aerospike, you only need to keep the index in memory; the data can reside on the disk, which, as you saw, is significantly cheaper than memory.
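As a rough sizing comparison using the figures from this example (a one-terabyte dataset, a cache sized at fifty percent of the data, a replication factor of three, and the rough one-tenth-of-data index size mentioned later in this talk), the arithmetic looks like this; all numbers are illustrative, not a sizing guide.

```python
# Back-of-the-envelope RAM requirements, per the example above. All figures rounded.
data_tb = 1.0
replication_factor = 3

# Cache-based database: recommendation quoted above is to hold ~50% of the data in memory.
cache_fraction = 0.5
ram_cache_based_tb = data_tb * cache_fraction * replication_factor   # 1.5 TB, before indexes

# Index-only approach: keep just the primary index in RAM.
# The one-tenth-of-data figure is the rough heuristic used later in this talk.
index_fraction = 0.1
ram_index_only_tb = data_tb * index_fraction * replication_factor    # 0.3 TB

print(f"cache-based: ~{ram_cache_based_tb} TB RAM, index-only: ~{ram_index_only_tb} TB RAM")
```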
The other thing that happens is shown by this gray line—basically, the gray line you see here is a database that does not cache much data, does not have enough memory, or does not have a built-in caching layer, and its performance looks like that. When you increase the amount of cache, the performance starts to look like this blue line. But with a database like Aerospike that does not have a caching layer, all of the requests perform as fast as the cached requests of a database that has a caching layer—there is no portion of the data that is slower to access.
Some of these bad things can happen to Aerospike as well. If there is a network glitch or if you need to perform maintenance, these bad things would still happen. But because Aerospike is significantly faster, these bad situations are recovered much more quickly than in other technologies.
**Barett:** Okay, let us recap what we have discussed so far. We said that Aerospike can achieve the performance of an in-memory database while the data remains persistent. If you store all of the data in memory, it is going to be very fast, but it is not going to be much slower if all of the data is stored on the disk—and on disk, the data is persistent.
The next point is that it does not require a large amount of memory to deliver high performance. If you assume that the size of the index is roughly one-tenth of the size of your data, you can significantly reduce your dependency on random access memory while still getting the performance that you want.
The entire dataset is going to be accessible with the same response time. Lots of other databases—let us say Couchbase, which is made with the idea of having a cache in front of the database—if you read part of the data that is in the cache, it is going to be very fast, but if the data is not in the cache, you have to go to the disk, and it is not going to be as fast. With a technology like Aerospike, it does not matter if you wrote or read this data two minutes ago or ten years ago; when you access the database, the response time is going to remain the same across the entire dataset that you have.
In this talk, I mostly told you the story of read-heavy workloads; I did not say much about write-heavy ones. Aerospike writes behave very much like its reads—they, too, perform similarly to a cache. I just did not go through it because we have limited time and I did not want to confuse you with all of the details, but Aerospike can handle read-heavy, write-heavy, and mixed workloads similarly to a cache.
The next one is that this faster performance does not require more resources. Usually, when people think they want to run something faster, they think about adding more resources to the system. Aerospike does not need more resources. Actually, there is an interesting thought process about computers: the only way you can make an application fast is by making it use fewer resources. An algorithm that requires fewer CPU cycles is faster than an algorithm that requires more CPU cycles. Aerospike is faster because it uses fewer resources—it uses fewer accesses to the disk and requires less random access memory.
Lastly, because it uses fewer resources, it is more sustainable. Producing all of these resources generates a lot of CO2 emissions, and running them uses a lot of energy, which results in further emissions. By using a more efficient and faster solution, you can reduce those emissions and make your company more sustainable.
As I promised, here are three articles that discuss bits and pieces of this talk, and there is more information in them as well. The middle one is "Is Caching Still Necessary?" where I talk about why caching might still be relevant to your use case, which you can read if you are interested. I think Sebastian is going to send you an email with the links to these articles.
With that, thank you very much. I am happy to answer any questions you might have.
**Sebastian:** Barett, thank you. I think it is great that there is so much more content and potential for more webinars, to be honest, through different channels. I did get some questions. I think a very classic question here is, what types of companies typically use Aerospike?
**Barett:** There are many different types. Aerospike is a company that is trusted by many technology vendors. You can check our website. From companies like PayPal to Barclays to lots of other very famous companies, many are relying on Aerospike. I would say the typical companies that use Aerospike are innovative companies that want to compete in the market and provide the best performance to their customers. You can see people in financial services relying on Aerospike. We have companies working in more modern financial services like blockchain and Bitcoin using Aerospike. We also have ad tech companies, multimedia companies, gaming companies—well, I do not know if you are young enough to know what Snapchat is, but they are using Aerospike heavily in their systems. So yes, many companies are using Aerospike.
**Sebastian:** This is great. Another question that came up was, what kind of infrastructure is needed to run Aerospike effectively?
**Barett:** Aerospike is not really dependent on any specific hardware as long as you give it modern disks, and by modern disks, I just mean SSDs. Aerospike does not work really well with spinning disks, but as you possibly know, spinning disks are old technology now. I have not seen any new spinning disks in the market in the past four or five years. As long as you have a modern SSD, preferably with the NVMe interface, Aerospike will work with that, and you would not have any problem running it on your laptop, on a server, or on any cloud provider.
**Sebastian:** Here someone seems to be already thinking about running Aerospike because the question here was, how challenging is it to run and maintain Aerospike?
**Barett:** That is a good question. Aerospike is a distributed system—we did not cover the distributed part here—so it is a multi-node system, and you can scale it up and down very quickly and easily. That is one side of it. Aerospike also has a unified architecture—it is just one process. Most databases have lots of different processes that you have to run, and if one of them fails, part of the functionality of the database is lost. Aerospike does not have that. It is just one process—you run it, and it works fine out of the box. You do not really need to tune it. It is written in C, so there is no Java garbage collection tuning—it just works. You do not need to do much. As I said at the beginning of this talk, I worked with lots of different database technologies as a developer and as an administrator, and I have worked for several database companies in the past seven years. I would say that Aerospike is very, very easy to run in comparison to the other technologies.
**Sebastian:** This is good news. My question is, if it took you one hour to create that animation with the rocket on slide one, how long did it take you to design this inspiring last slide here?
**Barett:** Well, you know, I did it myself. I just went to ChatGPT and wrote, "Create this beautiful thing for me." It is not able to spell correctly, but the picture is quite cool.
**Sebastian:** Yeah, it is. Thank you, Barett. Before I let everyone go, I have some announcements here. Let me just share my screen. Can you see my screen?
**Barett:** Yes.
**Sebastian:** So, here you can see the upcoming events. I will share this page with you—I just want to be a little bit practical, so you can always stay up to date on what is going on. For example, here is the webinar everyone is participating in, and then there is the meetup we were talking about—the reason I am here in Barcelona. More than one hundred people have already signed up at Criteo's office, one of our biggest ad tech customers. This will be great. Tomorrow, at TomTom, we will have a great meetup—also more than one hundred people—where TomTom will explain the story of how they are using Aerospike. Then, in the US, in New York, our founder, Srini, will host a meetup; it is just a six-hour flight from Frankfurt. Then next week, there will be another webinar on a global level with someone from Forrester and Aerospike.
Just for everyone to know, please have a look at this on a regular basis, and you will know what is going on. This will also benefit you because we have a really nice LinkedIn page. Here you can see Barett. Barett already promoted Aerospike, and you can find a lot of cool stuff here—not just events but also a lot of content. So please sign up. I will send these two things right away, now, after this webinar, and in the coming days, I will share the recording with everyone, and that should be it. It has been almost an hour, but it seemed much quicker. Thank you, Barett, for making this such an interesting webinar, and let us do it again.
**Barett:** Thank you very much. So, we will have a meetup tomorrow in Berlin. If you are based in Berlin, come in and listen to the exact same talk live. We will have more content coming up in November—we are going to be in Stockholm for another meetup. I am preparing another talk as well. You saw some of the articles that I have written around this topic. I have another article that I am writing at the moment, where I am going to talk about one-tier and multi-tier response time systems. If you are interested, follow our Aerospike page—it will be published there. You can add me on LinkedIn as well. I would be happy to have you in my connections, and you can follow up on those conversations as well. Thank you very much.
**Sebastian:** Okay, bye. Cheers.
About this webinar
Want to know how you can revolutionize data processing? Behrad Babaee, Principal Solutions Architect at Aerospike, will explain innovative, cost-effective caching technologies that go beyond traditional RAM dependence. You’ll learn:
- How to enhance system performance and scalability with these cutting-edge solutions
- The latest advancements in non-volatile memory and disk-based caching
- How companies are achieving high throughput cost-effectively