Scaling for the next era: Aerospike and HPC data centers in action

Discover challenges in scaling computing systems and traditional databases to handle next-gen applications requiring real-time data processing and analytics at unprecedented speeds.

Steve Tuohy
Director of Product Marketing
September 17, 2024|9 min read

As the world accelerates toward the next era, the demands on telecommunications (telecoms) data infrastructure are set to exceed anything we have encountered before. The immense scale of data processing, real-time analytics, and high-performance computing (HPC) requirements of next-generation applications such as artificial intelligence, 6G, mixed reality, autonomous vehicles, and the Internet of Things (IoT) are pushing the boundaries of current technology. 

Growing demands of AI on infrastructure

The transition to the next generation of computing will dramatically increase the demands on artificial intelligence (AI) infrastructure. Traditional computing and storage solutions are reaching their limits, especially as we push towards quantum computing and beyond. AI models will need to scale up and out to support next-generation applications, which require real-time data processing and analytics at unprecedented speeds. 

Challenges of existing approaches to large-scale demands

Scaling computing systems for this new era involves overcoming significant challenges, particularly in data processing and infrastructure. As these systems grow more complex and the demands on them increase, ensuring scalability without compromising performance is vital.

Challenge: Limits of traditional databases (RDBMS, ANSI SQL, NoSQL)

But there’s more to it than bigger, more powerful servers and faster communications pipelines. Many of these new projects will rely on databases. Today, only some high-end applications, such as telecommunications and 6G communication, need sub-millisecond latency and scalable infrastructure; next-generation networks will need both for many tasks. Traditional databases will not be able to meet these demands for scale and speed, particularly in telco and 6G environments.

One example is the predicted increasing use of virtual reality, augmented reality, and other forms of mixed reality, says Dr. Theresa Melvin, Aerospike’s Principal Consultant specializing in HPC/AI integrations. “The problem the telecommunications industry was trying to solve is that, by 2030, there will be approximately 800 million, maybe more, mixed reality headsets. Nobody knows for sure — it just depends on whether industry and manufacturing adopt these smart solutions.”

Combined with AI, that means even heavier demands. “Think of a plane flying through the air—the goal is to fix that plane in flight using mixed-reality headsets,” Melvin says. “To make that happen, you need an end-to-end (artificial intelligence) pipeline capable of sub-10 millisecond speeds. This requires an ingest tier capable of nanosecond speeds.”

A glimpse at the problems of the future was apparent during the COVID-19 pandemic, Melvin says. “What I found out while working at HP, especially when COVID hit and high-performance computing labs were taken over by the US government for vaccine trial acceleration, is that only so many databases can actually run on high-performance computers,” she says. “Some of them quite literally can’t scale up—they can only scale out. When you need them to scale up, they can't.”

Solution: Aerospike and high-performance computing

Members of the telecommunications industry sponsored Melvin’s research on an end-to-end HPC exascale AI pipeline that included all the systems and processes involved in managing and processing data at exascale levels. It was intended specifically for building out a 6G use case to help telecommunications companies design their networks for mixed reality and to build out that infrastructure in preparation for the next generation of wireless communication networks. 

The HPC exascale pipeline Melvin developed was engineered to enable real-time decision-making and predictive analytics — capabilities critical for expected applications such as autonomous vehicles and augmented reality. The challenge isn't just about supporting this massive number of devices but also about delivering the ultra-low latency and high throughput required to power the experiences users expect from these devices.  

"When a customer encounters a network issue, everything that hits that network for the customer is perceived as a telco problem, even though that headset does not belong to the telco vendor," Melvin says. To ensure satisfactory user experiences, telecommunications providers need to "head off those issues before the customer is even aware of them." 

Melvin says traditional computing and storage solutions, which have served us well until now, are becoming increasingly inadequate. That is why HPC is required.

"Classical networks are at the theoretical max of what they can do,” Melvin explains. “As the chip infrastructure gets smaller, it automatically hits quantum limitations or reaches quantum physics.” This is a significant hurdle, and the industry must rethink its approach to building infrastructures that sustain the speed and scale required for this incoming era. 

Benchmarking success: Real-world proof points

The collaboration between Aerospike, HPE, and ScaleFlux showcases the potential of these integrated solutions. A benchmark test using Aerospike 5.7 with 25 billion unique key values on a single HPE Superdome Flex system (800 cores and 30 TB of RAM) with ScaleFlux CSDs achieved end-to-end transfer times of 400 nanoseconds. That’s approximately 2.8 million Aerospike transactions per second on a single server.
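
For a rough sense of scale, the figures quoted above can be put through some back-of-the-envelope arithmetic. This is only a sketch: it assumes decimal units (1 TB = 10^12 bytes) and, for the throughput estimate, that transfers complete strictly one after another. The benchmark description states neither assumption, and the variable names are illustrative.

```python
# Figures quoted in the benchmark description.
KEYS = 25_000_000_000        # unique key values
RAM_BYTES = 30 * 10**12      # 30 TB, ASSUMING decimal terabytes
CORES = 800
TRANSFER_NS = 400            # end-to-end transfer time, in nanoseconds


def serial_rate_per_second(latency_ns: float) -> float:
    """Operations per second if each takes latency_ns and none overlap (assumption)."""
    return 1e9 / latency_ns


ram_per_key = RAM_BYTES / KEYS            # bytes of RAM available per key
keys_per_core = KEYS / CORES              # keys per core if spread evenly
serial_rate = serial_rate_per_second(TRANSFER_NS)

print(f"RAM available per key: {ram_per_key:.0f} bytes")
print(f"Keys per core: {keys_per_core:,.0f}")
print(f"Serial rate at {TRANSFER_NS} ns: {serial_rate:,.0f} ops/s")
```

Under those assumptions, a single non-overlapping stream tops out at 2.5 million operations per second, the same order of magnitude as the roughly 2.8 million transactions per second quoted. The source does not say how that figure was derived, so the gap should not be read as more than a rounding-level difference in method.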

“The types of use cases we do this for are specifically for high-performance graph, high-performance generative AI, and everything that would be a version 7.1 Aerospike use case: graph, vector, generative AI, optimized GeoJSON use cases for Elasticsearch,” Melvin says. “Then we have all the additional benefits of Aerospike V7, like our optimized memory format. Aerospike doesn’t use much memory, so this system is absolutely perfect. It’s not going to use much RAM, so someone like me, who’s an AI pipeline developer, knows exactly how to architect that for something that is somewhat RAM-intensive but isn’t going to leverage the CPU quite as heavily. We look for that perfect balance, and that is exactly how you leverage these types of systems.”

Key specifications and solution building blocks

On the hardware front, Melvin used the HPE Superdome Flex server, which she praised for its architecture. It uses non-uniform memory access (NUMA), which enables a processor to gain access to its own local memory faster than to non-local memory. “In that case, you’re leveraging the NUMA architecture through and through,” Melvin says. 
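
To make the local-versus-remote distinction concrete, here is a minimal sketch that lists which CPUs belong to each NUMA node on a Linux host. It assumes the standard Linux sysfs layout (`/sys/devices/system/node/node*/cpulist`). The helper names `parse_cpulist` and `numa_topology` are illustrative and are not part of any Aerospike or HPE tooling.

```python
import glob
import os
import re


def parse_cpulist(text: str) -> list[int]:
    """Expand a kernel CPU list such as '0-3,8-11' into individual CPU ids."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # memory-only nodes have an empty CPU list
        if "-" in part:
            start, end = part.split("-")
            cpus.extend(range(int(start), int(end) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_topology(root: str = "/sys/devices/system/node") -> dict[int, list[int]]:
    """Map NUMA node id -> CPU ids; returns {} when the sysfs path is absent."""
    topology: dict[int, list[int]] = {}
    for path in glob.glob(os.path.join(root, "node[0-9]*")):
        match = re.fullmatch(r"node(\d+)", os.path.basename(path))
        if not match:
            continue
        with open(os.path.join(path, "cpulist")) as handle:
            topology[int(match.group(1))] = parse_cpulist(handle.read())
    return topology


if __name__ == "__main__":
    for node, cpus in sorted(numa_topology().items()):
        print(f"node {node}: {len(cpus)} CPUs")
```

On a multi-socket machine, each node's CPUs reach that node's memory fastest, so a latency-sensitive process is commonly pinned to one node (for example with `numactl --cpunodebind=0 --membind=0`) to keep its memory accesses local.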

“HPE’s Superdome Flex server, now rebranded as the HPE Compute Scale-up Server 3200, is ideal for high-speed data ingest and processing required by AI-driven applications,” Melvin says. “This setup allows for seamless integration with supercomputers, enabling advanced AI capabilities like generative AI and real-time analytics.”

The system Melvin set up also includes computational storage devices (CSDs) from ScaleFlux for their high performance. “I use ScaleFlux as often as I can when milliseconds matter,” she says. 

Networking also plays a role, and Melvin counts it as another advantage of the HPE Superdome Flex server, which supports higher transmission speeds. “You've also got 100 GbE+ network connectivity or InfiniBand connectivity,” she says. “For an HPC audience, this particular hardware supports InfiniBand as well as other network architectures. It may be the only architecture that allows you to ingest directly from TCP/IP and egress to InfiniBand.” Combine that with NUMA, and “you still have the networking available for the client traffic and can still use InfiniBand for external deep learning,” she adds.

Future developments promise even higher performance. “You saw the old Superdome Flex that we just tested in the test bed,” Melvin says. The new version with Sapphire Rapids, Intel’s fourth-generation Xeon processors, “has a much smaller footprint. It would go from one frame to five frames, where we would have 80 processors, as many as 9600 cores, and two petabytes in a single footprint. With internal storage, with the Electra 4140s, it would be a much cheaper storage option. One of those would have two petabytes in a single unit. It would make much more sense to do a distributed S3-compliant storage format, which all kinds of vendors could tap into.”

These benchmarks underscore Aerospike's capabilities when paired with state-of-the-art hardware. “With HPE Superdome Flex, the latest and greatest system, you’ve got 16 processors on a single frame: 960 cores, 32 TB of RAM, and 1900 threads,” Melvin says. “Aerospike would just go nuts with that. It would have so much fun with that.” 

Combined with InfiniBand, this gives Aerospike the nanosecond ingest speed future applications will require, Melvin says. “This requires an ingest tier capable of nanosecond speeds. This is not your typical key-value store. This is why I brought Aerospike to the telco industry's attention. This is also not a scale-out use case for commodity hardware,” she says. “This is why I love Superdome Flex, now CSS 3200. It allows me to pipeline, so I can ingest using Aerospike at incredibly high speeds and egress directly to my supercomputers, doing all my GPU deep learning algorithm work on the back end for all my high-performance computing engineers.”

HPE isn’t the only server manufacturer that could do this, Melvin notes. “Aerospike is vendor-agnostic, so I definitely want to emphasize that,” she says. “This can be built for anyone.”

These benchmarks are not just theoretical; they give organizations looking to scale concrete figures to plan against. The ability to serve tens of billions of records at millions of transactions per second with sub-millisecond latencies is crucial for the systems that will drive future innovations.

The future era

As we look towards the future, the development of networks will be shaped by the ability to scale effectively. Organizations such as Aerospike are leading the way, providing the tools and infrastructure necessary to support the next generation of applications. Collaboration between industry leaders will be crucial in building powerful and scalable systems.

Scaling for the next era requires a comprehensive approach that includes advanced data infrastructure and cutting-edge technology. By using Aerospike’s high-performance database and the latest HPC innovations, organizations can position themselves at the forefront of this technological revolution, ready to meet the demands of the future.
