Blog

Understanding elasticity and scalability in cloud computing

Explore the differences between scalability and elasticity in cloud computing and discover strategies to enhance system performance.

January 15, 2025 | 10 min read

DevOps teams must understand scalability and elasticity to build systems that adapt to growth and workload changes, maintain performance, and effectively control costs.

Scalability is a key component of system design that enables a system, network, application, or process to efficiently manage growing workloads and/or increased demand by adding resources to existing infrastructure. Elasticity is closely related to scalability, but the focus is on automation that allows resources to be added or removed seamlessly to align with fluctuating or expanding workloads.

While this blog post provides introductory material, the book Aerospike: Up and Running offers practical guidance for DevOps professionals on scaling and elasticity strategies in Aerospike's distributed architecture. It explains how to implement both vertical and horizontal scaling to optimize resource use while maintaining low-latency performance and high availability. The book also covers best practices for managing transactions in demanding workloads, providing actionable insights for building and maintaining scalable, reliable systems.

Vertical scaling

Vertical scaling, often referred to as "scaling up," involves increasing the capacity of an existing server or resource. This can be achieved by adding more powerful hardware, such as faster CPUs, additional memory, or increased storage. 

Example of vertical scaling

For instance, if a database server struggles with high traffic, vertical scaling might involve upgrading its RAM from 16 GB to 64 GB or replacing its CPU with a more powerful model. Cloud providers like AWS and Azure make this process straightforward, offering options to resize virtual machines based on specific requirements.

Benefits of vertical scaling

Upgrading an existing machine is a simple and direct solution to performance issues. By enhancing a server’s resources, workloads remain localized to the same machine, avoiding the complexities of provisioning a new machine and redistributing data. Many cloud providers further streamline this process by offering intuitive dashboards and APIs, allowing IT teams to scale vertically with minimal manual effort.

Limitations of vertical scaling

Despite its simplicity, vertical scaling comes with inherent limitations. Single machines have finite hardware capabilities—there’s a maximum amount of memory, CPU cores, or storage capacity that can be added. Furthermore, some upgrades necessitate downtime, during which the server becomes unavailable, potentially interrupting critical operations. 

Vertical scaling best suits centralized systems with slow, predictable growth, such as legacy relational databases or monolithic applications.

Horizontal scaling

Horizontal scaling, or "scaling out," focuses on adding more servers or machines to share the workload. Instead of upgrading a single machine, horizontal scaling distributes demand across multiple systems. Horizontal scaling is increasingly preferred in modern distributed systems, microservices architectures, and cloud-native applications. It is the dominant approach for systems that require high availability, fault tolerance, and the ability to handle consistently growing demand.

Example of horizontal scaling

A web application experiencing heavy user traffic might deploy additional web servers. With technologies like Kubernetes or Docker Swarm, these servers can be managed collectively to balance the load efficiently. For instance, scaling from one server handling 1,000 users to four servers could increase the capacity to handle 4,000 users.

Benefits of horizontal scaling

Horizontal scaling provides virtually unlimited capacity by enabling organizations to add as many servers as necessary – the only limit is the network's speed. This approach also enhances fault tolerance since workloads are distributed; the failure of one server does not compromise the entire system. More importantly, horizontal scaling aligns with modern architectural paradigms such as microservices and serverless computing, emphasizing modularity and resilience.

Running operational workloads with Aerospike at petabyte scale in the cloud on 20 nodes

Discover how to achieve sub-millisecond performance at petabyte scale while cutting costs by up to 80%. Download the Aerospike white paper developed with Intel and AWS to power real-time applications with unmatched efficiency and reliability.

Challenges of horizontal scaling

Implementing horizontal scaling introduces additional layers of complexity. Managing multiple servers requires robust infrastructure, including load balancers to distribute traffic evenly, distributed file systems to maintain data consistency, and monitoring tools to ensure performance. These requirements can lead to higher costs, not only for hardware but also for the technical expertise needed to configure and maintain the infrastructure effectively.

Horizontal scaling is ideal for distributed applications, microservices, and environments with unpredictable demand spikes, such as e-commerce platforms during sales events.

What is elasticity?

Elasticity refers to a system’s ability to adjust resources dynamically in response to varying workloads. The term likely originated with Amazon’s creation of its Elastic Compute Cloud in 2006. Elastic systems scale up during periods of high demand and scale down when demand decreases, optimizing resource utilization.

Example of elasticity

A streaming platform like Netflix might experience a surge in viewers during a popular event. Elastic cloud infrastructure automatically allocates additional servers to handle the traffic spike and reduces resources afterward. AWS Auto Scaling or Azure Scale Sets enable this flexibility based on predefined performance metrics.

Benefits of elasticity

Elasticity optimizes costs by ensuring businesses pay only for the resources they actively use, avoiding unnecessary expenses during low-demand periods. Elastic systems provide immediate responsiveness, scaling resources up or down in real time to maintain consistent performance during traffic surges. Additionally, the automation of resource adjustments improves operational efficiency and reduces the need for manual intervention, allowing IT teams to focus on more strategic priorities.

Elasticity is particularly useful for dynamic workloads, including gaming platforms, video streaming services, and applications with seasonal traffic variations.

Comparing scalability and elasticity

While scalability and elasticity overlap, they address different operational challenges.

Scalability overview

Scalability focuses on increasing the system’s capacity to handle higher workloads over time, whether by enhancing existing resources (vertical) or adding new ones (horizontal). Predictable, long-term growth scenarios benefit from scalability, often requiring careful planning and infrastructure adjustments. Tools like Kubernetes scaling, AWS EC2 Auto Scaling, and Azure VM Scale Sets provide robust options for implementing scalability in diverse environments.

Elasticity overview

Elasticity ensures a system can dynamically respond to real-time changes in workload demands by adding or removing resources as needed. Designed for unpredictable, fluctuating workloads, elasticity relies on automation based on performance metrics. Serverless architectures, such as AWS Lambda and Azure Functions, exemplify elasticity by scaling resources instantaneously without manual configuration.

Use cases for scalability

Scalability is a critical consideration across industries, enabling systems to handle growing demands without sacrificing performance or reliability. Below, we explore specific use cases where scalable infrastructures and technologies are essential.

Data warehousing

Scalable data warehouses like Amazon Redshift or Google BigQuery accommodate growing datasets required for analytics and business intelligence. As data volumes increase, additional nodes can be added to maintain performance, ensuring seamless handling of complex queries.

Real-time personalization

Wix transitioned from Redis and HBase to Aerospike, achieving 100% real-time data availability, six times faster response times, and a 45% reduction in annual costs. This migration significantly improved their platform's performance and scalability, allowing them to manage massive datasets and real-time applications more effectively.

Adobe integrated Aerospike to manage the vast and diverse datasets that underpin the personalized customer experiences provided by Adobe Experience Cloud. Aerospike scales horizontally to accommodate growing data demands, supporting millions of customer profiles and increasing transaction volumes (several million customer requests per second). The result is a real-time decision-making system with 99.999% uptime. 

Enterprise applications

ERP systems like SAP or Oracle often serve large and growing organizations. By leveraging scalable infrastructures, these applications can handle an expanding number of users or modules without compromising performance or stability.

High-performance computing

Fields like genetic modeling, weather simulations, and aerospace engineering require immense computational power. Scalable infrastructures support distributed GPU clusters or supercomputers, allowing researchers to process vast datasets and run intricate simulations efficiently.

In collaboration with Hewlett Packard Enterprise (HPE), Aerospike showcased its capabilities by running a benchmark test on an HPE Superdome Flex server. This setup achieved approximately 2.8 million transactions per second on a single server, highlighting Aerospike's ability to handle massive workloads efficiently.

Use cases for elasticity

Elasticity is key to optimizing resource usage and maintaining performance during fluctuating workloads. Below, we examine real-world scenarios where elastic systems enable organizations to adapt dynamically to changing demands without compromising efficiency or user experience.

E-commerce

Elastic platforms are essential for managing unpredictable traffic patterns in e-commerce. For example, during Black Friday sales, elasticity allows websites to scale resources dynamically, ensuring that servers can handle sudden spikes in customer activity without downtime or slow load times. 

During its annual Big Billion Days sale, Flipkart, a leading e-commerce company in India,  leveraged Aerospike’s real-time data platform to handle massive surges in traffic and transactions. Aerospike enabled Flipkart to scale resources elastically and dynamically, ensuring seamless user experiences even during peak traffic periods. This approach not only reduced latency but also optimized infrastructure costs, allowing Flipkart to maintain high performance while supporting millions of concurrent users.

Video streaming

Streaming services like Netflix, YouTube, or Hulu rely on elastic cloud infrastructures to manage fluctuating viewer demands. Whether it’s a global event or routine peak hours, elasticity ensures uninterrupted streaming experiences by dynamically adjusting server capacities.

Healthcare systems

Telemedicine platforms benefit significantly from elasticity, particularly during pandemics or flu seasons. These systems can scale to accommodate surges in patient interactions, ensuring reliable access for users without overprovisioning resources during off-peak periods.

IoT ecosystems

Smart home systems and Internet of Things (IoT) platforms handle activity spikes during events like firmware updates or new device rollouts. Elasticity ensures that these platforms can dynamically scale to support increased device communication without degrading performance.

ZoneTap, a company specializing in geofencing applications for worker safety, developed the 2ND-SKN IoT device to provide real-time alerts in hazardous environments. Initially, they used MongoDB but encountered performance bottlenecks as they scaled beyond 24 devices. By migrating to Aerospike's database, Zonetap achieved significant improvements in scalability and elasticity, decreased latency, and increased throughput. This transition enabled Zonetap to dynamically adjust resources in response to varying workloads, ensuring reliable performance and real-time data processing essential for worker safety in industrial settings.

Integrating scalability and elasticity

Scalability provides the infrastructure to grow systems over time, while elasticity ensures adaptability to immediate changes. Together, they form a robust cloud strategy that addresses diverse operational needs. For example, an e-commerce platform may leverage scalability to expand its database over time while relying on elasticity to handle daily or seasonal demand fluctuations.

Paypal, a leading financial services company utilizing Aerospike, successfully implemented both scalability and elasticity to optimize its fraud detection systems. By leveraging Aerospike’s horizontally scalable architecture, the company ensured its infrastructure could handle growing transaction volumes. Simultaneously, elastic capabilities allowed them to scale resources dynamically during periods of heightened transaction activity, such as holiday seasons, without overprovisioning during normal periods. This dual approach reduced operational costs while maintaining high-performance standards.

The Trade Desk, a global advertising technology firm that built its platform on Aerospike, leveraged scalability to handle exponential growth in data ingestion and real-time bidding workloads. By implementing Aerospike’s scalable database solution, the company processed millions of transactions per second with low latency, ensuring seamless ad placements. The scalable architecture enabled them to expand their operations across new regions without re-architecting their infrastructure, allowing for faster time-to-market and significant cost savings. This capability, further augmented by Aerospike’s support for elasticity, was instrumental in maintaining competitiveness in a high-demand, data-intensive industry.

Master scalability and elasticity with Aerospike

Building cloud systems that balance cost-efficiency and performance in equal measure requires a solid understanding of the principles of elasticity and scalability. Elasticity dynamically allocates system resources to handle fluctuations in real time, while scalability ensures that your system shifts with demand.

Aerospike’s architecture harnesses the best of both concepts, enabling seamless scaling and real-time adaptability. 

Whether managing spikes in e-commerce, optimizing fraud detection, or supporting IoT devices, Aerospike equips businesses to establish highly-performant, resilient systems, ready for dynamic workloads and heavy growth. Achieve the perfect balance of scalability and elasticity to meet today’s challenges and tomorrow’s demands.

Download Community Edition (Free version!)

Aerospike Server Community Edition (CE) is a free, open source Aerospike distribution. It is the common core of Aerospike Enterprise Edition (EE) with the same developer API and performance characteristics.