Version: Operator 3.3.0

Scaling Aerospike on Kubernetes

You can scale Aerospike on Kubernetes horizontally (adjusting the number of pods or nodes in your cluster), or vertically (adjusting the resources available to them). See Horizontal scaling and Vertical scaling for more information and instructions.

Horizontal scaling


The Custom Resource (CR) file controls the number of pods (nodes) in the cluster and how they are spread across racks. When you increase the cluster size in the CR file, the Operator adds pods to the racks following the rack order defined in the CR.

The Operator distributes the nodes equally across all racks. If any pods remain after equal distribution, they are distributed following the rack order.

Consider a cluster of two racks and five pods. After equal pod distribution, both racks would have two pods, with one left over as a remainder. The remaining pod would go to Rack 1, resulting in Rack 1 having three pods and Rack 2 having two pods. If the cluster size is scaled up to six pods, a new pod would be added to Rack 2.

Scaling down follows the rack order and removes pods with the goal of equal distribution.

In the above example of two racks and six pods, scaling down to four pods results in two racks with two pods each. The third pod (third replica) on Rack 1 goes down first, followed by the third pod on Rack 2.
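The distribution rule described above can be sketched as a small shell function. This is only an illustration of the arithmetic (equal share first, then one extra pod per rack in rack order), not the Operator's actual code; the function name rack_sizes is hypothetical.

```shell
#!/bin/sh
# rack_sizes TOTAL RACKS
# Prints the pod count for each rack, in rack order, separated by spaces.
# Illustrative sketch only: every rack gets an equal share, then leftover
# pods are handed out one per rack, following the rack order.
rack_sizes() {
    total=$1
    racks=$2
    base=$((total / racks))     # pods that every rack receives
    extra=$((total % racks))    # leftover pods after equal distribution
    out=""
    i=1
    while [ "$i" -le "$racks" ]; do
        n=$base
        if [ "$i" -le "$extra" ]; then
            n=$((n + 1))        # the first $extra racks get one more pod
        fi
        out="$out${out:+ }$n"
        i=$((i + 1))
    done
    echo "$out"
}

rack_sizes 5 2   # prints: 3 2
rack_sizes 6 2   # prints: 3 3
rack_sizes 4 2   # prints: 2 2
```

The three calls mirror the examples in the text: five pods on two racks, scaling up to six, and scaling down to four.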

Horizontal scaling CR parameters

For this example, the cluster is deployed using a CR file named aerospike-cluster.yaml.

Change the spec.size field in the CR file to scale the cluster up or down to the specified number of pods.

kind: AerospikeCluster
metadata:
  name: aerocluster
  namespace: aerospike
spec:
  size: 2
  image: aerospike/aerospike-server-enterprise:

Use kubectl to apply the change.

kubectl apply -f aerospike-cluster.yaml

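Check the pods. The sample output below is from a cluster scaled up from two to four pods, so the two newest pods show a shorter age.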


$ kubectl get pods -n aerospike
aerocluster-0-0   1/1   Running   0   3m6s
aerocluster-0-1   1/1   Running   0   3m6s
aerocluster-0-2   1/1   Running   0   30s
aerocluster-0-3   1/1   Running   0   30s
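To confirm that the number of running pods matches the size you requested, you can count the rows whose status is Running. The helper below is a minimal sketch; the name count_running is hypothetical, and it checks only the STATUS column, not readiness.

```shell
#!/bin/sh
# count_running: reads `kubectl get pods` output on stdin and prints how many
# pods report the status "Running" (third column). A header row, if present,
# is ignored because its third column is "STATUS".
count_running() {
    awk '$3 == "Running" { n++ } END { print n + 0 }'
}

# Sample input copied from the output above. Against a live cluster you could
# pipe `kubectl get pods -n aerospike --no-headers` into the function instead.
count_running <<'EOF'
aerocluster-0-0 1/1 Running 0 3m6s
aerocluster-0-1 1/1 Running 0 3m6s
aerocluster-0-2 1/1 Running 0 30s
aerocluster-0-3 1/1 Running 0 30s
EOF
# prints: 4
```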

Batch scale-down

You can scale down multiple pods in the same rack with a single scaling command by configuring scaleDownBatchSize in the CR file. The value is either a percentage or an absolute number of pods in a rack that the Operator scales down simultaneously.


Batch scale-down is not supported for Strong Consistency (SC) clusters.
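As a rough illustration, the helper below prints a CR fragment that sets the batch size. It assumes the field sits under spec.rackConfig, which you should confirm against the CR reference for your Operator version; the function name and the value 2 are hypothetical.

```shell
#!/bin/sh
# scale_down_batch_patch SIZE
# Prints a YAML fragment that sets the batch scale-down size.
# ASSUMPTION: the field path is spec.rackConfig.scaleDownBatchSize; verify it
# against the CR reference for your Operator version before using it.
scale_down_batch_patch() {
    cat <<EOF
spec:
  rackConfig:
    scaleDownBatchSize: $1
EOF
}

# Example: remove up to two pods per rack at a time.
scale_down_batch_patch 2

# One possible way to use the fragment (not run here):
#   scale_down_batch_patch 2 > batch.yaml
#   kubectl -n aerospike patch aerospikecluster aerocluster \
#       --type merge --patch-file batch.yaml
```

Alternatively, merge the same lines by hand into aerospike-cluster.yaml under spec and apply the file with kubectl apply, as in the earlier steps.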

Horizontal autoscaling

Kubernetes Cluster Autoscaler is a Kubernetes component that automatically scales up the Kubernetes cluster when resources are insufficient for the workload and scales it down when nodes are underused for an extended period of time. See the Cluster Autoscaler documentation on GitHub for more details.

Karpenter is an autoscaling tool for Kubernetes deployments on AWS, with certain features designed to fit into an AWS workflow. See the Karpenter documentation for more details.

If Aerospike pods use only in-memory storage and dynamically provisioned network-attached storage, both autoscalers scale up and down by adjusting resources and shifting load automatically to prevent data loss.

Horizontal autoscaling with local volumes

The primary challenge with autoscalers and local storage provisioners is ensuring the availability of local disks during Kubernetes node startup after the autoscaler scales up the node. When using local volumes, the ability to successfully autoscale depends on the storage provisioner: the default static local storage provisioner built into Kubernetes, or OpenEBS.


Do not use multiple storage provisioners, such as OpenEBS and gce-pd, simultaneously. If your setup requires an additional provisioner alongside OpenEBS, configure OpenEBS with exclusion filters so that it does not consume the disks intended for the other provisioner.

The Kubernetes Cluster Autoscaler cannot add a new node when the underlying storage uses the Kubernetes static local storage provisioner. In this case, you must scale up manually.

When scaling up using Karpenter, OpenEBS can bring a newly provisioned node into service automatically only if your cluster has a way to set up (bootstrap) local storage as soon as a new node becomes active in the cluster. See Google's documentation for automatic bootstrapping on GKE for more information and a setup guide.

Neither autoscaler can scale down the nodes if any pod is running with local storage attached.

Vertical scaling

Vertical scaling refers to adjusting the compute resources, such as CPU and memory, allocated to existing pods in a Kubernetes cluster. This is useful when applications have variable workloads that need more or less computing power at different times, for example when peak and off-peak traffic call for different amounts of memory.


Vertical scaling uses the Aerospike rack awareness feature. See Rack Awareness Architecture for more details.

The AerospikeContainerSpec parameter in the CR file governs the amount of compute resources (CPU or memory) available to each pod. Modifying this parameter causes a rolling restart of any pods with updated resources.
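The helper below sketches what such a resource change might look like as a CR fragment. It assumes the AerospikeContainerSpec is exposed at spec.podSpec.aerospikeContainer, which you should check against the CR reference for your Operator version. The function name and the CPU and memory values are hypothetical, and setting requests equal to limits is only one possible choice.

```shell
#!/bin/sh
# resources_patch CPU MEMORY
# Prints a YAML fragment with CPU and memory requests and limits for the
# Aerospike container.
# ASSUMPTION: the field path is spec.podSpec.aerospikeContainer.resources;
# verify it against the CR reference for your Operator version.
resources_patch() {
    cat <<EOF
spec:
  podSpec:
    aerospikeContainer:
      resources:
        requests:
          cpu: "$1"
          memory: $2
        limits:
          cpu: "$1"
          memory: $2
EOF
}

# Example values only: 2 CPUs and 8Gi of memory per pod.
resources_patch 2 8Gi
```

Merging a fragment like this into aerospike-cluster.yaml and applying it with kubectl apply would trigger the rolling restart described above.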

Kubernetes also provides an autoscaler called Vertical Pod Autoscaler (VPA), which automatically sets resource requests based on usage. VPA can both downscale pods that are over-requesting resources and upscale pods that are under-requesting resources based on their usage over time.