---
title: "Horizontal pod autoscaling"
description: "Autoscale Aerospike clusters on Kubernetes using resource utilization or Prometheus metrics with KEDA."
---

# Horizontal pod autoscaling


This page explains how to autoscale an Aerospike cluster with the Kubernetes Horizontal Pod Autoscaler (HPA). You can scale on CPU or memory utilization thresholds, or on Aerospike Database metrics that are exposed through Prometheus and consumed by KEDA. For more details, see the [official HPA documentation](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale).

-   [Scaling on resource utilization metrics](#scaling-on-resource-utilization-metrics)
-   [Scaling on Prometheus metrics from Aerospike Database](#scaling-on-prometheus-metrics-from-aerospike-database)

## Scaling on resource utilization metrics

An Aerospike cluster can be scaled based on the CPU or memory usage of its pods.

### Prerequisites

-   Aerospike Kubernetes Operator (AKO) 4.0 or later installed.
-   Aerospike Cluster: The pod spec for the Aerospike container should define CPU and memory resource requests. For more details, see the [Aerospike cluster configuration](https://aerospike.com/docs/kubernetes/4.4.x/reference/config-reference#aerospike-container).

### Deploy HPA

1.  Create a new YAML file to act as your HPA object. In this example, the file is `hpa-cpu-scaler.yaml`. You can use any name for the file.
    
2.  Add the HPA configuration parameters to the file. The following example file instructs HPA to scale the Aerospike cluster if the average CPU usage of the `aerospike-server` container exceeds 60%. The `spec.metrics` section contains a `containerResource` subsection where this information is defined. The `spec.scaleTargetRef.name` field must reference your Aerospike Database cluster.
    
    Example configuration for hpa-cpu-scaler.yaml
    
    ```yaml
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: example-hpa
      namespace: aerospike
    spec:
      minReplicas: 2
      maxReplicas: 5
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 300
      metrics:
      - type: ContainerResource
        containerResource:
          name: cpu
          container: aerospike-server
          target:
            type: Utilization
            averageUtilization: 60
      scaleTargetRef:
        apiVersion: asdb.aerospike.com/v1
        kind: AerospikeCluster
        name: aerocluster
    ```
    
3.  Run `kubectl apply -f FILE_NAME` to create an HPA object in the same namespace as the workload you want to scale.
    
    Applying hpa-cpu-scaler.yaml with kubectl
    
    ```bash
    kubectl apply -f hpa-cpu-scaler.yaml
    ```
    
4.  Verify that Kubernetes created the HPA.
    
    ```bash
    kubectl get hpa -n aerospike
    ```
    
    Confirm that the HPA targets your `AerospikeCluster` and reports current metrics.
    

After the file is applied, HPA automatically scales the Aerospike Database cluster when the scaling threshold is met.
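The same pattern applies to memory. The following is a minimal sketch rather than a configuration taken from the AKO documentation: it writes a hypothetical `hpa-memory-scaler.yaml` that uses the `memory` resource instead of `cpu`. The file name, the `example-hpa-memory` name, and the 70% target are illustrative assumptions.

```shell
# Write a hypothetical memory-based HPA manifest.
# The file name, HPA name, and 70% target are illustrative assumptions.
cat > hpa-memory-scaler.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa-memory
  namespace: aerospike
spec:
  minReplicas: 2
  maxReplicas: 5
  scaleTargetRef:
    apiVersion: asdb.aerospike.com/v1
    kind: AerospikeCluster
    name: aerocluster
  metrics:
  - type: ContainerResource
    containerResource:
      name: memory
      container: aerospike-server
      target:
        type: Utilization
        averageUtilization: 70
EOF

# Apply it the same way as the CPU example:
#   kubectl apply -f hpa-memory-scaler.yaml
```

Avoid pointing two HPA objects at the same cluster. To take both CPU and memory into account, list both entries under `spec.metrics` of a single HPA. Kubernetes then evaluates each metric and uses the largest resulting replica count.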

## Scaling on Prometheus metrics from Aerospike Database

AKO also supports scaling based on the Aerospike Database metrics exposed to Prometheus. Scaling requires the Kubernetes Event-Driven Autoscaler ([KEDA](https://keda.sh/)) tool, which enables HPA to read and scale based on these metrics. KEDA connects directly to Prometheus as a metrics source and uses PromQL queries to define scaling thresholds.

### Prerequisites

-   Aerospike Kubernetes Operator (AKO) 4.0 or later installed.
-   Aerospike Cluster: Deployed and operational, with the [Aerospike Prometheus Exporter](https://aerospike.com/docs/kubernetes/4.4.x/observe/clusters#expose-metrics-for-prometheus) running as a sidecar container.
-   KEDA Installed: Follow the [KEDA installation guide](https://keda.sh/docs/2.16/deploy/).
-   Prometheus: Installed and collecting Aerospike metrics.
-   [Helm](https://helm.sh/docs/intro/install) installed.
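
Before continuing, you can check that the command-line tools used on this page are available on your `PATH`. This is a small convenience sketch. The `require_tools` helper is a hypothetical name and is not part of AKO, KEDA, or Helm.

```shell
# require_tools TOOL...
# Prints each tool that is not on PATH and returns non-zero if any are missing.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

require_tools kubectl helm || echo "Install the missing tools before continuing."
```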

### Deploy monitoring stack

A monitoring stack must be installed to enable HPA to use Aerospike metrics exposed by the Aerospike Prometheus Exporter.

For detailed instructions on setting up the monitoring stack, see: [Aerospike Kubernetes Operator Monitoring](https://aerospike.com/docs/kubernetes/4.4.x/observe/operator-monitoring).

### Install and configure KEDA

1.  Run the following Helm commands to install KEDA.
    
    ```bash
    helm repo add kedacore https://kedacore.github.io/charts
    helm repo update
    helm install keda kedacore/keda --namespace keda --create-namespace
    ```
    
2.  Create a KEDA `ScaledObject` YAML file. In this example, the file is named `scaledObject.yaml`. You can use any name for the file.
    
3.  Add the `ScaledObject` configuration parameters to the file. This example scales the Aerospike cluster based on the `aerospike_namespace_data_used_pct` metric exposed by the Prometheus server at `http://aerospike-monitoring-stack-prometheus:9090`.
    
    Example scaledObject.yaml file
    
    ```yaml
    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: aerospike-scale
      namespace: aerospike
    spec:
      advanced:
        horizontalPodAutoscalerConfig:
          name: keda-hpa-aerospike-scale
          behavior:
            scaleUp:
              stabilizationWindowSeconds: 300
      scaleTargetRef:
        apiVersion: asdb.aerospike.com/v1
        kind: AerospikeCluster
        name: aerocluster
      minReplicaCount: 2
      maxReplicaCount: 5
      triggers:
      - type: prometheus
        metricType: Value
        metadata:
          serverAddress: http://aerospike-monitoring-stack-prometheus:9090
          metricName: aerospike_namespace_data_used_pct
          query: |
            avg(aerospike_namespace_data_used_pct{ns="test1"})
          threshold: "50"
    ```
    
4.  Apply this file with `kubectl apply -f FILE_NAME` to create a KEDA ScaledObject in the same namespace as the Aerospike cluster you want to scale.
    
    Applying scaledObject.yaml
    
    ```bash
    kubectl apply -f scaledObject.yaml
    ```
    
    After the file is applied, KEDA automatically creates an HPA instance that scales the Aerospike Database cluster when the scaling threshold is met.
    
5.  Verify that KEDA created the HPA.
    
    Example showing the command to verify HPA creation
    
    ```bash
    kubectl get hpa -n aerospike
    NAME                       REFERENCE                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
    keda-hpa-aerospike-scale   AerospikeCluster/aerocluster   25/50     2         5         5          22h
    ```
    

### Key configuration parameters

These are the most common parameters used in an autoscaler deployment.

-   [`stabilizationWindowSeconds`](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#stabilization-window): Prevents rapid scaling up when metric values fluctuate. This is useful to avoid unnecessary scaling during temporary spikes, such as migrations.
-   `scaleTargetRef`: Specifies the Aerospike cluster that needs to be scaled.
-   `minReplicaCount` and `maxReplicaCount`: Define the minimum and maximum number of replicas that the HPA can scale between. Ensure that `minReplicaCount` is always greater than or equal to the replication factor of every namespace in the Aerospike cluster.
-   `triggers`: Uses a [Prometheus-based trigger](https://keda.sh/docs/2.16/scalers/prometheus) to scale the Aerospike cluster based on the `aerospike_namespace_data_used_pct` metric.
-   `query`: The PromQL query to fetch the metric value. In this example, the query fetches the average value of the `aerospike_namespace_data_used_pct` metric for the `test1` namespace.

HPA also includes other parameters. For more details, see [HPA Scaling Behavior](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#configurable-scaling-behavior).
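
To see how the threshold and the replica bounds interact, the sketch below applies the replica formula from the Kubernetes HPA documentation, `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, and then clamps the result to the configured minimum and maximum. The `desired_replicas` function and the sample numbers are hypothetical. The sketch uses integer arithmetic and ignores the HPA tolerance, the stabilization window, and pod readiness, so treat it as an approximation of the controller's behavior.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_METRIC TARGET_METRIC MIN MAX
# Integer sketch of: ceil(current_replicas * current_metric / target_metric),
# clamped to the range [MIN, MAX].
desired_replicas() {
  local current=$1 metric=$2 target=$3 min=$4 max=$5
  # Ceiling division with integer arithmetic.
  local desired=$(( (current * metric + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 3 75 50 2 5   # metric above threshold: prints 5
desired_replicas 5 25 50 2 5   # metric below threshold: prints 3
desired_replicas 2 10 50 2 5   # result clamped to the minimum: prints 2
```

With a threshold of 50, three replicas reporting a value of 75 give `ceil(3 * 75 / 50) = 5`, which is also the configured maximum in the example `ScaledObject`.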

### Example: rack-specific querying

The previous example used the `avg(aerospike_namespace_data_used_pct{ns="test1"})` query to get an average value of namespace data used across all racks. You can also scale based on a certain metric for a specific namespace or specific rack in the Aerospike cluster.

Consider a cluster with two racks and four nodes:

-   Rack 1 contains namespace `test` (Nodes: n1, n2)
-   Rack 2 contains namespaces `test` and `test1` (Nodes: n3, n4)

To query a single namespace, use the following query in the `spec.triggers.metadata.query` parameter of the `scaledObject.yaml` file:

```yaml
aerospike_namespace_data_used_pct{ns="test1"}
```

This query fetches the data usage percentage for the namespace `test1`. Since `test1` exists only in Rack 2, the result is calculated as the average of the `data_used_pct` value on nodes `n3` and `n4`: `(n3_data_used_pct + n4_data_used_pct) / 2`.

If the query result exceeds the defined threshold, the cluster scales up. Scaling continues until the desired rack, in this case Rack 2, receives new nodes, which brings the query result back below the threshold.
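
The averaging described above can be worked through with sample numbers. The following sketch is illustrative only: the `exceeds_threshold` helper and the readings for `n3` and `n4` are hypothetical, and the threshold of 50 matches the earlier `ScaledObject` example.

```shell
# exceeds_threshold THRESHOLD VALUE...
# Prints the average of the per-node values and whether it is above THRESHOLD.
exceeds_threshold() {
  local threshold=$1
  shift
  printf '%s\n' "$@" | awk -v t="$threshold" '
    { sum += $1 }
    END {
      avg = sum / NR
      status = (avg > t) ? "above-threshold" : "within-threshold"
      printf "avg=%.1f %s\n", avg, status
    }'
}

# Hypothetical data_used_pct readings for n3 and n4 in Rack 2.
exceeds_threshold 50 62 48   # prints: avg=55.0 above-threshold
exceeds_threshold 50 40 44   # prints: avg=42.0 within-threshold
```

In the first call, `(62 + 48) / 2 = 55` is above the threshold of 50, so a scale-up would be triggered. In the second call, the average of 42 stays within the threshold.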

### Commonly used metrics for scaling

The most effective metrics for autoscaling differ based on workload size and performance goals in different deployments. Monitor your traffic, latency, and throughput regularly, then set thresholds that avoid inefficient scaling.

The following table contains some recommended metrics for scaling an Aerospike Database cluster. For more details, refer to the [Aerospike metrics](https://aerospike.com/docs/database/reference/metrics).

| Metric Name | Description |
| --- | --- |
| [aerospike\_namespace\_data\_used\_pct](https://aerospike.com/docs/database/reference/metrics#namespace__data_used_pct) | Percentage of used storage capacity for this namespace. |
| [aerospike\_namespace\_indexes\_memory\_used\_pct](https://aerospike.com/docs/database/reference/metrics#namespace__indexes_memory_used_pct) | Percentage of combined RAM indexes’ size used. |
| [aerospike\_namespace\_index\_mounts\_used\_pct](https://aerospike.com/docs/database/reference/metrics#namespace__index_mounts_used_pct) | Percentage of the mount(s) in-use for the primary index used by this namespace. |
| [aerospike\_namespace\_sindex\_mounts\_used\_pct](https://aerospike.com/docs/database/reference/metrics#namespace__sindex_mounts_used_pct) | Percentage of the mount(s) in-use for the secondary indexes used by this namespace. |
| [aerospike\_node\_stats\_process\_cpu\_pct](https://aerospike.com/docs/database/reference/metrics#node_stats__process_cpu_pct) | Percentage of CPU usage by the `asd` process. |
| [aerospike\_node\_stats\_system\_kernel\_cpu\_pct](https://aerospike.com/docs/database/reference/metrics#node_stats__system_kernel_cpu_pct) | Percentage of CPU usage by processes running in kernel mode. |
| [aerospike\_node\_stats\_system\_total\_cpu\_pct](https://aerospike.com/docs/database/reference/metrics#node_stats__system_total_cpu_pct) | Percentage of CPU usage by all running processes. |
| [aerospike\_node\_stats\_system\_user\_cpu\_pct](https://aerospike.com/docs/database/reference/metrics#node_stats__system_user_cpu_pct) | Percentage of CPU usage by processes running in user mode. |