Aerospike Backup Service performance tuning
This page describes how to tune the Aerospike Backup Service (ABS) process for maximum throughput, compression ratio, and resource efficiency.
All performance tuning decisions must balance the two stages of the backup process:
- Readers read records from Aerospike using primary-index queries.
- Writers serialize, compress, encrypt, and move the data to storage.
What performance tuning means for ABS
Performance tuning for ABS means choosing configuration parameter values that finish backups faster without overloading the Aerospike cluster, the ABS host, or the storage path. The best setting depends on your goal:
- Faster backups usually require more read or write parallelism.
- Lower cluster impact usually requires read throttling or lower read parallelism.
- Smaller backup files usually require more compression CPU.
- Lower memory usage usually requires fewer writers or smaller upload buffers.
Tune one bottleneck at a time.
Pick a goal, check the pipeline metric to find whether readers or writers are limiting throughput, adjust the parameters for that stage, and validate the result with representative data.
Stop increasing concurrency when throughput gains become marginal or when CPU, memory, cluster load, or storage throttling become limiting factors.
Backup process architecture
Each backup runs a pool of reader workers and a pool of writer workers, connected by buffered Go channels.
ABS publishes a metrics.pipeline value that reports how full the channel buffers between readers and writers are.
If readers fill the channels faster than writers drain them, the channels fill up and pipeline approaches capacity.
If writers drain faster than readers fill, the channels stay empty and pipeline stays near 0.
Readers (controlled by parallel):
- Open a primary-index query against an Aerospike node for a partition range.
- Receive records using Aerospike binary protocol over TCP.
- Push each record into a channel buffer with capacity 256 per reader.
The constraint for readers is usually the query capacity of the Aerospike Database nodes.
Common slowdown causes include node I/O bottlenecks, CPU contention from production read/write traffic, and network saturation between ABS and the cluster.
A cluster over capacity results in FAIL_FORBIDDEN errors in the ABS logs.
Writers (controlled by parallel-write):
- Pull a record from the channel buffer.
- Encode it into the .asb binary format.
- Compress with ZSTD if enabled.
- Encrypt if enabled.
- Write the encoded bytes into an in-memory buffer until it reaches the size limit set by min-part-size.
- Upload the buffer as one part of a multipart upload to object storage.
- When cumulative bytes reach file-limit, complete the current multipart upload and start a new file.
A writer stalls when any step in this chain is a bottleneck. The most common bottlenecks are CPU-bound ZSTD compression at high levels, network-bound storage uploads to S3, GCS, or Azure, and memory pressure from large upload buffers triggering garbage collection pauses.
The constraint for writers is usually CPU or storage. More writers running ZSTD on the same machine means more goroutines competing for the same cores. Each writer also allocates memory for its channel buffer, upload buffer, and encoder state. Total memory grows linearly with the number of writers and can exceed the container’s memory limit. Storage backends can throttle when they receive too many concurrent upload requests.
Always check the pipeline metric to identify the bottleneck stage before adding threads to either side.
Checking the pipeline metric
Before changing any configuration, check the pipeline metric during a running backup:
```bash
curl http://ABS_HOST:8080/v1/backups/currentBackup/ROUTINE_NAME | \
  jq '{full: .full.metrics.pipeline, incremental: .incremental.metrics.pipeline}'
```

The response has separate full and incremental blocks, one per running job.
The metrics.pipeline field inside each block reports the total number of records sitting in the channel buffers between readers and writers for that job.
Pipeline capacity equals 256 × parallel + 256 × parallel-write.
When parallel-write is not set, it defaults to parallel, so the capacity simplifies to 512 × parallel.
| Configuration | Capacity |
|---|---|
| parallel=8 (default, both read and write) | 256×8 + 256×8 = 4096 |
| parallel=4, parallel-write=8 | 256×4 + 256×8 = 3072 |
| parallel=4, parallel-write=4 | 256×4 + 256×4 = 2048 |
A healthy pipeline can briefly reach 0 or capacity during normal fluctuations.
In a busy system, watch for sustained readings near either extreme.
- Pipeline stays near 0: Writers are idle, waiting for data. Readers are the bottleneck.
- Pipeline fluctuates across the range: Neither stage is a clear bottleneck. The configuration is near optimal.
- Pipeline stays near capacity: Readers are blocked because the buffers are full. Writers are the bottleneck.
Tuning workflow
1. Start with defaults. Set parallel: 8. Leave parallel-write unset so it inherits the parallel value. Enable ZSTD at compression.level: 3 to use the Default preset. Keep file-limit: 250.
2. Monitor the pipeline metric. Check GET /v1/backups/currentBackup/ROUTINE_NAME during a backup run.
3. Match the pipeline reading to a stage in Backup process architecture:
   - Stays near 0: Readers are constrained by query latency, network limits, or a low parallel setting. Increase parallel and check cluster health.
   - Fluctuates across the range: Readers and writers are balanced. Configuration is close to optimal.
   - Stays near capacity: Writers are constrained by compression CPU, upload latency, or a low parallel-write setting. Increase parallel-write, reduce the compression level, or increase storage bandwidth.
4. Check system resources.
   - CPU > 90%: Decrease parallel-write or lower the compression level.
   - RAM high: Decrease parallel-write or min-part-size.
5. Iterate and validate with short tests that use representative data. Steady-state throughput is reached within seconds of backup start, so test runs of 2–5 minutes are often enough to observe it.
6. Scale horizontally. If a single ABS instance is fully saturated, use partition-list slicing to split the workload across multiple instances.
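The diagnosis in the workflow can be sketched as a small decision helper. This is illustrative only: the 10% and 90% bands are assumptions for what "sustained near 0" and "near capacity" mean, not thresholds defined by ABS.

```python
def diagnose(pipeline: int, capacity: int, cpu_pct: float) -> str:
    """Map a sustained pipeline reading to a tuning suggestion.

    The 10% / 90% bands are illustrative thresholds, not part of ABS.
    """
    if pipeline <= 0.1 * capacity:
        return "readers limited: increase parallel, check cluster health"
    if pipeline >= 0.9 * capacity:
        if cpu_pct > 90:
            return "writers CPU-bound: lower compression level or parallel-write"
        return "writers limited: increase parallel-write or storage bandwidth"
    return "balanced: configuration is close to optimal"

# Near-full pipeline with saturated CPU points at the compression stage.
print(diagnose(pipeline=4000, capacity=4096, cpu_pct=95))
```

In practice, base the diagnosis on sustained readings over a test run, not a single sample, since a healthy pipeline briefly touches both extremes.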
Tuning parameters
This section describes in more detail the architecture and parameters that control reader and writer bottlenecks.
Reader bottlenecks
Use these parameters when pipeline stays near 0, which means writers are waiting for records.
parallel (read parallelism)
Controls the number of concurrent reader threads issuing primary-index queries against the Aerospike cluster.
When to increase: If pipeline stays near 0, increase parallel incrementally until readers stop being the bottleneck. Throughput gains taper off once readers saturate the cluster’s query capacity. Stop increasing when gains are marginal.
When to decrease: You see FAIL_FORBIDDEN errors in ABS logs, which means the cluster’s query thread capacity is exhausted.
Default: 8
records-per-second (RPS throttle)
Limits the number of records read from Aerospike per second across all readers.
Use this to limit ABS’s impact on a production cluster without changing parallel.
When to set: Production database latency or resource usage rises during backups, but you still want to keep enough read parallelism to scan partitions efficiently.
When to increase: Backups are taking too long, the Aerospike cluster has headroom, and pipeline stays near 0 because writers are waiting for records.
When to decrease: Backups are competing with production traffic, or cluster CPU, disk, or network usage rises beyond the level you want to allow for backup work.
Default: no limit. Omitting the parameter or setting it to 0 disables the RPS throttle.
records-per-second limits returned records, not query concurrency.
If the cluster is rejecting primary-index queries or running out of query threads, reduce parallel or lower max-parallel-scans instead.
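Because the throttle caps the read rate, it sets a lower bound on how long the read stage takes. The back-of-the-envelope arithmetic looks like this (a sketch; the record counts and rates are hypothetical):

```python
def min_backup_seconds(total_records: int, reader_rate: int, rps_cap: int = 0) -> float:
    """Lower bound on read-stage time. rps_cap of 0 disables the throttle."""
    effective = reader_rate if rps_cap == 0 else min(reader_rate, rps_cap)
    return total_records / effective

# 100M records, readers capable of 500k rec/s, throttled to 100k rec/s:
print(min_backup_seconds(100_000_000, 500_000, 100_000))  # 1000.0
```

If this bound is longer than your backup window, the throttle, not parallelism, is what needs adjusting.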
Writer bottlenecks
Use these parameters when pipeline stays near capacity, which means readers are waiting for writers.
parallel-write (write parallelism)
Controls the number of threads for serialization, compression, encryption, and uploading.
If not set, it defaults to the value of parallel.
When to increase: Pipeline is near capacity and you have CPU and memory headroom.
When to decrease: CPU is saturated above 90% or memory usage is high.
Default: the value of parallel.
Memory impact: Each writer allocates:
- A 256-record channel buffer
- A storage upload buffer sized to min-part-size, which defaults to 5 MiB
- ZSTD encoder state if compression is enabled, sized between about 1 MiB and 8 MiB depending on preset
Reader and writer counts do not need to match.
Set parallel-write based on writer bottlenecks and available CPU, memory, and storage bandwidth.
ABS handles distribution between readers and writers.
compression (ZSTD)
ABS applies ZSTD compression per writer thread.
The compression.level parameter accepts values from -1 through 22.
ABS uses a Go implementation of ZSTD that provides four discrete presets rather than 22 distinct compression levels.
The library maps any numeric level to one of four presets.
Values within the same bucket produce identical compression output, speed, and CPU usage.
For meaningful configuration changes, use values at preset boundaries: 1, 3, 6, and 10.
| Level | Preset | Approx. zstd equivalent | Behavior |
|---|---|---|---|
| -1, 0–2 | Fastest | zstd 1 | Least CPU, largest files |
| 3–5 | Default | zstd 3 | Balanced speed and compression |
| 6–9 | Better | zstd 7–8 | More CPU, smaller files |
| 10–22 | Best | zstd 11 | Most CPU, smallest files |
If compression.level is omitted, the API default is 0.
For balanced compression, set compression.level to 3.
Practical tips:
- Start with compression.level=3 to use the Default preset. You can also experiment with compression level 0 to get a baseline for speed and size when testing.
- Compression helps most on high-latency or low-bandwidth storage paths when records are highly compressible.
- If writer CPU saturates or pipeline trends toward capacity after enabling a more aggressive preset, reduce compression.level.
- Move from level 3 to 6 only after validating with your own baseline runs.
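The level-to-preset bucketing from the table above can be written as a lookup, which makes it easy to see when two candidate levels would produce identical results (a sketch derived from the table; preset names follow the table's terminology):

```python
def zstd_preset(level: int) -> str:
    """Map an ABS compression.level (-1..22) to its effective preset."""
    if not -1 <= level <= 22:
        raise ValueError("compression.level must be between -1 and 22")
    if level <= 2:
        return "Fastest"  # -1, 0-2: least CPU, largest files
    if level <= 5:
        return "Default"  # 3-5: balanced speed and compression
    if level <= 9:
        return "Better"   # 6-9: more CPU, smaller files
    return "Best"         # 10-22: most CPU, smallest files

# Levels 3 and 4 land in the same bucket, so switching between
# them changes nothing:
print(zstd_preset(3), zstd_preset(4))
```

This is why A/B tests should compare levels from different buckets, such as 3 versus 6, rather than neighbors like 3 versus 4.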
file-limit (max file size)
Controls the maximum size of individual .asb backup files, in MiB.
The default is 250 MiB.
When a file reaches this limit, ABS starts writing a new file.
Set to 0 for no limit, which produces one file per writer.
min-part-size (multipart upload chunk size)
Controls the minimum size of each part in a multipart upload to object storage. ABS buffers this amount of data in memory per writer before uploading a part.
The default is 5 MiB (5,242,880 bytes).
Minimum values vary by storage backend:
| Backend | Minimum min-part-size |
|---|---|
| S3 | 5 MiB |
| GCS | 256 KiB |
| Azure Blob | 1 MiB |
Memory impact: This parameter is the primary driver of upload buffer memory.
Peak memory from chunk buffers alone is approximately parallel-write × min-part-size.
Larger min-part-size values reduce the number of chunk upload requests, which can improve throughput on high-latency connections.
However, each writer holds this buffer in memory, so large values with high parallel-write risk OOM conditions.
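The rule of thumb above, peak chunk-buffer memory of roughly parallel-write × min-part-size, is easy to compute before changing either value (a sketch; the example writer count and part size are hypothetical):

```python
MIB = 1024 * 1024

def chunk_buffer_bytes(parallel_write: int, min_part_size: int = 5 * MIB) -> int:
    """Approximate peak memory held in upload buffers alone.

    Excludes channel buffers and ZSTD encoder state, which add more
    per-writer overhead on top of this figure.
    """
    return parallel_write * min_part_size

# 16 writers with 32 MiB parts hold about 512 MiB in upload buffers:
print(chunk_buffer_bytes(16, 32 * MIB) // MIB)  # 512
```

Compare the result against the container's memory limit before raising either parameter.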
bandwidth (write speed cap)
Throttles the total encoded backup write speed in MiB/s across all writers. Use this to prevent ABS from saturating a shared network connection.
When to increase: pipeline stays near capacity because the configured write cap is lower than the storage path can support, and the ABS host, network, and storage backend all have headroom.
When to decrease: Backup uploads are competing with other traffic, or the storage backend starts throttling concurrent writes.
Default: no limit. Omitting the parameter or setting it to 0 disables the bandwidth cap.
The limit is shared by the writer pipeline.
Increasing parallel-write can improve serialization, compression, encryption, and upload concurrency, but it cannot raise throughput above the configured bandwidth value.
With bandwidth limiting, the writer stage runs at a fixed rate.
If readers can produce faster than that cap, the pipeline trends toward capacity; if not, readers remain the bottleneck and pipeline can stay low.
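This interaction between the cap and the reader rate can be summarized in a few lines (illustrative sketch; the MiB/s figures are hypothetical, and real reader throughput in MiB/s depends on record size and compression):

```python
def writer_rate_and_trend(reader_mibs: float, bandwidth_cap: float = 0) -> tuple[float, str]:
    """Effective writer-stage rate under a bandwidth cap (0 = no limit)."""
    if bandwidth_cap == 0 or reader_mibs <= bandwidth_cap:
        return reader_mibs, "pipeline stays low (readers are the bottleneck)"
    return bandwidth_cap, "pipeline trends toward capacity (cap is the bottleneck)"

# Readers produce 300 MiB/s but writers are capped at 100 MiB/s:
print(writer_rate_and_trend(300, 100))
```

A pipeline pinned at capacity under a bandwidth cap is expected behavior, not a sign that parallel-write needs raising.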