Estimate resource usage for absctl backup
absctl backup can use significant resources depending on the options that you set when running the tool.
We recommend running the tool on a separate host or container from Aerospike Database.
The command-line flags used in the following formulas are defined in Run Aerospike backup.
Estimate memory usage for a backup
absctl backup’s memory usage is largely affected by the --parallel flag.
By default, absctl backup distributes work across threads by assigning each thread a unique file to write to.
Each file has a fixed-size buffer associated with it.
The file buffer’s default size is 5 MiB, the same as the default cloud provider chunk size.
The internal pipeline buffer holds 256 records. Multiply this value by the average record size to get the internal pipeline buffer size.
Formula to approximate memory usage for backup
To calculate the approximate amount of memory required for a backup, multiply the value that you set for the --parallel flag by the buffer size (controlled by --std-buffer or --local-buffer-size).
The result is the approximate amount of memory needed to back up your data.
If you are backing up to cloud storage (S3, GCP, Azure), more memory will be used to maintain the internal client and buffers.
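The formula above can be sketched as a small shell helper. This is only an illustration: the function names and the sample values (--parallel 8, a 5 MiB buffer, 2,000-byte records) are assumptions, and the result ignores the extra memory used by cloud storage clients.

```shell
#!/bin/sh
# Approximate file-buffer memory: --parallel value times per-file buffer size.
estimate_backup_memory_mib() {
  parallel=$1      # value passed to --parallel
  buffer_mib=$2    # per-file buffer size in MiB (5 is the documented default)
  echo $(( parallel * buffer_mib ))
}

# Internal pipeline buffer: 256 records times the average record size.
pipeline_buffer_bytes() {
  avg_record_bytes=$1
  echo $(( 256 * avg_record_bytes ))
}

# Sample values only; substitute your own settings.
echo "file buffers: about $(estimate_backup_memory_mib 8 5) MiB"
echo "pipeline buffer: about $(pipeline_buffer_bytes 2000) bytes"
```

The two figures are printed separately because they are described separately above. Treat both as lower bounds.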
Estimate disk space for a backup
To estimate how much disk space is needed to back up your data, use the --estimate flag when running absctl backup.
When you use the --estimate flag, you cannot use --parallel. The two flags are mutually exclusive.
absctl backup --namespace NAME --estimate
The --estimate flag reads 10,000 records by default from the specified namespace and prints the average size of the sampled records.
The number of records can be changed with --estimate-samples.
To estimate the amount of disk space needed for a backup:
- Multiply the estimated record size, returned by the --estimate flag, by the number of records in the namespace.
- Add 10% to account for backup file overhead.
The result is the approximate disk space needed for a backup.
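As a rough sketch of the two steps above, the following shell function multiplies the average record size by the record count and adds 10%. The function name and the sample inputs (1,500-byte records, 2,000,000 records) are assumptions for illustration. Take the real average from the --estimate output and the record count from your namespace statistics.

```shell
#!/bin/sh
# Approximate backup size in bytes: records x average size, plus 10% overhead.
estimate_backup_disk_bytes() {
  avg_record_bytes=$1   # average record size reported by --estimate
  record_count=$2       # number of records in the namespace
  raw=$(( avg_record_bytes * record_count ))
  # Integer arithmetic: the 10% overhead is rounded down.
  echo $(( raw + raw / 10 ))
}

# Sample values only.
echo "approx. $(estimate_backup_disk_bytes 1500 2000000) bytes"
```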
Server-side query thread usage
When absctl backup runs queries against an Aerospike cluster, it uses server-side query threads to process the requests.
Understanding how these threads are allocated helps you tune both the backup tool and the database for optimal performance.
How query threads work
Each backup query initiated by --parallel runs on the server.
The server allocates threads from a shared pool to process these queries.
| Server configuration | Default | Description |
|---|---|---|
| query-threads-limit | 128 | Maximum total threads for all queries across the server |
| single-query-threads | 4 | Maximum threads per individual query |
With the defaults, each query uses up to 4 threads, allowing approximately 32 concurrent queries (128 ÷ 4) before queries start waiting for threads.
Relationship with --parallel
The --parallel flag controls how many concurrent queries absctl backup runs.
Each query consumes server-side query threads:
- If --parallel exceeds query-threads-limit ÷ single-query-threads, some queries will wait for threads.
- For a cluster with multiple nodes, the thread limits apply per node.
Example: With defaults (query-threads-limit=128, single-query-threads=4) and --parallel 8:
- Each of the 8 parallel queries uses up to 4 threads
- Total thread usage: up to 32 threads per node (8 × 4), well within the 128-thread limit
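The arithmetic in this example can be checked with a short shell sketch. The function names are hypothetical, and the check only considers the backup's own queries. Other query traffic on the node draws on the same thread pool.

```shell
#!/bin/sh
# Number of queries that can each receive their full thread allotment.
max_unblocked_queries() {
  threads_limit=$1       # query-threads-limit
  threads_per_query=$2   # single-query-threads
  echo $(( threads_limit / threads_per_query ))
}

# Upper bound on query threads one backup uses on each node.
backup_threads_per_node() {
  parallel=$1            # value passed to --parallel
  threads_per_query=$2   # single-query-threads
  echo $(( parallel * threads_per_query ))
}

# Succeeds when the backup alone stays within the per-node limit.
# Arguments: query-threads-limit, single-query-threads, --parallel value.
parallel_fits() {
  [ "$(backup_threads_per_node "$3" "$2")" -le "$1" ]
}

echo "queries before waiting: $(max_unblocked_queries 128 4)"
echo "threads per node at --parallel 8: $(backup_threads_per_node 8 4)"
if parallel_fits 128 4 8; then echo "--parallel 8 fits within the limit"; fi
```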
Tuning recommendations
For backup-heavy workloads, consider adjusting server-side settings:
| Goal | Adjustment |
|---|---|
| Support more concurrent queries | Increase query-threads-limit |
| Make individual queries faster | Increase single-query-threads |
| Limit backup impact on other queries | Decrease query-threads-limit or use --records-per-second to throttle |
To view current query thread settings, use:
asinfo -v "get-config:context=service" | tr ';' '\n' | grep query-threads
asinfo -v "get-config:context=namespace;id=NAMESPACE" | tr ';' '\n' | grep query-threads
To adjust dynamically:
asinfo -v "set-config:context=service;query-threads-limit=256"
asinfo -v "set-config:context=namespace;id=NAMESPACE;single-query-threads=8"
Calculate number of file descriptors
absctl backup may need to open many backup files and network sockets.
If absctl backup cannot open the required number of file descriptors, it can fail with “too many open files” errors.
By default, absctl backup opens a new backup file for each of its --parallel threads.
Each thread may have to open a network socket to each node in the cluster.
To approximate the maximum number of file descriptors needed for a backup:
- Set N to the value of --parallel.
- Set C to the number of nodes in the cluster.
- Estimate file descriptors as N (output files) + N × C (network sockets) + a small constant for overhead (logs, DNS, etc.).
This estimate is intentionally conservative. If you are backing up to cloud storage, increase the estimate to account for additional connections used for upload concurrency.
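A minimal shell sketch of this estimate follows, with a comparison against the current shell's open-file limit. The function name, the default overhead of 16 descriptors, and the sample values (--parallel 8, 5 nodes) are assumptions, not values taken from absctl.

```shell
#!/bin/sh
# Estimated descriptors: N output files + N x C sockets + overhead.
estimate_backup_fds() {
  parallel=$1          # N: value passed to --parallel
  nodes=$2             # C: number of nodes in the cluster
  overhead=${3:-16}    # assumed allowance for logs, DNS, and so on
  echo $(( parallel + parallel * nodes + overhead ))
}

# Sample values only.
needed=$(estimate_backup_fds 8 5)
limit=$(ulimit -n)     # soft open-file limit of the current shell

if [ "$limit" != "unlimited" ] && [ "$needed" -gt "$limit" ]; then
  echo "need about $needed descriptors, but the limit is $limit"
else
  echo "need about $needed descriptors, limit is $limit"
fi
```

Run the check in the same shell or service context that starts absctl backup, because open-file limits can differ between users and service managers.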