Estimate resource usage for absctl backup

absctl backup can use significant resources depending on the options that you set when running the tool. We recommend running the tool on a separate host or container from Aerospike Database.

The command-line flags used in the following formulas are defined in Run Aerospike backup.

Estimate memory usage for a backup

The memory usage of absctl backup depends mainly on the --parallel flag. By default, absctl backup distributes work across threads by assigning each thread a unique file to write to. Each file has a constant-size buffer associated with it. The file buffer defaults to 5 MiB, the same as the default cloud provider chunk size. The internal pipeline buffer that processes records holds 256 records. Multiply 256 by the average record size to get the internal pipeline buffer size in bytes.

Formula to approximate memory usage for backup

To approximate the memory required for a backup, multiply the value of the --parallel flag by the file buffer size (controlled by --std-buffer or --local-buffer-size). For example, --parallel 8 with the default 5 MiB buffer needs about 40 MiB for file buffers.

The result is the approximate amount of memory needed to back up your data.

If you are backing up to cloud storage (S3, GCP, Azure), expect additional memory use for the internal cloud client and its buffers.
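The memory formula above can be sketched as a small shell calculation. This is only an illustration: the function names are hypothetical helpers, and the --parallel value and average record size are example inputs rather than measured figures. It does not account for the additional memory used by cloud storage clients.

```shell
#!/bin/sh
# Rough memory estimate for absctl backup, following the formula above.
# All input values below are illustrative assumptions, not measurements.

# estimate_file_buffer_bytes PARALLEL BUFFER_BYTES
# File buffers: value of --parallel multiplied by the per-file buffer size.
estimate_file_buffer_bytes() {
  echo $(( $1 * $2 ))
}

# estimate_pipeline_buffer_bytes AVG_RECORD_BYTES
# Internal pipeline buffer: 256 records multiplied by the average record size.
estimate_pipeline_buffer_bytes() {
  echo $(( 256 * $1 ))
}

parallel=8                            # example value for --parallel
buffer_bytes=$(( 5 * 1024 * 1024 ))   # default 5 MiB file buffer
avg_record_bytes=2048                 # hypothetical average record size

file_buffers=$(estimate_file_buffer_bytes "$parallel" "$buffer_bytes")
pipeline=$(estimate_pipeline_buffer_bytes "$avg_record_bytes")

echo "File buffers:    $(( file_buffers / 1024 / 1024 )) MiB"
echo "Pipeline buffer: $(( pipeline / 1024 )) KiB"
```

Treat the result as a lower bound and leave headroom on the host, especially when writing to cloud storage.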

Estimate disk space for a backup

To estimate how much disk space is needed to back up your data, use the --estimate flag when running absctl backup.

The --estimate and --parallel flags are mutually exclusive: you cannot use --parallel when you use --estimate.

Terminal window
absctl backup --namespace NAME --estimate

The --estimate flag reads 10,000 records by default from the specified namespace and prints the average size of the sampled records. The number of records can be changed with --estimate-samples.

To estimate the amount of disk space needed for a backup:

  1. Multiply the estimated record size, returned by the --estimate flag, by the number of records in the namespace.

  2. Add 10% to account for backup file overhead.

The result is the approximate disk space needed for a backup.
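The two steps above can be sketched in shell arithmetic. The function name is a hypothetical helper, and both the average record size and the record count are placeholder values: in practice, the first comes from the --estimate output and the second from your own cluster statistics.

```shell
#!/bin/sh
# Disk space estimate: average record size x record count, plus 10% overhead.

# estimate_backup_bytes AVG_RECORD_BYTES RECORD_COUNT
# Integer arithmetic: multiply by 110, then divide by 100 to add 10%.
estimate_backup_bytes() {
  echo $(( $1 * $2 * 110 / 100 ))
}

avg_record_bytes=1500      # hypothetical value reported by --estimate
record_count=200000000     # hypothetical number of records in the namespace

total_bytes=$(estimate_backup_bytes "$avg_record_bytes" "$record_count")
echo "Approximate backup size: $(( total_bytes / 1024 / 1024 / 1024 )) GiB"
```

Because the average comes from a sample, the estimate is only as good as the sample is representative. Increasing --estimate-samples may help when record sizes vary widely.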

Server-side query thread usage

When absctl backup runs queries against an Aerospike cluster, it uses server-side query threads to process the requests. Understanding how these threads are allocated helps you tune both the backup tool and the database for optimal performance.

How query threads work

Each backup query initiated by --parallel runs on the server. The server allocates threads from a shared pool to process these queries.

| Server configuration | Default | Description |
| --- | --- | --- |
| query-threads-limit | 128 | Maximum total threads for all queries across the server |
| single-query-threads | 4 | Maximum threads per individual query |

With the defaults, each query uses up to 4 threads, allowing approximately 32 concurrent queries (128 ÷ 4) before queries start waiting for threads.

Relationship with --parallel

The --parallel flag controls how many concurrent queries absctl backup runs. Each query consumes server-side query threads:

  • If --parallel exceeds query-threads-limit ÷ single-query-threads, some queries will wait for threads.
  • For a cluster with multiple nodes, the thread limits apply per node.

Example: With defaults (query-threads-limit=128, single-query-threads=4) and --parallel 8:

  • Each of the 8 parallel queries uses up to 4 threads
  • Total thread usage: up to 32 threads per node (8 × 4), well within the 128-thread limit
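The thread arithmetic in this example can be checked with a short shell sketch. The function names are hypothetical helpers, and the values mirror the defaults quoted above. Substitute the settings from your own cluster.

```shell
#!/bin/sh
# Compare the threads a backup may request with the per-node thread limit.

# query_threads_per_node PARALLEL SINGLE_QUERY_THREADS
# Upper bound on threads used per node: one query per --parallel worker.
query_threads_per_node() {
  echo $(( $1 * $2 ))
}

# max_concurrent_queries QUERY_THREADS_LIMIT SINGLE_QUERY_THREADS
# Approximate number of queries that can run before queries wait for threads.
max_concurrent_queries() {
  echo $(( $1 / $2 ))
}

parallel=8                 # example value for --parallel
query_threads_limit=128    # server default
single_query_threads=4     # server default

needed=$(query_threads_per_node "$parallel" "$single_query_threads")
ceiling=$(max_concurrent_queries "$query_threads_limit" "$single_query_threads")

if [ "$parallel" -gt "$ceiling" ]; then
  echo "--parallel $parallel exceeds $ceiling concurrent queries: some queries will wait"
else
  echo "Up to $needed of $query_threads_limit threads per node in use"
fi
```

This sketch considers only the backup's own queries. Other query traffic on the cluster draws from the same thread pool.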

Tuning recommendations

For backup-heavy workloads, consider adjusting server-side settings:

| Goal | Adjustment |
| --- | --- |
| Support more concurrent queries | Increase query-threads-limit |
| Make individual queries faster | Increase single-query-threads |
| Limit backup impact on other queries | Decrease query-threads-limit or use --records-per-second to throttle |

To view current query thread settings, use:

Terminal window
asinfo -v "get-config:context=service" | tr ';' '\n' | grep query-threads
asinfo -v "get-config:context=namespace;id=NAMESPACE" | tr ';' '\n' | grep query-threads

To adjust dynamically:

Terminal window
asinfo -v "set-config:context=service;query-threads-limit=256"
asinfo -v "set-config:context=namespace;id=NAMESPACE;single-query-threads=8"

Calculate number of file descriptors

absctl backup may need to open many backup files and network sockets. If absctl backup cannot open the required number of file descriptors, it can fail with “too many open files” errors.

By default, absctl backup opens a new backup file for each of its --parallel threads. Each thread may have to open a network socket to each node in the cluster.

To approximate the maximum number of file descriptors needed for a backup:

  1. Set N to the value of --parallel.

  2. Set C to the number of nodes in the cluster.

  3. Estimate file descriptors as N (output files) + N × C (network sockets) + a small constant for overhead (logs, DNS, and so on).

This estimate is intentionally conservative. If you are backing up to cloud storage, increase the estimate to account for additional connections used for upload concurrency.
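The file descriptor estimate can be sketched in shell and compared against the current per-process limit reported by ulimit -n. The function name is a hypothetical helper, and the overhead constant of 32 is an assumed placeholder, not a documented value. The sketch does not include the extra connections used for cloud uploads.

```shell
#!/bin/sh
# File descriptor estimate: N output files + N x C sockets + overhead.

# estimate_fds PARALLEL NODES OVERHEAD
estimate_fds() {
  echo $(( $1 + $1 * $2 + $3 ))
}

parallel=16    # N: example value for --parallel
nodes=6        # C: example number of nodes in the cluster
overhead=32    # assumed allowance for logs, DNS, and similar

needed=$(estimate_fds "$parallel" "$nodes" "$overhead")
limit=$(ulimit -n)   # soft limit on open files for this shell

echo "Estimated file descriptors: $needed (current limit: $limit)"
# ulimit -n may print "unlimited", which cannot be compared numerically.
if [ "$limit" != "unlimited" ] && [ "$needed" -gt "$limit" ]; then
  echo "Raise the open-file limit before running absctl backup"
fi
```

Run the check in the same shell or service environment that launches absctl backup, since limits can differ between environments.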
