Run absctl backup

To run absctl backup and create backups from an Aerospike Database cluster, specify the following options:

--host: Cluster to back up.
--namespace: Namespace to back up. absctl backup backs up one namespace at a time.
--directory: Local directory for the backup files.

Basic backup example

A cluster contains a node with IP address 1.2.3.4.

To back up the test namespace on this cluster to the directory backup_2024_08_24, run the following command:

absctl backup --host 1.2.3.4 --namespace test --directory backup_2024_08_24

Data is stored in multiple files with the .asb file extension. By default, each backup file is limited to 250 MiB. When this limit is reached, absctl backup creates a new file. For the exact backup file format, see the Backup File Format section of the absctl GitHub repository.

absctl backup --host 1.2.3.4 --namespace test --directory ./output_dir --compress zstd

In this example, absctl backup writes data from the test namespace as multiple files to the output_dir directory, compressing them using the Zstandard (zstd) algorithm.

Back up to a single file

You can back up the cluster to a single file rather than a directory:

absctl backup --host HOST --namespace NAME --output-file FILENAME

This command writes metadata (UDFs and indexes) followed by records to a single file. Use - in place of FILENAME to write to stdout.

Before writing to a single file, absctl backup runs an estimate and uses it to calculate an upper bound on the output size. Control the sampling with --estimate-samples (default: 10000).

Estimate backup size

You can run an estimate of the backup size without creating a backup file. Use this for capacity planning.

absctl backup --host HOST --namespace NAME --estimate

Option	Description
`--estimate`	Run estimate only (cannot run with `--output-file` or `--directory`)
`--estimate-samples N`	Number of records to sample (default: 10000)

This scans up to N records, measures their encoded sizes, multiplies average size by total record count, and adjusts for compression. The result is an approximation, not a guaranteed upper bound.
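As a rough illustration of the arithmetic described above, the following Python sketch reproduces the calculation by hand. The function name, the sample sizes, and the compression ratio are hypothetical; the estimator inside absctl backup may differ in detail.

```python
import math


def estimate_backup_size(sampled_sizes, total_records, compression_ratio=1.0):
    """Approximate backup size in bytes from sampled encoded record sizes."""
    if not sampled_sizes:
        return 0
    average = sum(sampled_sizes) / len(sampled_sizes)
    # Average record size x total record count, adjusted for compression.
    return math.ceil(average * total_records * compression_ratio)


# Hypothetical inputs: four sampled records, one million records in the
# namespace, and an assumed compression ratio of 0.5.
print(estimate_backup_size([100, 120, 80, 100], 1_000_000, 0.5))
```

Like the built-in estimate, the result is only as good as the sample: a namespace with a few very large records can exceed it.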

You cannot run an estimate with filters (--filter-exp, --partition-list, --node-list, --modified-after, and so on).

Connection options

The following options are available when specifying a cluster for backup:

Option	Default	Description
`-h HOST` or `--host HOST`	127.0.0.1	Host that acts as the entry point to the cluster. Any node in the cluster can be specified. The remaining nodes are discovered automatically.
`-p PORT` or `--port PORT`	3000	Port to connect to.
`-U USER` or `--user USER`	-	User name with read permission. Mandatory if the server has security enabled.
`-P PASSWORD` or `--password`	-	Password to authenticate the given user. The first form passes the password on the command line. The second form prompts for the password.
`-A` or `--auth`	INTERNAL	Authentication mode used when user and password are defined. Valid modes are INTERNAL, EXTERNAL, and PKI. Set this to EXTERNAL when using LDAP.
`-l` or `--node-list ADDR1[:TLS_NAME1]:PORT1,...`	-	Back up partitions owned by specific nodes. See Back up by node.
`--rack-list RACKID1,...`	-	Back up partitions owned by specific racks. See Back up by rack.
`-w N` or `--parallel N`	1	Maximum number of scans to run in parallel. If you set `--parallel 4`, the tool runs 4 scans at the same time (not one-by-one as in the legacy tool).
`--tls-enable`	disabled	Indicates a TLS connection should be used.
`--services-alternate`	false	Set to `true` to connect to Aerospike node’s `alternate-access-address`.
`--prefer-racks RACKID1,...`	disabled	Comma separated list of rack IDs to prefer when reading records for a backup. Use to limit cross-datacenter network traffic.
`--client-timeout MS`	30000	Client timeout in milliseconds.
`--client-idle-timeout MS`	0	Client idle timeout in milliseconds.
`--client-login-timeout MS`	10000	Client login timeout in milliseconds.

Timeout options

The following options control timeouts and retry behavior during data backup:

Option	Default	Description
`--socket-timeout MS`	10000	Socket timeout in milliseconds. If this value is 0, it is set to total-timeout. If both are 0, there is no socket idle time limit.
`--total-timeout MS`	0	Total socket timeout in milliseconds. Default is 0, that is, no timeout.
`--max-retries N`	5	Maximum number of retries before aborting the current transaction.
`--sleep-between-retries MS`	5	Amount of time to sleep between retries.
`--info-timeout MS`	10000	Timeout for info commands in milliseconds.
`--info-retry-interval MS`	1000	Interval between info command retries in milliseconds.
`--info-retry-multiplier N`	1	Multiplier for info command retries.
`--info-max-retries N`	3	Maximum number of retries for info commands.
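The interaction between --socket-timeout and --total-timeout described in the table can be sketched as follows. This is an illustration of the documented rule only; the function name is hypothetical and the code is not taken from the tool.

```python
def effective_socket_timeout(socket_timeout_ms, total_timeout_ms):
    """Return the socket timeout in milliseconds, or None when there is no limit."""
    if socket_timeout_ms == 0:
        # A socket timeout of 0 falls back to the total timeout.
        socket_timeout_ms = total_timeout_ms
    # If both values are 0, there is no socket idle time limit.
    return socket_timeout_ms or None


print(effective_socket_timeout(10000, 0))  # documented defaults
```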

TLS options

The following options configure TLS connections to the cluster. Unless noted in the description, these options have no default values.

Option	Default	Description
`--tls-cafile=TLS_CAFILE`	-	Path to a trusted CA certificate file.
`--tls-capath=TLS_CAPATH`	-	Path to a directory of trusted CA certificates.
`--tls-name=TLS_NAME`	-	Default TLS name used to authenticate each TLS socket connection. This must match the cluster name.
`--tls-protocols=TLS_PROTOCOLS`	-	Sets the TLS protocol selection criteria using the same format as Apache’s SSL Protocol. Default is `+TLSv1.2` if unspecified.
`--tls-keyfile=TLS_KEYFILE`	-	Path to the key for mutual authentication if the Aerospike cluster supports it.
`--tls-keyfile-password=TLS_KEYFILE_PASSWORD`	-	Password to load a protected TLS keyfile. You can provide one of the following: 1) Environment variable: `env:VAR` 2) File: `file:PATH` 3) String: `PASSWORD` If you specify `--tls-keyfile-password` without a value, the tool prompts you on the command line.
`--tls-certfile=TLS_CERTFILE`	-	Path to the chain file for mutual authentication if the Aerospike cluster supports it.

TLS_NAME is only used when connecting to a TLS-enabled server.

TLS backup example

The following example creates a backup with the following parameters:

Cluster node 1.2.3.4
Port 3000
Namespace test
Output directory backup_2024_08_24
TLS enabled

absctl backup --host 1.2.3.4 --port 3000 --namespace test --directory backup_2024_08_24 --tls-enable --tls-name cluster_name --tls-cafile /cluster_name.pem --tls-protocols +TLSv1.2 --tls-keyfile /cluster_name.key --tls-certfile /cluster_name.pem

Output options

The following options control the output files and directories that absctl backup creates:

Option	Default	Description
`-d PATH` or `--directory PATH`	-	Directory where you want to store `.asb` backup files. If the directory does not exist, it is created during the backup. This option is mandatory unless `--output-file` or `--estimate` is provided.
`-o PATH` or `--output-file PATH`	-	Single file to write the backup to. `-` means `stdout`. This option is mandatory unless `--directory` or `--estimate` is provided.
`-q DESIRED-PREFIX` or `--output-file-prefix DESIRED-PREFIX`	-	Optional prefix prepended to output filenames. Can only be used with `--directory` (not with `--output-file`).
`-e` or `--estimate`	-	Specified instead of `--directory` or `--output-file`, estimates the average size of a single record in the backup file. Use this option to estimate the expected size of a backup before actually starting it. Multiply the returned value by the number of records in the namespace and add 10% for overhead. This option is mutually exclusive with `--remove-artifacts` and `--continue`.
`--estimate-samples N`	10000	Sets the number of record samples to take in a backup estimate and the number of estimate samples taken for the estimate run before backup-to-file.
`-F LIMIT` or `--file-limit LIMIT`	250	File size limit (in MiB) for `--directory`. If an `.asb` backup file crosses this size threshold, `absctl backup` switches to a new file.
`-r` or `--remove-files`	-	Clears the given `--directory` or removes an existing `--output-file`. By default, `absctl backup` does not write to a non-empty directory or overwrite an existing backup file. Mutually exclusive with `--continue`.
`--remove-artifacts`	-	Clears directory or removes output file, like `--remove-files`, without running a backup. Mutually exclusive with `--continue` and `--estimate`.

Compression and encryption options

Option	Default	Description
`-z COMPRESSION-ALG` or `--compress COMPRESSION-ALG`	NONE	Compression algorithm used on backup files as they are written. `zstd` is the only available compression algorithm.
`--compression-level N`	3	zstd compression level setting. See the zstd manual for more information.
`--encrypt ENCRYPTION-ALG`	NONE	Encryption algorithm used on backup files as they are written. The options available are `aes128` and `aes256`. Must be accompanied by either `--encryption-key-file`, `--encryption-key-env`, or `--encryption-key-secret`.

Logging options

Option	Default	Description
`--log-level LEVEL`	`debug`	Control log verbosity. Valid values: `debug`, `info`, `warn`, `error`.
`--log-json`	`false`	Emit JSON-formatted logs.
`-v` or `--verbose`	`false`	Enable verbose output.

Performance and throughput options

Option	Default	Description
`-N BANDWIDTH` or `--bandwidth BANDWIDTH`	0	Throttles `absctl backup`’s write operations so as not to exceed the given bandwidth in MiB/s.
`--nice`	0	Deprecated alias for `--bandwidth`.
`-L` or `--records-per-second RPS`	0	Limit total returned records per second (RPS). If `RPS` is zero (the default), no records-per-second limit is applied.
`--scan-page-size N`	10000	Number of records to retrieve per scan page (used only for continuation/state mode: `--state-file-dst` / `--continue`).
`--std-buffer SIZE`	4	Buffer size for stdout in MiB.
`--local-buffer-size SIZE`	5	Buffer size for local files in MiB.
`-C` or `--compact`	false	Do not Base64 encode BLOB values. This option is deprecated.

Specify incremental backup

To create incremental backups, pass a timestamp in the format YYYY-MM-DD_HH:MM:SS to one of the following options:

-a or --modified-after YYYY-MM-DD_HH:MM:SS backs up records last updated after the timestamp.
-b or --modified-before YYYY-MM-DD_HH:MM:SS backs up records last updated before the timestamp.
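The timestamp can be generated rather than typed by hand. The following Python sketch formats a cutoff in the YYYY-MM-DD_HH:MM:SS form; the helper name and the 24-hour window are illustrative assumptions. It uses local time because the option descriptions state that the system's local timezone applies.

```python
from datetime import datetime, timedelta

# strftime pattern for the YYYY-MM-DD_HH:MM:SS form used by these options.
TIMESTAMP_FORMAT = "%Y-%m-%d_%H:%M:%S"


def modified_after_arg(now, hours_back):
    """Format the time `hours_back` hours before `now` for --modified-after."""
    return (now - timedelta(hours=hours_back)).strftime(TIMESTAMP_FORMAT)


# Print a command that backs up records changed in the last 24 hours.
cutoff = modified_after_arg(datetime.now(), 24)
print("absctl backup --host HOST --namespace NAME "
      f"--directory BACKUP_DIR --modified-after {cutoff}")
```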

You can also back up partitions to create incremental backups. See Back up by partition.

Namespace data selection options

The following options are available to specify the target namespace:

Option	Default	Description
`-n NAMESPACE` or `--namespace NAMESPACE`	-	Mandatory namespace to back up.
`-s SETS` or `--set-list SETS`	All sets	Specific set or comma-separated list of sets to back up. Multi-set backup cannot be used with `--filter-exp`.
`-B BIN1,BIN2,...` or `--bin-list BIN1,BIN2,...`	All bins	Specific bin or comma-separated list of bins to back up.
`-x` or `--no-bins`	-	Only back up record metadata (digest, TTL, generation count, key). No data (bin contents) is backed up. This is not meant for restoration, only testing. This option is unrelated to the legacy single-bin option in the Aerospike Database configuration file for Database versions 6.4.0 and earlier.
`-R` or `--no-records`	-	Do not back up any record data (metadata or bin data). By default, `absctl backup` includes record data, secondary index definitions, and UDF modules.
`-I` or `--no-indexes`	-	Do not back up any secondary index definitions.
`--no-udfs`	-	Do not back up any UDF modules.
`-M` or `--max-records N`	0 = all records	An approximate limit for the number of records to process. Mutually exclusive with `--partition-list` and `--after-digest`.
`-a YYYY-MM-DD_HH:MM:SS` or `--modified-after YYYY-MM-DD_HH:MM:SS`	-	Back up data with last update time (LUT) after the specified date-time. The system’s local timezone applies.
`-b YYYY-MM-DD_HH:MM:SS` or `--modified-before YYYY-MM-DD_HH:MM:SS`	-	Back up data with last update time (LUT) prior to the specified date-time. The system’s local timezone applies.
`--no-ttl-only`	-	Include only records that have no TTL; that is, persistent records.

Use compression and encryption during backup

You can compress and encrypt backup file data before it is written to the backup file with --compress and --encrypt. Enable an option by passing it to absctl backup and include your chosen algorithm.

Compression

ZSTD, from the Facebook libzstd repository on GitHub, is the only compression algorithm available for absctl backup.

For example:

absctl backup --host HOST --namespace NAME --compress zstd --compression-level 3

The compression level, set with the optional --compression-level flag, is an integer described in the zstd manual. If you omit the flag, the level defaults to 3, which corresponds to ZSTD_CLEVEL_DEFAULT in zstd.

Encryption

There are two available encryption algorithms:

Algorithm	Description
aes128	AES 128-bit key-digest encryption, which uses the CTR128 algorithm to encrypt data. The SHA256 hash of the encryption key generates the key used by CTR128.
aes256	AES 256-bit key-digest encryption, which uses a 256-bit digest of the key for encryption and AES256 as the base encryption algorithm.

For encryption, you must provide a private key. The private encryption key may be in PEM format (with --encryption-key-file), a Base64 encoded key passed in through an environment variable (with --encryption-key-env), or retrieved from the Aerospike Secret Agent (with --encryption-key-secret).

For example, using an encryption key file:

absctl backup --host HOST --namespace NAME --encrypt aes128 --encryption-key-file KEY.PEM

Using an environment variable:

export PRIVATE_KEY='PRIVATE KEY'
absctl backup --host HOST --namespace NAME --encrypt aes256 --encryption-key-env PRIVATE_KEY

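Replace the placeholder value PRIVATE KEY in the export command with the contents of your private key file between the header and footer lines. For example, for a key file whose body starts with b3Blb and ends with eNfNpA=, the value is that Base64 string without the header and footer.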
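As a hedged sketch of that extraction step, the Python helper below strips the header and footer lines from PEM text and joins the remaining Base64 lines. The helper name is hypothetical, the sample content is a placeholder rather than a real key, and joining the body into a single line is an assumption about what --encryption-key-env accepts.

```python
def pem_body(pem_text):
    """Return the Base64 body of a PEM block without header and footer lines."""
    lines = [line.strip() for line in pem_text.splitlines()]
    # Header and footer lines start with five dashes; drop them and blank lines.
    return "".join(line for line in lines if line and not line.startswith("-----"))


# Placeholder PEM content, not a real key.
example = """-----BEGIN PRIVATE KEY-----
QUJD
REVG
-----END PRIVATE KEY-----
"""
print(pem_body(example))
```

The printed value is what you would assign to the environment variable named in --encryption-key-env.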

Back up a subset of the cluster

By default, absctl backup backs up the entire namespace across all nodes. You can limit the backup to a specific subset of the cluster by specifying nodes, racks, or partitions.

These three options are mutually exclusive. You can only use one at a time:

Option	Description
`--node-list`	Back up partitions owned by specific nodes.
`--rack-list`	Back up partitions owned by specific racks.
`--partition-list`	Back up specific partitions or partition ranges.

Back up by node

Use --node-list to back up data from specific nodes on a partition basis. This calculates the partitions owned by the listed nodes and backs up only those partitions.

absctl backup --host HOST --namespace NAME --directory BACKUP_DIR --node-list NODE1:PORT,NODE2:PORT

PORT is the Aerospike service port (default 3000). To get the correct node address, use the asinfo command service-tls-std if the database is configured to use TLS, or service-clear-std if no TLS is configured.

The --node-list flag is useful when running multiple absctl backup processes in parallel, for example one per Aerospike node.

This option is mutually exclusive with --rack-list, --partition-list, and --after-digest.
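To illustrate the one-process-per-node pattern, the following Python sketch builds one command line per node using only options documented on this page. The function name and the per-node directory naming scheme are assumptions made for the example; separate directories are used because absctl backup does not write to a non-empty directory by default.

```python
import shlex


def per_node_commands(host, namespace, base_dir, nodes):
    """Build one absctl backup command per node, each with its own directory."""
    commands = []
    for index, node in enumerate(nodes):
        args = [
            "absctl", "backup",
            "--host", host,
            "--namespace", namespace,
            "--directory", f"{base_dir}/node_{index}",  # hypothetical layout
            "--node-list", node,
        ]
        # shlex.join quotes arguments so the line is safe to paste into a shell.
        commands.append(shlex.join(args))
    return commands


for command in per_node_commands("1.2.3.4", "test", "backup",
                                 ["1.2.3.4:3000", "1.2.3.5:3000"]):
    print(command)
```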

Back up by rack

Use --rack-list to back up data from specific racks. This calculates the partitions owned by nodes in the listed racks and backs up only those partitions.

absctl backup --host HOST --namespace NAME --directory BACKUP_DIR --rack-list 1,2

This example backs up all partitions owned by nodes in racks 1 and 2.

This option is mutually exclusive with --node-list, --partition-list, --prefer-racks, and --after-digest.

Back up by partition

Use --partition-list to back up specific partitions or ranges of partitions.

absctl backup --host HOST --namespace NAME --directory BACKUP_DIR --partition-list 0-1000

This example backs up partitions 0 through 999 (1000 partitions starting from 0).

By default, all partitions (0 through 4095) are backed up.

Partition filter format

LIST format: FILTER1,FILTER2,...
FILTER can be one of:
- Range: BEGIN-COUNT. Back up COUNT partitions starting from BEGIN, where BEGIN is a partition ID from 0 to 4095.
- Single partition: PARTITION_ID. Back up a single partition.
- Digest cursor: DIGEST. Back up records after a specific digest in its partition, in digest order.

Filter type	Format	Example	Description
Range	`BEGIN-COUNT`	`0-1000`	Partitions 0–999
Single	`ID`	`2222`	Partition 2222 only
Non-contiguous	`ID1,ID2,ID3`	`100,500,2000`	Partitions 100, 500, and 2000
Digest cursor	`BASE64_DIGEST`	`VSmeSvxNRqr46NbOqiy9gy5LTIc=`	Records after this digest in its partition
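The filter grammar above can be checked before launching a backup. The following Python sketch classifies each filter in a list; the function names and the validation rules are illustrative assumptions and are not the parser that absctl uses.

```python
import re

PARTITIONS = 4096  # partition IDs run from 0 to 4095


def parse_filter(text):
    """Classify one filter as a range, a single partition, or a digest."""
    match = re.fullmatch(r"(\d+)-(\d+)", text)
    if match:
        begin, count = int(match.group(1)), int(match.group(2))
        if count < 1 or begin + count > PARTITIONS:
            raise ValueError(f"range out of bounds: {text}")
        # BEGIN-COUNT covers partitions begin through begin + count - 1.
        return ("range", begin, count)
    if text.isdigit():
        if int(text) >= PARTITIONS:
            raise ValueError(f"partition out of bounds: {text}")
        return ("single", int(text))
    # Anything else is treated as a Base64 digest cursor.
    return ("digest", text)


def parse_partition_list(value):
    """Split a comma-separated filter list and classify each entry."""
    return [parse_filter(part) for part in value.split(",")]


print(parse_partition_list("0-1000,2222,EjRWeJq83vEjRRI0VniavN7xI0U="))
```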

absctl supports selecting arbitrary non-contiguous partition IDs in a single backup. For example, -X 100,500,2000,3500 (-X is the short form of --partition-list) backs up only those four specific partitions without scanning the partitions in between. This is useful for targeted backups or distributing partitions across multiple backup jobs.

When using multiple partition filters, each filter is a single scan call and cannot be parallelized with the --parallel option. For more parallelism, break up the partition filters or run a backup using only one partition filter.

When backing up only a single partition range, the range is automatically divided into --parallel segments of near-equal size, each of which is backed up in parallel.
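One way to picture the near-equal division is the following Python sketch, which splits a BEGIN-COUNT range into segments. The splitting rule shown here (earlier segments absorb the remainder) is an assumption for illustration; the exact boundaries that absctl backup chooses may differ.

```python
def split_range(begin, count, parallel):
    """Split `count` partitions starting at `begin` into (begin, count) segments."""
    if count < 1 or parallel < 1:
        raise ValueError("count and parallel must be positive")
    segments = min(parallel, count)
    base, extra = divmod(count, segments)
    result = []
    start = begin
    for index in range(segments):
        # The first `extra` segments take one additional partition.
        size = base + (1 if index < extra else 0)
        result.append((start, size))
        start += size
    return result


# --partition-list 0-1000 with --parallel 4
print(split_range(0, 1000, 4))
```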

Digest-based filtering

Digest filters provide cursor-like functionality for partition-level pagination. When you specify a digest, the backup starts from records after that digest (in digest order) within the digest’s partition.

This is useful for:

Resuming a backup from a specific record
Paginating through large partitions
Creating incremental backups based on record position

The digest is a Base64-encoded string that uniquely identifies a record. You can find digests in backup files or extract them using the Aerospike client.
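To see which partition a digest belongs to, you can decode it yourself. The following Python sketch assumes that the partition ID is the low 12 bits of the first two digest bytes read as a little-endian integer. That assumption is not stated on this page, but it reproduces the partition number given for the example digest in the partition filter examples. The function name is hypothetical.

```python
import base64


def partition_id(digest_b64, n_partitions=4096):
    """Return the partition ID for a Base64-encoded 20-byte record digest."""
    digest = base64.b64decode(digest_b64, validate=True)
    if len(digest) != 20:
        raise ValueError("expected a 20-byte digest")
    # Assumption: first two bytes, little-endian, reduced to the partition count.
    return int.from_bytes(digest[:2], "little") % n_partitions


print(partition_id("VSmeSvxNRqr46NbOqiy9gy5LTIc="))
```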

Partition filter examples

-X 361

Back up only partition 361.

-X 361,529,841

Back up partitions 361, 529, and 841 (non-contiguous selection).

-X 361-10

Back up 10 partitions, starting with 361 and ending with 370.

-X VSmeSvxNRqr46NbOqiy9gy5LTIc=

Back up all records after the digest VSmeSvxNRqr46NbOqiy9gy5LTIc= in its partition (partition 2389 in this case).
Records are returned in digest order, starting after the specified digest.

-X 0-1000,2222,EjRWeJq83vEjRRI0VniavN7xI0U=

Back up partitions 0 to 999 (1000 partitions starting from 0).
Then back up partition 2222.
Then back up all records after the digest EjRWeJq83vEjRRI0VniavN7xI0U= in its partition.

This option is mutually exclusive with --node-list, --rack-list, --after-digest, and --max-records.

After specific digest

Use --after-digest to resume a backup from a specific record. This backs up all records after the specified digest in its partition, plus all records in all succeeding partitions (through partition 4095).

absctl backup --host HOST --namespace NAME --directory BACKUP_DIR --after-digest EjRWeJq83vEjRRI0VniavN7xI0U=

DIGEST format: Base64-encoded string of the record digest. This is the same encoding used in backup files, so you can copy digests directly from backup output.

How it works

If the specified digest belongs to partition 1000, --after-digest will:

Scan partition 1000 starting after the specified digest (in digest order)
Scan all partitions from 1001 through 4095 completely

This is different from using a digest in --partition-list, which only scans records after the digest within that single partition.
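The difference in scope can be made concrete with a small Python sketch. It takes the partition that the digest belongs to and returns the partially scanned partition plus the fully scanned ones for each option. The function names are hypothetical and the code only restates the behavior described above.

```python
LAST_PARTITION = 4095


def after_digest_scope(digest_partition):
    """Scope of --after-digest: one partial partition, then all later ones in full."""
    return digest_partition, range(digest_partition + 1, LAST_PARTITION + 1)


def partition_list_digest_scope(digest_partition):
    """Scope of a digest filter in --partition-list: the partial partition only."""
    return digest_partition, range(0)


partial, full = after_digest_scope(1000)
print(f"partial: {partial}, full: {full.start} through {full.stop - 1}")
```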

Use cases

Manual backup resumption: If a backup was interrupted and you know the last record backed up, use its digest to resume.
Incremental partition-based backups: Back up the remainder of the partition space from a known position.

This option is mutually exclusive with --partition-list, --node-list, --rack-list, and --max-records.

Filter expression

You can back up only the subset of data that matches a provided Aerospike Expression. You must provide the Base64 encoding of the filter expression, which you can generate with the Aerospike client libraries in different languages.

This option is mutually exclusive with multi-set backup, which is triggered by passing --set-list with more than one set specified.

To build an expression that filters for bin "name" = "bob", first build the expression in a client and print out its Base64 encoding. The following examples use the Go, Java, and C clients, in that order:

package main

import (
    "fmt"

    a "github.com/aerospike/aerospike-client-go/v8"
)

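func main() {
    // Build the filter expression: bin "name" equals "bob".
    exp := a.ExpEq(a.ExpStringBin("name"), a.ExpStringVal("bob"))
    // Base64 returns the encoded expression together with an error value.
    b64, err := exp.Base64()
    if err != nil {
        panic(err)
    }
    fmt.Println(b64)
}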

import com.aerospike.client.exp.Exp;
import com.aerospike.client.exp.Expression;

public class FilterExpression {
    public static void main(String[] args) {
        Expression exp = Exp.build(Exp.eq(Exp.stringBin("name"), Exp.val("bob")));
        System.out.println(exp.getBase64());
    }
}

#include <aerospike/as_exp.h>
#include <stdio.h>

int main() {
    as_exp_build_b64(b64_exp, as_exp_cmp_eq(as_exp_bin_str("name"), as_exp_str("bob")));
    printf("%s\n", b64_exp);
    return 0;
}

This should print kwGTUQOkbmFtZaQDYm9i. Then, to run a backup with this filter expression, run:

absctl backup --filter-exp kwGTUQOkbmFtZaQDYm9i ...

Backup resumption

To make a backup resumable, run a directory backup with --file-limit and --state-file-dst. When a backup file rotates (reaches --file-limit), absctl backup updates the state file with the current scan position so the backup can be resumed later.

If the backup completes successfully, the state file is removed.
If the backup is interrupted or fails, the state file is left behind and can be used with --continue.

When resuming with --continue, all command line arguments (except --remove-files) should match those used in the original run.

Option	Default	Description
`--continue STATE-FILE`	disabled	Enables the resumption of an interrupted backup from provided state file. All other command line arguments should match those used in the initial run (except `--remove-files`, which is mutually exclusive with `--continue`).
`--state-file-dst NAME`	-	Writes a state file (checkpoint) named `NAME` into the backup destination directory so the backup can be resumed with `--continue`.

Requirements and constraints for state/continuation mode:

Directory backups only (--directory).
Requires --file-limit to be non-zero because state is saved on file rotation.
Requires --scan-page-size to be non-zero. The default is 10000.
--state-file-dst is mutually exclusive with --continue (you either create a new state file or resume from an existing one).
--continue is mutually exclusive with --remove-files.
Not supported with --node-list or --rack-list.
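The constraints above can be checked in a wrapper script before a backup starts. The following Python sketch is one way to do that; the function name and the dictionary keys are hypothetical, and the defaults for the file limit and scan page size are the ones listed on this page.

```python
def resume_violations(opts):
    """Return the state/continuation constraints that `opts` violates."""
    problems = []
    if not opts.get("directory"):
        problems.append("state mode requires --directory")
    if opts.get("file_limit", 250) == 0:
        problems.append("--file-limit must be non-zero")
    if opts.get("scan_page_size", 10000) == 0:
        problems.append("--scan-page-size must be non-zero")
    if opts.get("state_file_dst") and opts.get("continue"):
        problems.append("--state-file-dst and --continue are mutually exclusive")
    if opts.get("continue") and opts.get("remove_files"):
        problems.append("--continue and --remove-files are mutually exclusive")
    if opts.get("node_list") or opts.get("rack_list"):
        problems.append("--node-list and --rack-list are not supported")
    return problems


print(resume_violations({"directory": "backup", "state_file_dst": "state"}))
```

An empty list means the option set satisfies every constraint listed above.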

Throttle data backup

If absctl backup can retrieve data from the database faster than it can write data, you may need to throttle the retrieval rate. Use the --bandwidth RATE flag to restrict the rate at which data is written. The rate is specified in MiB/s.
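When choosing a rate, it helps to know how long a throttled backup takes at minimum. The following Python sketch does that arithmetic; the function name and the example figures are hypothetical, and the result is a lower bound that ignores everything except the write rate.

```python
def throttled_duration_seconds(backup_size_mib, bandwidth_mib_per_s):
    """Minimum time in seconds to write `backup_size_mib` at the given rate."""
    if bandwidth_mib_per_s <= 0:
        raise ValueError("bandwidth must be a positive rate in MiB/s")
    return backup_size_mib / bandwidth_mib_per_s


# Hypothetical example: a 100 GiB backup (102400 MiB) with --bandwidth 50.
seconds = throttled_duration_seconds(102400, 50)
print(f"{seconds:.0f} seconds, about {seconds / 60:.1f} minutes")
```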

Write to stdout and piping

Pass - to --output-file to write backup data to stdout. This is useful for pipes.

The following example writes backup data to stdout and pipes the output to gzip to create a compressed file:

absctl backup --host HOST --namespace NAME --output-file - | gzip > FILENAME.GZ

The gzip utility is single-threaded. Using gzip can cause single-CPU core saturation and create a bottleneck. To take advantage of multi-core archive utilities, consider using xz instead.

You can use the --compress runtime option to compress backup data. See Use compression and encryption during backup for more information.

Configure `absctl backup` with configuration files

You can configure absctl backup using a configuration file. The tool uses YAML format for configuration.

The following options control configuration file behavior:

Option	Default	Description
`--config PATH`	-	Read configuration from the specified YAML file.

Legacy astools.conf INI-style configuration is not supported. Use YAML with --config instead.