Run absctl backup

To run absctl backup and create backups from an Aerospike Database cluster, specify the following options:

  • --host: Cluster to back up.
  • --namespace: Namespace to back up. absctl backup backs up one namespace at a time.
  • --directory: Local directory for the backup files.

Basic backup example

A cluster contains a node with IP address 1.2.3.4.

To back up the test namespace on this cluster to the directory backup_2024_08_24, run the following command:

Terminal window
absctl backup --host 1.2.3.4 --namespace test --directory backup_2024_08_24

Data is stored in multiple files with the .asb file extension. By default, each backup file is limited to 250 MiB. When this limit is reached, absctl backup creates a new file. For the exact backup file format, see the Backup File Format section of the absctl GitHub repository.

To also compress the backup files as they are written, add the --compress option:

Terminal window
absctl backup --host 1.2.3.4 --namespace test --directory ./output_dir --compress zstd

In this example, absctl backup writes data from the test namespace as multiple files to the output_dir directory, compressing them using the Zstandard (zstd) algorithm.

Back up to a single file

You can back up the cluster to a single file rather than a directory:

Terminal window
absctl backup --host HOST --namespace NAME --output-file FILENAME

This command writes metadata (UDFs and indexes) followed by records to a single file. Use - in place of FILENAME to write to stdout.

Before writing to a single file, absctl backup runs an estimate and uses it to calculate an upper bound on the output size. Control the sampling with --estimate-samples (default: 10000).

Estimate backup size

You can run an estimate of the backup size without creating a backup file. Use this for capacity planning.

Terminal window
absctl backup --host HOST --namespace NAME --estimate

Option | Description
--estimate | Run estimate only (cannot run with --output-file or --directory)
--estimate-samples N | Number of records to sample (default: 10000)

This scans up to N records, measures their encoded sizes, multiplies average size by total record count, and adjusts for compression. The result is an approximation, not a guaranteed upper bound.
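
The arithmetic behind a size estimate can be sketched in shell. This is an illustration only, not the tool's implementation: the helper name estimate_backup_bytes and its inputs are hypothetical, and the 10% overhead margin is the rule of thumb given in the --estimate option description. Compression is not modeled.

```shell
# estimate_backup_bytes AVG_RECORD_BYTES RECORD_COUNT
# Hypothetical helper: prints AVG_RECORD_BYTES * RECORD_COUNT plus a 10%
# overhead margin, using integer arithmetic.
estimate_backup_bytes() (
  avg=$1
  count=$2
  base=$(( avg * count ))
  echo $(( base + base / 10 ))
)

# Example: 1,000,000 records averaging 512 bytes each.
estimate_backup_bytes 512 1000000   # prints 563200000
```

Treat the result as a planning figure; as noted above, the estimate is an approximation, not a guaranteed upper bound.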

You cannot run an estimate with filters (--filter-exp, --partition-list, --node-list, --modified-after, and so on).

Connection options

The following options are available when specifying a cluster for backup:

Option | Default | Description
-h HOST or --host HOST | 127.0.0.1 | Host that acts as the entry point to the cluster. Any node in the cluster can be specified. The remaining nodes are discovered automatically.
-p PORT or --port PORT | 3000 | Port to connect to.
-U USER or --user USER | - | User name with read permission. Mandatory if the server has security enabled.
-P PASSWORD or --password | - | Password to authenticate the given user. The first form passes the password on the command line. The second form prompts for the password.
-A or --auth | INTERNAL | Authentication mode used when user and password are defined. Valid modes are INTERNAL, EXTERNAL, and PKI. Must be set to EXTERNAL when using LDAP.
-l or --node-list ADDR1[:TLS_NAME1]:PORT1,... | - | Back up partitions owned by specific nodes. See Back up by node.
--rack-list RACKID1,... | - | Back up partitions owned by specific racks. See Back up by rack.
-w N or --parallel N | 1 | Maximum number of scans to run in parallel. For example, --parallel 4 runs 4 scans at the same time (not one at a time as in the legacy tool).
--tls-enable | disabled | Indicates a TLS connection should be used.
--services-alternate | false | Set to true to connect to the Aerospike node’s alternate-access-address.
--prefer-racks RACKID1,... | disabled | Comma-separated list of rack IDs to prefer when reading records for a backup. Use to limit cross-datacenter network traffic.
--client-timeout MS | 30000 | Client timeout in milliseconds.
--client-idle-timeout MS | 0 | Client idle timeout in milliseconds.
--client-login-timeout MS | 10000 | Client login timeout in milliseconds.

Timeout options

The following options control timeouts and retries during data backup:

Option | Default | Description
--socket-timeout MS | 10000 | Socket timeout in milliseconds. If this value is 0, it is set to total-timeout. If both are 0, there is no socket idle time limit.
--total-timeout MS | 0 | Total socket timeout in milliseconds. The default of 0 means no timeout.
--max-retries N | 5 | Maximum number of retries before aborting the current transaction.
--sleep-between-retries MS | 5 | Time to sleep between retries, in milliseconds.
--info-timeout MS | 10000 | Timeout for info commands in milliseconds.
--info-retry-interval MS | 1000 | Interval between info command retries in milliseconds.
--info-retry-multiplier N | 1 | Multiplier for info command retries.
--info-max-retries N | 3 | Maximum number of retries for info commands.

TLS options

The following TLS options are available for securing the connection. Apart from --tls-protocols, these options have no default values.

Option | Default | Description
--tls-cafile=TLS_CAFILE | - | Path to a trusted CA certificate file.
--tls-capath=TLS_CAPATH | - | Path to a directory of trusted CA certificates.
--tls-name=TLS_NAME | - | Default TLS name used to authenticate each TLS socket connection. This must match the cluster name.
--tls-protocols=TLS_PROTOCOLS | - | Sets the TLS protocol selection criteria using the same format as Apache’s SSLProtocol directive. Default is +TLSv1.2 if unspecified.
--tls-keyfile=TLS_KEYFILE | - | Path to the key for mutual authentication if the Aerospike cluster supports it.
--tls-keyfile-password=TLS_KEYFILE_PASSWORD | - | Password to load a protected TLS keyfile. Provide it as an environment variable (env:VAR), a file (file:PATH), or a literal string (PASSWORD). If you specify --tls-keyfile-password without a value, the tool prompts you on the command line.
--tls-certfile=TLS_CERTFILE | - | Path to the chain file for mutual authentication if the Aerospike cluster supports it.

TLS_NAME is only used when connecting to a TLS-enabled server.

TLS backup example

The following example creates a backup with these parameters:

  • Cluster node 1.2.3.4
  • Port 3000
  • Namespace test
  • Output directory backup_2024_08_24
  • TLS enabled

Terminal window
absctl backup --host 1.2.3.4 --port 3000 --namespace test --directory backup_2024_08_24 --tls-enable --tls-name cluster_name --tls-cafile /cluster_name.pem --tls-protocols +TLSv1.2 --tls-keyfile /cluster_name.key --tls-certfile /cluster_name.pem

Output options

The following options control the output files and directories that absctl backup creates:

Option | Default | Description
-d PATH or --directory PATH | - | Directory where you want to store .asb backup files. If the directory does not exist, it is created during the backup. This option is mandatory unless --output-file or --estimate is provided.
-o PATH or --output-file PATH | - | Single file to write the backup to. - means stdout. This option is mandatory unless --directory or --estimate is provided.
-q DESIRED-PREFIX or --output-file-prefix DESIRED-PREFIX | - | Optional prefix prepended to output filenames. Can only be used with --directory (not with --output-file).
-e or --estimate | - | Use instead of --directory or --output-file to estimate the average size of a single record in the backup file. Use this option to estimate the expected size of a backup before starting it: multiply the returned value by the number of records in the namespace and add 10% for overhead. Mutually exclusive with --remove-artifacts and --continue.
--estimate-samples N | 10000 | Number of records to sample, both for a standalone --estimate run and for the estimate that runs before a single-file backup.
-F LIMIT or --file-limit LIMIT | 250 | File size limit (in MiB) for --directory. If an .asb backup file crosses this size threshold, absctl backup switches to a new file.
-r or --remove-files | - | Clears the given --directory or removes an existing --output-file. By default, absctl backup does not write to a non-empty directory or overwrite an existing backup file. Mutually exclusive with --continue.
--remove-artifacts | - | Clears the directory or removes the output file, like --remove-files, without running a backup. Mutually exclusive with --continue and --estimate.

Compression and encryption options

Option | Default | Description
-z COMPRESSION-ALG or --compress COMPRESSION-ALG | NONE | Compression algorithm used on backup files as they are written. zstd is the only available compression algorithm.
--compression-level N | 3 | zstd compression level setting. See the zstd manual for more information.
--encrypt ENCRYPTION-ALG | NONE | Encryption algorithm used on backup files as they are written. The options available are aes128 and aes256. Must be accompanied by either --encryption-key-file, --encryption-key-env, or --encryption-key-secret.

Logging options

Option | Default | Description
--log-level LEVEL | debug | Control log verbosity. Valid values: debug, info, warn, error.
--log-json | false | Emit JSON-formatted logs.
-v or --verbose | false | Enable verbose output.

Performance and throughput options

Option | Default | Description
-N BANDWIDTH or --bandwidth BANDWIDTH | 0 | Throttles absctl backup’s write operations so as not to exceed the given bandwidth in MiB/s.
--nice | 0 | Deprecated alias for --bandwidth.
-L or --records-per-second RPS | 0 | Limit total returned records per second (RPS). If RPS is zero (the default), no records-per-second limit is applied.
--scan-page-size N | 10000 | Number of records to retrieve per scan page (used only for continuation/state mode: --state-file-dst / --continue).
--std-buffer SIZE | 4 | Buffer size for stdout in MiB.
--local-buffer-size SIZE | 5 | Buffer size for local files in MiB.
-C or --compact | false | Do not Base64 encode BLOB values. This option is deprecated.

Specify incremental backup

To create incremental backups, select records by their last update time, using a timestamp in the format YYYY-MM-DD_HH:MM:SS:

  • -a or --modified-after YYYY-MM-DD_HH:MM:SS backs up records last updated after the given time.
  • -b or --modified-before YYYY-MM-DD_HH:MM:SS backs up records last updated before the given time.

You can also back up partitions to create incremental backups. See Back up by partition.
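
One way to chain incremental backups is to record when each run starts and pass that value to --modified-after on the next run. The following is a minimal sketch, assuming a POSIX date command; the helper name backup_timestamp and the variable started_at are hypothetical. date prints local time by default, which matches the local-timezone interpretation described for these options.

```shell
# backup_timestamp: print the current local time as YYYY-MM-DD_HH:MM:SS,
# the format expected by --modified-after and --modified-before.
backup_timestamp() {
  date '+%Y-%m-%d_%H:%M:%S'
}

# Capture the start time before a run begins, so that the next
# incremental run can pass it to --modified-after.
started_at=$(backup_timestamp)
echo "next run: --modified-after $started_at"
```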

Namespace data selection options

The following options select the namespace and the data within it to back up:

Option | Default | Description
-n NAMESPACE or --namespace NAMESPACE | - | Mandatory namespace to back up.
-s SETS or --set-list SETS | All sets | Specific set or comma-separated list of sets to back up. Multi-set backup cannot be used with --filter-exp.
-B BIN1,BIN2,... or --bin-list BIN1,BIN2,... | All bins | Specific bin or comma-separated list of bins to back up.
-x or --no-bins | - | Only back up record metadata (digest, TTL, generation count, key). No data (bin contents) is backed up. This is not meant for restoration, only testing. This option is unrelated to the legacy single-bin option in the Aerospike Database configuration file for Database versions 6.4.0 and earlier.
-R or --no-records | - | Do not back up any record data (metadata or bin data). By default, absctl backup includes record data, secondary index definitions, and UDF modules.
-I or --no-indexes | - | Do not back up any secondary index definitions.
--no-udfs | - | Do not back up any UDF modules.
-M or --max-records N | 0 (all records) | An approximate limit for the number of records to process. Mutually exclusive with --partition-list and --after-digest.
-a YYYY-MM-DD_HH:MM:SS or --modified-after YYYY-MM-DD_HH:MM:SS | - | Back up data with last update time (LUT) after the specified date-time. The system’s local timezone applies.
-b YYYY-MM-DD_HH:MM:SS or --modified-before YYYY-MM-DD_HH:MM:SS | - | Back up data with last update time (LUT) prior to the specified date-time. The system’s local timezone applies.
--no-ttl-only | - | Include only records that have no TTL; that is, persistent records.

Use compression and encryption during backup

You can compress and encrypt backup file data before it is written to the backup file with --compress and --encrypt. Enable an option by passing it to absctl backup and include your chosen algorithm.

Compression

ZSTD, from the Facebook libzstd repository on GitHub, is the only compression algorithm available for absctl backup.

For example:

Terminal window
absctl backup --host HOST --namespace NAME --compress zstd --compression-level 3

The compression level, set with the optional --compression-level flag, is an integer described in the zstd manual. If you omit the flag, the level defaults to 3, which corresponds to ZSTD_CLEVEL_DEFAULT in zstd.

Encryption

There are two available encryption algorithms:

Algorithm | Description
aes128 | AES 128-bit key-digest encryption, which uses the CTR128 algorithm to encrypt data. The SHA256 hash of the encryption key generates the key used by CTR128.
aes256 | AES 256-bit key-digest encryption, which uses a 256-bit digest of the key for encryption and AES256 as the base encryption algorithm.

For encryption, you must provide a private key. The private encryption key may be in PEM format (with --encryption-key-file), a Base64 encoded key passed in through an environment variable (with --encryption-key-env), or retrieved from the Aerospike Secret Agent (with --encryption-key-secret).

For example, using an encryption key file:

Terminal window
absctl backup --host HOST --namespace NAME --encrypt aes128 --encryption-key-file KEY.PEM

Using an environment variable:

Terminal window
export PRIVATE_KEY='PRIVATE KEY'
absctl backup --host HOST --namespace NAME --encrypt aes256 --encryption-key-env PRIVATE_KEY

Replace the placeholder value 'PRIVATE KEY' with the contents of your private key file between the header and footer lines. In the following example, the key starts with b3Blb and ends with eNfNpA=:

-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAACmFlczI1Ni1jdHIAAAAGYmNyeXB0AAAAGAAAABDWTq8LwB
zXg7xnGj4VNY3GAAAAEAAAAAEAAAAzAAAAC3NzaC1lZDI1NTE5AAAAIHuu8YsX03XGjJ1L
YFbehI4Ha7g8EVybKB3dAAPt/iFq3u9eNfNpA=
-----END OPENSSH PRIVATE KEY-----
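
Extracting the text between the header and footer can be scripted. The following is a sketch only: the helper name pem_body is hypothetical, and joining the body into a single line is an assumption that this document does not confirm, so verify that absctl backup accepts the resulting value before relying on it.

```shell
# pem_body FILE: print the Base64 body of a PEM-style key file, that is,
# every line except the "-----BEGIN ...-----" and "-----END ...-----"
# markers, joined into a single line (an assumption, see above).
pem_body() {
  sed '/^-----/d' "$1" | tr -d '\n'
}

# Usage (KEY.PEM is a placeholder path):
#   export PRIVATE_KEY="$(pem_body KEY.PEM)"
```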

Back up a subset of the cluster

By default, absctl backup backs up the entire namespace across all nodes. You can limit the backup to a specific subset of the cluster by specifying nodes, racks, or partitions.

These three options are mutually exclusive. You can only use one at a time:

Option | Description
--node-list | Back up partitions owned by specific nodes.
--rack-list | Back up partitions owned by specific racks.
--partition-list | Back up specific partitions or partition ranges.

Back up by node

Use --node-list to back up data from specific nodes on a partition basis. This calculates the partitions owned by the listed nodes and backs up only those partitions.

Terminal window
absctl backup --host HOST --namespace NAME --directory BACKUP_DIR --node-list NODE1:PORT,NODE2:PORT

PORT is the Aerospike service port (default 3000). To get the correct node address, use the asinfo command service-tls-std if the database is configured to use TLS, or service-clear-std if no TLS is configured.

The --node-list flag is useful when running multiple absctl backup processes in parallel, for example one per Aerospike node.

This option is mutually exclusive with --rack-list, --partition-list, and --after-digest.
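
The one-process-per-node pattern can be sketched as a dry run that prints the commands rather than executing them. The helper name print_node_backup_cmds, the node addresses, and the per-node directory naming are all hypothetical; only the absctl backup flags come from this page.

```shell
# print_node_backup_cmds HOST NAMESPACE NODE:PORT...
# Dry run: prints one absctl backup command per node instead of running it.
print_node_backup_cmds() (
  host=$1
  ns=$2
  shift 2
  for node in "$@"; do
    # Hypothetical naming scheme: one output directory per node,
    # with ':' and '.' replaced by '_'.
    dir="backup_$(printf '%s' "$node" | tr ':.' '__')"
    echo "absctl backup --host $host --namespace $ns --directory $dir --node-list $node"
  done
)

print_node_backup_cmds 1.2.3.4 test 10.0.0.1:3000 10.0.0.2:3000
```

Writing each node's output to its own directory keeps the parallel processes from colliding on files. To run the commands concurrently, you could append & to each one and finish with wait.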

Back up by rack

Use --rack-list to back up data from specific racks. This calculates the partitions owned by nodes in the listed racks and backs up only those partitions.

Terminal window
absctl backup --host HOST --namespace NAME --directory BACKUP_DIR --rack-list 1,2

This example backs up all partitions owned by nodes in racks 1 and 2.

This option is mutually exclusive with --node-list, --partition-list, --prefer-racks, and --after-digest.

Back up by partition

Use --partition-list (short form -X) to back up specific partitions or ranges of partitions.

Terminal window
absctl backup --host HOST --namespace NAME --directory BACKUP_DIR --partition-list 0-1000

This example backs up partitions 0 through 999 (1000 partitions starting from 0).

Default: partitions 0 to 4095 (all partitions).

Partition filter format

  • LIST format: FILTER1,FILTER2,...
  • FILTER can be one of:
    • Range: BEGIN-COUNT. Back up COUNT partitions starting from BEGIN, where BEGIN is a partition ID from 0 to 4095.
    • Single partition: PARTITION_ID. Back up a single partition.
    • Digest cursor: DIGEST. Back up records after a specific digest in its partition, in digest order.

Filter type | Format | Example | Description
Range | BEGIN-COUNT | 0-1000 | Partitions 0–999
Single | ID | 2222 | Partition 2222 only
Non-contiguous | ID1,ID2,ID3 | 100,500,2000 | Partitions 100, 500, and 2000
Digest cursor | BASE64_DIGEST | VSmeSvxNRqr46NbOqiy9gy5LTIc= | Records after this digest in its partition

absctl supports selecting arbitrary non-contiguous partition IDs in a single backup. For example, -X 100,500,2000,3500 backs up only those four specific partitions without scanning the partitions in between. This is useful for targeted backups or distributing partitions across multiple backup jobs.

When using multiple partition filters, each filter is a single scan call and cannot be parallelized with the --parallel option. For more parallelism, break up the partition filters or run a backup using only one partition filter.

When backing up only a single partition range, the range is automatically divided into --parallel segments of near-equal size, each of which is backed up in parallel.
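
To distribute a range across several separate backup jobs by hand, you can split it into near-equal BEGIN-COUNT filters. The following sketch shows one plausible even split; the helper name split_range is hypothetical, and this page does not specify how absctl backup divides a range internally, so do not read it as the tool's algorithm.

```shell
# split_range BEGIN COUNT PARTS
# Prints up to PARTS filters in BEGIN-COUNT form that together cover
# COUNT partitions starting at BEGIN. Segment sizes differ by at most one.
split_range() (
  begin=$1
  count=$2
  parts=$3
  base=$(( count / parts ))
  extra=$(( count % parts ))
  i=0
  while [ "$i" -lt "$parts" ]; do
    size=$base
    # The first EXTRA segments take one additional partition each.
    if [ "$i" -lt "$extra" ]; then size=$(( base + 1 )); fi
    if [ "$size" -gt 0 ]; then echo "$begin-$size"; fi
    begin=$(( begin + size ))
    i=$(( i + 1 ))
  done
)

split_range 0 1000 4   # prints 0-250, 250-250, 500-250, 750-250 (one per line)
```

Each printed value can be passed to --partition-list in its own absctl backup job.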

Digest-based filtering

Digest filters provide cursor-like functionality for partition-level pagination. When you specify a digest, the backup starts from records after that digest (in digest order) within the digest’s partition.

This is useful for:

  • Resuming a backup from a specific record
  • Paginating through large partitions
  • Creating incremental backups based on record position

The digest is a Base64-encoded string that uniquely identifies a record. You can find digests in backup files or extract them using the Aerospike client.
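
You can work out which partition a digest belongs to from the digest itself. The following sketch assumes the standard Aerospike mapping, in which the partition ID is the low 12 bits of the first two digest bytes read as a little-endian integer. That mapping is not stated on this page, but it reproduces the partition 2389 quoted for the digest VSmeSvxNRqr46NbOqiy9gy5LTIc= in the partition filter examples. The helper name digest_partition is hypothetical, and the code assumes base64 -d (GNU coreutils syntax) and od are available.

```shell
# digest_partition BASE64_DIGEST: print the partition ID for a record digest.
digest_partition() (
  # Decode the digest and read its first two bytes as unsigned decimals.
  set -- $(printf '%s' "$1" | base64 -d | od -An -tu1 -N2)
  # Little-endian 16-bit value, masked to 12 bits (4096 partitions).
  echo $(( ( $1 + $2 * 256 ) & 4095 ))
)

digest_partition 'VSmeSvxNRqr46NbOqiy9gy5LTIc='   # prints 2389
```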

Partition filter examples

-X 361

  • Back up only partition 361.

-X 361,529,841

  • Back up partitions 361, 529, and 841 (non-contiguous selection).

-X 361-10

  • Back up 10 partitions, starting with 361 and ending with 370.

-X VSmeSvxNRqr46NbOqiy9gy5LTIc=

  • Back up all records after the digest VSmeSvxNRqr46NbOqiy9gy5LTIc= in its partition (partition 2389 in this case).
  • Records are returned in digest order, starting after the specified digest.

-X 0-1000,2222,EjRWeJq83vEjRRI0VniavN7xI0U=

  • Back up partitions 0 to 999 (1000 partitions starting from 0).
  • Then back up partition 2222.
  • Then back up all records after the digest EjRWeJq83vEjRRI0VniavN7xI0U= in its partition.

This option is mutually exclusive with --node-list, --rack-list, --after-digest, and --max-records.
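
Because the second number in a range filter is a count, not an end partition, it can help to check which partitions a filter covers before running a backup. A minimal sketch, with the hypothetical helper name partition_range:

```shell
# partition_range BEGIN-COUNT: print the first and last partition ID covered
# by a range filter, or fail if the range runs past partition 4095.
partition_range() (
  begin=${1%-*}
  count=${1#*-}
  last=$(( begin + count - 1 ))
  if [ "$count" -lt 1 ] || [ "$last" -gt 4095 ]; then
    echo "invalid range: $1" >&2
    exit 1
  fi
  echo "$begin $last"
)

partition_range 361-10    # prints "361 370"
partition_range 0-1000    # prints "0 999"
```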

After specific digest

Use --after-digest to resume a backup from a specific record. This backs up all records after the specified digest in its partition, plus all records in all succeeding partitions (through partition 4095).

Terminal window
absctl backup --host HOST --namespace NAME --directory BACKUP_DIR --after-digest EjRWeJq83vEjRRI0VniavN7xI0U=

  • DIGEST format: Base64-encoded string of the record digest. This is the same encoding used in backup files, so you can copy digests directly from backup output.

How it works

If the specified digest belongs to partition 1000, --after-digest will:

  1. Scan partition 1000 starting after the specified digest (in digest order)
  2. Scan all partitions from 1001 through 4095 completely

This is different from using a digest in --partition-list, which only scans records after the digest within that single partition.
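
The coverage described above can be written out as a small helper that, given the partition a digest falls in, lists what --after-digest scans. This only restates the two steps for illustration; the helper name after_digest_plan is hypothetical.

```shell
# after_digest_plan PARTITION_ID: describe what --after-digest scans when
# the digest belongs to PARTITION_ID.
after_digest_plan() (
  p=$1
  echo "partial: partition $p, records after the digest"
  # Every later partition, if any, is scanned in full.
  if [ "$p" -lt 4095 ]; then
    echo "full: partitions $(( p + 1 ))-4095"
  fi
)

after_digest_plan 1000
```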

Use cases

  • Manual backup resumption: If a backup was interrupted and you know the last record backed up, use its digest to resume.
  • Incremental partition-based backups: Back up the remainder of the partition space from a known position.

This option is mutually exclusive with --partition-list, --node-list, --rack-list, and --max-records.

Filter expression

You can back up only the subset of data that matches an Aerospike Expression, passed with --filter-exp. You must provide the Base64 encoding of the filter expression, which you can generate with the Aerospike client libraries in different languages.

This option is mutually exclusive with multi-set backup, which is triggered by passing --set-list with more than one set.

For example, to filter for records where the bin "name" equals "bob", first build the expression in a client and print its Base64 encoding:

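package main

import (
	"fmt"

	a "github.com/aerospike/aerospike-client-go/v8"
)

func main() {
	// Filter expression: string bin "name" equals "bob".
	exp := a.ExpEq(a.ExpStringBin("name"), a.ExpStringVal("bob"))
	// Print the Base64 encoding to pass to --filter-exp.
	fmt.Println(exp.Base64())
}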

This should print kwGTUQOkbmFtZaQDYm9i. Then, to run a backup with this filter expression, run:

Terminal window
absctl backup --filter-exp kwGTUQOkbmFtZaQDYm9i ...

Backup resumption

To make a backup resumable, run a directory backup with --file-limit and --state-file-dst. When a backup file rotates (reaches --file-limit), absctl backup updates the state file with the current scan position so the backup can be resumed later.

  • If the backup completes successfully, the state file is removed.
  • If the backup is interrupted or fails, the state file is left behind and can be used with --continue.

When resuming with --continue, all command line arguments (except --remove-files) should match those used in the original run.

Option | Default | Description
--continue STATE-FILE | disabled | Resumes an interrupted backup from the provided state file. All other command line arguments should match those used in the initial run (except --remove-files, which is mutually exclusive with --continue).
--state-file-dst NAME | - | Writes a state file (checkpoint) named NAME into the backup destination directory so the backup can be resumed with --continue.

Requirements and constraints for state/continuation mode:

  • Directory backups only (--directory).
  • Requires --file-limit to be non-zero because state is saved on file rotation.
  • Requires --scan-page-size to be non-zero. The default is 10000.
  • --state-file-dst is mutually exclusive with --continue (you either create a new state file or resume from an existing one).
  • --continue is mutually exclusive with --remove-files.
  • Not supported with --node-list or --rack-list.

Throttle data backup

If absctl backup can retrieve data from the database faster than it can write data, you may need to throttle the retrieval rate. Use the --bandwidth RATE flag to restrict the rate at which data is written. The rate is specified in MiB/s.

Write to stdout and piping

Pass - to --output-file to write backup data to stdout. This is useful for pipes.

The following example writes backup data to stdout and pipes the output to gzip to create a compressed file:

Terminal window
absctl backup --host HOST --namespace NAME --output-file - | gzip > FILENAME.GZ

The gzip utility is single-threaded. Using gzip can cause single-CPU core saturation and create a bottleneck. To take advantage of multi-core archive utilities, consider using xz instead.

You can use the --compress runtime option to compress backup data. See Use compression and encryption during backup for more information.

Configure absctl backup with configuration files

You can configure absctl backup using a configuration file. The tool uses YAML format for configuration.

The following option controls configuration file behavior:

Option | Default | Description
--config PATH | - | Read configuration from the specified YAML file.

Legacy astools.conf INI-style configuration is not supported. Use YAML with --config instead.
