
Backup and Restore Resource Usage

caution

asrestore and asbackup can use significant resources. For best results, run them on a separate host or container from Aerospike database.

asrestore and asbackup use resources depending on the options they run with. See the explanations and formulas below for how they use memory, storage, and file descriptors. The command line options used in the following formulas are defined in the asrestore usage and asbackup usage pages.

note

The information outlined here applies to asbackup and asrestore 3.11.0 and later.

These formulas are intended to describe how resource usage scales, not give exact answers. Plan to allocate slightly more resources to these tools than the formulas suggest.

Memory usage

Memory usage for asrestore and asbackup scales most significantly with the following supplied arguments, but usage is also affected by the system environment and version of the tool.

asbackup

asbackup's memory usage is largely affected by --parallel. By default, asbackup distributes work across threads by assigning each one a unique file to write to. These files each have a constant size buffer associated with them.

Each buffer in asbackup is 4 KiB.

Formula to calculate approximate memory usage for asbackup
--parallel x 4KiB
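As a worked sketch of this formula (the thread count here is a hypothetical example; substitute your own `--parallel` value):

```shell
# Hypothetical run: 16 backup threads, one 4 KiB file buffer per thread.
PARALLEL=16
BUF_KIB=4
MEM_KIB=$((PARALLEL * BUF_KIB))
echo "approximate asbackup buffer memory: ${MEM_KIB} KiB"
# prints "approximate asbackup buffer memory: 64 KiB"
```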

If you are backing up to Amazon S3, additional memory is used to maintain the internal S3 client and its buffers.

asrestore

asrestore's memory usage scales most closely with --parallel, --batch-size, --max-async-batches, and the average record size in the backup being restored.

Formula to calculate approximate memory usage for asrestore
(--parallel x --batch-size x avg_record_size)
+ (--batch-size x --max-async-batches)
+ --max-async-batches

This formula still holds even if --disable-batch-writes is set. In this case, all that changes with respect to the formula is the default --batch-size value.
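Plugging hypothetical example values into the formula above (the numbers are illustrative only; use your own option values and the average record size from your backup):

```shell
# Hypothetical values; substitute your own.
PARALLEL=4
BATCH_SIZE=128
MAX_ASYNC_BATCHES=32
AVG_RECORD_SIZE=1024   # bytes, e.g. taken from asbackup --estimate output

MEM_BYTES=$(( (PARALLEL * BATCH_SIZE * AVG_RECORD_SIZE) \
            + (BATCH_SIZE * MAX_ASYNC_BATCHES) \
            + MAX_ASYNC_BATCHES ))
echo "approximate asrestore memory: ${MEM_BYTES} bytes"
# prints "approximate asrestore memory: 528416 bytes"
```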

Estimating disk space for a backup

For an estimate, use the --estimate option of asbackup. As shown in the following example, this option reads 10,000 records from the specified namespace and prints the average size of the sampled records:

asbackup --namespace NAME --estimate

Multiply the displayed estimated record size by the number of records in the namespace, and add 10% of the result for overhead and indexes:

Formula to calculate approximate disk space for backup
Estimated average record size from asbackup --estimate
x Number of records in namespace
+ 10% of the result
= approximate disk space needed for backup
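For example, with a hypothetical 512-byte average record size and 10 million records (both values are illustrative; use the numbers from your own `--estimate` run and namespace):

```shell
# Hypothetical values: 512-byte average record size (from asbackup --estimate)
# and 10 million records in the namespace.
AVG_RECORD_SIZE=512
RECORD_COUNT=10000000
RAW=$((AVG_RECORD_SIZE * RECORD_COUNT))
TOTAL=$((RAW + RAW / 10))   # add 10% for overhead and indexes
echo "approximate backup size: ${TOTAL} bytes"
# prints "approximate backup size: 5632000000 bytes"
```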

File descriptors

Both asbackup and asrestore may need to open many backup files and network sockets. Allow the asbackup and asrestore processes to open enough file descriptors to prevent "too many open files" errors. The following formulas estimate the required number of file descriptors.
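Before a large run, you can check the per-process descriptor limit in the shell that will launch the tool; a minimal sketch (the value 8192 is only an example):

```shell
# Check the current soft limit on open file descriptors for this shell.
ulimit -n

# If the formulas below exceed it, raise the soft limit for this session
# (up to the hard limit, shown by `ulimit -Hn`), for example:
# ulimit -n 8192
```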

asbackup

By default, asbackup opens a new backup file for each of its --parallel threads. Each thread may also have to open a network socket to each node in the cluster. This gives the following worst-case formula for open file descriptors.

Formula to calculate approximate file descriptors for backup
(--parallel x nodes_in_cluster) + backup_file_count

Where backup_file_count is equal to --parallel when using --directory.
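For instance, with hypothetical values of 8 backup threads against a 5-node cluster in `--directory` mode (so one backup file per thread):

```shell
# Hypothetical values: 8 parallel threads, 5-node cluster, --directory mode.
PARALLEL=8
NODES=5
BACKUP_FILES=$PARALLEL          # one file per thread with --directory
FDS=$(( (PARALLEL * NODES) + BACKUP_FILES ))
echo "worst-case file descriptors for asbackup: ${FDS}"
# prints "worst-case file descriptors for asbackup: 48"
```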

asrestore

By default, asrestore opens a backup file for each of its --parallel threads. If asrestore is running against Database 6.0 or later, it uploads records in batches. In this case, asrestore opens up to --max-async-batches sockets to the server for record upload.

If the server version is older than 6.0 or the --disable-batch-writes flag is used, each record is uploaded individually, so up to --batch-size x --max-async-batches sockets may be opened.

Formula to calculate approximate file descriptors for asrestore with batch writes
--parallel + --max-async-batches
Formula to calculate approximate file descriptors for asrestore without batch writes
--parallel + (--max-async-batches x --batch-size)
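The two cases can be compared with hypothetical example values (substitute your own option values):

```shell
# Hypothetical values; substitute your own.
PARALLEL=4
MAX_ASYNC_BATCHES=32
BATCH_SIZE=128

# With batch writes (Database 6.0 or later):
echo "with batch writes:    $((PARALLEL + MAX_ASYNC_BATCHES))"
# Without batch writes (older server, or --disable-batch-writes):
echo "without batch writes: $((PARALLEL + (MAX_ASYNC_BATCHES * BATCH_SIZE)))"
```

Note how disabling batch writes multiplies the socket count by `--batch-size`, which is why the batch-writes path is preferred on servers that support it.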