Back up and restore AVS

Overview

This page describes how to use Aerospike's backup and restore tools for Aerospike Vector Search (AVS). AVS uses the Aerospike Database (ASDB) as a storage layer, so you can leverage the same tools and similar processes for backup and restore as you would in a standard ASDB deployment. Backing up indexes can save time and compute resources if you need to reuse a specific index in the future.

  • To back up and restore AVS, use Aerospike's backup and restore tools directly against the underlying ASDB cluster.

  • AVS uses metadata and storage namespaces. Both of these namespaces must be backed up and restored together to ensure a functional AVS deployment.

  • All index metadata must be backed up and restored in full, but you can back up individual indexes by specifying the namespace and set where the index and vector data are stored.

  • Shut down AVS before performing a restore operation, and do not start it again until the index metadata restore has finished, so that AVS can properly discover the restored indexes.

caution

The metadata namespace stores information for all indexes. If you back up the metadata and only one index from an AVS deployment that has multiple indexes, restoring that backup into an empty AVS deployment writes metadata for indexes whose data does not exist in the deployment.

tip

Before starting the following procedures, you should be familiar with the AVS data model and underlying Aerospike Database configuration.

Backup

AVS stores data across multiple Aerospike namespaces: at minimum, an index metadata namespace and a storage namespace. These namespaces must be backed up and restored together.

note

All index metadata must be backed up and restored in full, but you can back up individual indexes by specifying the namespace and set where the index and vector data are stored.
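A single-index backup might be sketched as follows. This is a minimal sketch, assuming the index and its vector data live in one set of the storage namespace. It uses the `-s` (set) option of asbackup to limit the backup to that set. The function name `backup_index_set`, the `DRY_RUN` switch, and the set and directory names in the usage line are hypothetical and not part of AVS or the Aerospike tools.

```shell
#!/usr/bin/env bash
# Hypothetical helper: back up one set of a namespace with asbackup.
# Assumption: the index and its vector data are stored in a single set.
backup_index_set() {
  local host="$1" namespace="$2" set_name="$3" dir="$4"
  # DRY_RUN=1 prints the command instead of running it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo asbackup -n "$namespace" -s "$set_name" -d "$dir" -h "$host"
  else
    asbackup -n "$namespace" -s "$set_name" -d "$dir" -h "$host"
  fi
}
```

For example, `backup_index_set AEROSPIKE_IP test my_index_set my_index_backup` would back up only the `my_index_set` set of the `test` namespace. The metadata namespace still has to be backed up in full with a separate asbackup run.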

Consider the case where AVS stores metadata in the default avs-meta namespace, and vector and index data in the test namespace. The following example backs up all AVS data in the avs-meta and test namespaces.

  1. Use asbackup to back up the AVS vector and index data into the vector_data directory.

    asbackup -n "test" -d vector_data -h AEROSPIKE_IP

  2. Use asbackup to back up the AVS index metadata from the Aerospike cluster to the index_meta directory.

    asbackup -n "avs-meta" -d index_meta -h AEROSPIKE_IP

Restore

caution

You must shut down AVS before starting any restore operations.

  1. Use asrestore to restore the index metadata.

    asrestore -d index_meta -h AEROSPIKE_IP

  2. Start AVS to begin discovery of the restored indexes. You can use AVS to search the restored indexes while vector and index data restoration is in progress (step 3).

    sudo systemctl start aerospike-vector-search

  3. Use asrestore to restore the vector and index data.

    asrestore -d vector_data -h AEROSPIKE_IP
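The restore order above can be captured in a small wrapper so that a failed step stops the sequence before AVS is started against incomplete metadata. This is a sketch, not an official procedure: the `run` and `restore_avs` functions and the `DRY_RUN` switch are hypothetical, and the `systemctl stop` line is an assumption that mirrors the documented start command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Restore in the documented order; stop at the first failing step.
restore_avs() {
  local host="$1"
  # Assumption: AVS runs as the aerospike-vector-search systemd unit.
  run sudo systemctl stop aerospike-vector-search || return 1
  # Index metadata must be fully restored before AVS starts.
  run asrestore -d index_meta -h "$host" || return 1
  run sudo systemctl start aerospike-vector-search || return 1
  # Vector and index data can be restored while AVS is running.
  run asrestore -d vector_data -h "$host" || return 1
}
```

Calling `DRY_RUN=1 restore_avs AEROSPIKE_IP` prints the four commands without running them, which is a cheap way to review the sequence before a real restore.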

Using separate vector and index namespaces

You can use different storage backends and configurations when you store AVS vector and index data in individual namespaces.

  • AVS must be shut down during a restore until metadata is fully restored.

  • You must back up and restore the vector and index data separately when you use separate namespaces. The order of the backup and restore procedures does not affect correctness, but the order can be used to optimize uptime and resource usage.

  • You can search AVS after the metadata restore has finished and during index data restore.

  • AVS returns results after metadata is fully restored and index and vector data are partially restored. Index data is required to search, so it should be restored first or at the same time as vector data.

  • You can restore vector data without index data, or with out-of-date index data, because the AVS index healer rebuilds the index. However, this takes time and compute resources, and degrades vector search accuracy until it is complete, so you should restore both index and vector data when possible.

  • Because index data can be recovered from vector data by the AVS index healer, it is generally more important to keep vector data backups up to date. However, both vector and index data backups should be kept up to date and restored together when possible.
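Restoring index and vector data at the same time, as mentioned in the list above, could be sketched with shell background jobs. This assumes the metadata restore has already finished and that the backups are in directories named index_data and vector_data. The `run` and `restore_data_in_parallel` functions and the `DRY_RUN` switch are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Start both data restores as background jobs and wait for each one.
# Returns non-zero if either restore fails.
restore_data_in_parallel() {
  local host="$1" index_pid vector_pid status=0
  run asrestore -d index_data -h "$host" &
  index_pid=$!
  run asrestore -d vector_data -h "$host" &
  vector_pid=$!
  wait "$index_pid" || status=1
  wait "$vector_pid" || status=1
  return "$status"
}
```

Running both restores concurrently puts more load on the Aerospike cluster than running them one after the other, so whether it shortens the total restore time depends on the deployment.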

Example backup and restore flow

The following illustrates a backup and restore flow for AVS deployments that store index and vector data in separate namespaces. In these examples, index metadata is stored in the avs-meta namespace, index data in index, and vector data in vector.

Backup

  1. Use asbackup to back up the AVS index data into the index_data directory.

    asbackup -n "index" -d index_data -h AEROSPIKE_IP

  2. Use asbackup to back up the AVS vector data into the vector_data directory.

    asbackup -n "vector" -d vector_data -h AEROSPIKE_IP

  3. Use asbackup to back up the AVS index metadata from the Aerospike cluster to the index_meta directory.

    asbackup -n "avs-meta" -d index_meta -h AEROSPIKE_IP
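Because all three namespaces need to be backed up together, the steps above can be wrapped in one loop that stops at the first failure. The following is a minimal sketch: the `run` and `backup_avs_namespaces` functions, the `DRY_RUN` switch, and the `namespace:directory` argument format are hypothetical conventions for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: backup_avs_namespaces HOST NAMESPACE:DIRECTORY [NAMESPACE:DIRECTORY ...]
backup_avs_namespaces() {
  local host="$1" pair namespace dir
  shift
  for pair in "$@"; do
    namespace="${pair%%:*}"  # text before the first colon
    dir="${pair#*:}"         # text after the first colon
    # Stop at the first failure so a partial backup is not mistaken for a full one.
    run asbackup -n "$namespace" -d "$dir" -h "$host" || return 1
  done
}
```

For the namespaces in this example, the call would be `backup_avs_namespaces AEROSPIKE_IP index:index_data vector:vector_data avs-meta:index_meta`.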

Restore

You can search AVS after metadata and index data have been restored, but results are empty until at least some vector data has been restored. Therefore, the quickest path to uptime during a restore with separate index and vector data is to restore index metadata, start AVS, restore the index data, and then restore the vector data (or restore index and vector data at the same time).

caution

You must shut down AVS before starting any restore operations.

  1. Use asrestore to restore the index metadata.

    asrestore -d index_meta -h AEROSPIKE_IP

  2. Start AVS to begin discovery of the restored indexes. You can use AVS to search the restored indexes while index data restoration is in progress (step 3).

    sudo systemctl start aerospike-vector-search

  3. Use asrestore to restore the index data.

    asrestore -d index_data -h AEROSPIKE_IP

    After index data has been restored, AVS is searchable but results will be empty. Start restoring vector data in the next step to populate results.

  4. Use asrestore to restore the vector data. AVS is searchable while vector data is being restored.

    asrestore -d vector_data -h AEROSPIKE_IP