
Manage AVS indexes

Overview

This page describes how to create an Aerospike Vector Search (AVS) index over a set of vectors so that you can search across them. AVS answers queries by traversing the HNSW neighborhoods stored in the index. Indexes are independent of your records, and you create them with the Admin client.

Required configuration

When creating an index, use the following required fields to provide details about your Aerospike Database (ASDB) cluster and specifics about your vector embeddings:

  • namespace - Namespace in which to create the index. It must already exist in your Aerospike cluster. See our guide to configuring Aerospike Database for more details about setting up a namespace for your index.
  • name - Name of the index. Used primarily for performing searches.
  • dimensions - Number of dimensions (length) of the vector. See our guide to generating vector embeddings for more details about determining the number of dimensions in your vector.
  • vector_distance_metric - Distance calculation used by your index. Options include:
    • SQUARED_EUCLIDEAN
    • COSINE
    • DOT_PRODUCT
    • MANHATTAN
    • HAMMING
  • vector_field - Field name of the vector in the record. See Adding records to your index for details.
caution

vector_field is limited to 15 characters. Using a name longer than 15 characters causes errors.
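
To make the vector_distance_metric options listed above more concrete, here is a minimal pure-Python sketch of the textbook definition behind each name. The function names are illustrative and are not part of the AVS client. How AVS scores and orders results internally (for example, sign conventions, or whether COSINE is reported as a similarity or a distance) is not covered here and may differ.

```python
import math

# Textbook definitions only; these helpers are illustrative and not AVS APIs.

def squared_euclidean(a, b):
    # Sum of squared per-dimension differences (no square root taken).
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dot_product(a, b):
    # Sum of per-dimension products.
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Dot product divided by the product of the two vector lengths.
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

def manhattan(a, b):
    # Sum of absolute per-dimension differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming(a, b):
    # Number of positions at which the two vectors differ.
    return sum(1 for x, y in zip(a, b) if x != y)

a = [1.0, 0.0, 2.0]
b = [0.0, 1.0, 2.0]

for metric in (squared_euclidean, dot_product, cosine_similarity, manhattan, hamming):
    print(metric.__name__, metric(a, b))
```

Which metric fits best generally depends on how the embedding model was trained, so the model's own documentation is the right place to confirm the choice.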

Optional configuration

The following fields are available but not required when creating an index:

  • index_meta_data - Use this to store information about the index (e.g., the model used to create the vector embedding). Do not use this for searching data.
  • sets - The Aerospike set where your data is stored. Defaults to the null set.
  • hnsw_params - Parameters for the Hierarchical Navigable Small World (HNSW) algorithm, used for approximate nearest neighbor search.
    • m - Number of bi-directional links created per level during construction. Larger ‘m’ values lead to higher recall but slower construction. Defaults to 16.
    • ef - Size of the dynamic list for the nearest neighbors (candidates) during the search phase. Larger ‘ef’ values lead to higher recall but slower search. Defaults to 100.
    • ef_construction - Size of the dynamic list for the nearest neighbors (candidates) during the index construction. Larger ‘ef_construction’ values lead to higher recall but slower construction. Defaults to 100.
    • batching_params - Parameters related to configuring batch processing, such as the maximum number of records per batch and batching interval.
      • max_records - Maximum number of records to fit in a batch. Defaults to 10000.
      • interval - Maximum amount of time in milliseconds to wait before finalizing a batch. Defaults to 10000.
      • disabled - Disables batching for index updates. Default is False.
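
The defaults listed above can be summarized in code. The sketch below uses two hypothetical dataclasses, HnswDefaults and BatchingDefaults, that only mirror the documented default values. They are not the client's own types, and they are meant as a quick reference for what you get when a field is left unset.

```python
from dataclasses import dataclass, field

# Hypothetical containers that mirror the documented defaults.
# They are NOT the AVS client's types; use the client's own classes in real code.

@dataclass
class BatchingDefaults:
    max_records: int = 10000   # maximum records per batch
    interval: int = 10000      # maximum wait in milliseconds before finalizing a batch
    disabled: bool = False     # batching is enabled by default

@dataclass
class HnswDefaults:
    m: int = 16                 # bi-directional links per level
    ef: int = 100               # candidate list size during search
    ef_construction: int = 100  # candidate list size during construction
    batching_params: BatchingDefaults = field(default_factory=BatchingDefaults)

# Overriding only the HNSW values leaves the batching defaults untouched.
tuned = HnswDefaults(m=32, ef=400, ef_construction=200)
print(tuned)
```

Raising m, ef, or ef_construction trades speed for recall as described above, so it can help to change one value at a time and measure the effect.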

Example index creation

The following example creates an index using the Python admin client. For more details, see the Python API documentation.

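# Assumes the types module comes from the aerospike_vector_search package
# and that avs_admin_client is an already-connected admin client.
from aerospike_vector_search import types

avs_admin_client.index_create(
    namespace="test",
    name="search-space",
    vector_field="image_embedding",
    dimensions=8,
    vector_distance_metric=types.VectorDistanceMetric.COSINE,
    sets="index-set",
    index_params=types.HnswParams(
        m=32,
        ef_construction=200,
        ef=400,
    ),
    index_meta_data={"model-used": "CLIP"},
)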
note

When creating an index, you must define the Aerospike namespace where the data will be stored. Indexes can grow quite large and take time to build. For more details about namespace and set configurations on an index, see Configure Aerospike Database for AVS.
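
Some of the constraints from the Required configuration section can be checked locally before calling index_create. The following is a minimal sketch: check_index_args is a hypothetical helper, not part of the AVS client, and it takes the metric as a plain string purely for illustration. It cannot verify that the namespace actually exists in your cluster.

```python
# Hypothetical pre-flight check; not part of the AVS client.
ALLOWED_METRICS = {"SQUARED_EUCLIDEAN", "COSINE", "DOT_PRODUCT", "MANHATTAN", "HAMMING"}
MAX_VECTOR_FIELD_LENGTH = 15  # limit stated in the caution above

def check_index_args(namespace, name, vector_field, dimensions, metric):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    if not namespace:
        problems.append("namespace is required")
    if not name:
        problems.append("name is required")
    if not vector_field:
        problems.append("vector_field is required")
    elif len(vector_field) > MAX_VECTOR_FIELD_LENGTH:
        problems.append(
            f"vector_field {vector_field!r} has {len(vector_field)} characters; "
            f"the limit is {MAX_VECTOR_FIELD_LENGTH}"
        )
    if not isinstance(dimensions, int) or dimensions <= 0:
        problems.append("dimensions must be a positive integer")
    if metric not in ALLOWED_METRICS:
        problems.append(f"unknown vector_distance_metric {metric!r}")
    return problems

# The values from the example above pass; a longer field name does not.
print(check_index_args("test", "search-space", "image_embedding", 8, "COSINE"))
print(check_index_args("test", "search-space", "image_embedding_v2", 8, "COSINE"))
```

Note that image_embedding is exactly 15 characters long, so it sits right at the limit.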

Adding records to your index

To make your records searchable, you need to specify the appropriate set and vector field when upserting records. For example, to add vector records to the example index created in the previous section, specify the vector field as image_embedding.

Upsert example:

# Assumes avs_client is an already-connected AVS client.
avs_client.upsert(
    namespace="test",
    key="b24f7e3a-9c38-4b0e-b5e8-d6f5b8f0a921",
    record_data={
        # Vector field must match the vector_field defined on the index
        "image_embedding": [1, 2, 3, 4, 5, 6, 7, 8],

        # Optional vector metadata
        "image_path": "b24f7e3a-9c38-4b0e-b5e8-d6f5b8f0a921.jpg",
        "map": {"a": "A", "inlist": [1, 2, 3]},
        "list": ["a", 1, "c", {"a": "A"}],
    },

    # Optional set specification
    set_name="index-set",
)
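
Because a record is only searchable when its vector is stored under the index's vector field, and the example index was created with dimensions=8, it can be useful to build record_data through a small helper that checks both. The sketch below is one possible approach: build_record, VECTOR_FIELD, and INDEX_DIMENSIONS are hypothetical names, not part of the AVS client, and the values are taken from the example index above.

```python
# Hypothetical helper; values mirror the example index created earlier.
VECTOR_FIELD = "image_embedding"
INDEX_DIMENSIONS = 8

def build_record(vector, metadata=None):
    """Return a record_data dict with the vector stored under the index's vector field."""
    if len(vector) != INDEX_DIMENSIONS:
        raise ValueError(
            f"expected {INDEX_DIMENSIONS} dimensions, got {len(vector)}"
        )
    record = dict(metadata or {})        # copy so the caller's dict is not modified
    record[VECTOR_FIELD] = list(vector)  # vector goes under the indexed field name
    return record

record_data = build_record(
    [1, 2, 3, 4, 5, 6, 7, 8],
    {"image_path": "b24f7e3a-9c38-4b0e-b5e8-d6f5b8f0a921.jpg"},
)
print(record_data)
```

The resulting dictionary can then be passed as record_data to the upsert call shown above.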