Use the Python client
AVS provides a gRPC API and a Python client that developers can use to build AI applications on top of its vector search capability. This page describes the basic usage of the AVS Python client, including writing, reading, and searching for vectors with AVS.
You can try all of the example code on this page using this interactive Jupyter notebook.
Prerequisites
- Python 3.9 or higher
- pip 9.0.1 or higher
- A running AVS deployment (see Install AVS)
Importing Python clients
Install the AVS package from PyPI:
pip install aerospike-vector-search
The client package provides two separate clients:
avs_client performs database operations with vector data. The client supports Hierarchical Navigable Small World (HNSW) vector searches, allowing users to find vectors similar to a given query vector within an index.
avs_admin_client conducts AVS administrative operations, such as creating indexes, querying index information, and dropping indexes.
The client package also provides a types module that contains the classes needed to interact with the client APIs.
from aerospike_vector_search import types
from aerospike_vector_search import AdminClient, Client
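The examples on this page reference AVS_HOST and AVS_PORT without defining them. One possible way to supply them is to read them from the environment, as in the sketch below. The environment variable names and the fallback values (localhost and 5000) are assumptions for illustration; use the address of your own AVS deployment.

```python
import os

# Assumption: the variable names and defaults below are illustrative only.
# Point them at the host and port where your AVS deployment is listening.
AVS_HOST = os.environ.get("AVS_HOST", "localhost")
AVS_PORT = int(os.environ.get("AVS_PORT", "5000"))

print(f"connecting to AVS at {AVS_HOST}:{AVS_PORT}")
```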
Creating a Vector admin client
Initialize a new client by providing one or more seed hosts to which the client can connect.
# Admin client configuration
# LISTENER_NAME corresponds to the AVS advertised_listener config.
# https://aerospike.com/docs/vector/operate/configuration#advertised-listener
# This is often needed when connecting to AVS clusters in the cloud.
LISTENER_NAME = None
# LOAD_BALANCED is True if the AVS cluster is load balanced.
# Using a load balancer with AVS is best practice. Setting this to True
# also works with a single-node AVS cluster that is not load balanced.
LOAD_BALANCED = True
admin_client = AdminClient(
    seeds=types.HostPort(host=AVS_HOST, port=AVS_PORT),
    listener_name=LISTENER_NAME,
    is_loadbalancer=LOAD_BALANCED,
)
Once initialized, admin_client is ready for use.
Creating an index using the admin client
To search across a set of vectors, you need to create an index associated with those vectors. AVS uses an index to traverse the HNSW neighborhoods to perform queries.
See Manage AVS indexes for details about creating an index.
Example:
# Index creation arguments
# NAMESPACE is the namespace that the indexed data will be stored in
NAMESPACE = "test"
# INDEX_NAME is the name of the HNSW index to create
INDEX_NAME = "basic_index"
# VECTOR_FIELD is the Aerospike record bin that stores its vector data
# The created index will use the data in this bin to perform nearest neighbor searches etc
VECTOR_FIELD = "vector"
# DIMENSIONS is the dimensionality of the vectors
DIMENSIONS = 2
try:
    print("creating index")
    admin_client.index_create(
        namespace=NAMESPACE,
        name=INDEX_NAME,
        vector_field=VECTOR_FIELD,
        dimensions=DIMENSIONS,
    )
except Exception as e:
    # Report the failure (for example, if the index already exists) and continue
    print("failed creating index " + str(e))
Creating a Vector client
Initialize a new client by providing one or more seed hosts to which the client can connect.
client = Client(
    seeds=types.HostPort(host=AVS_HOST, port=AVS_PORT),
    listener_name=LISTENER_NAME,
    is_loadbalancer=LOAD_BALANCED,
)
Once initialized, client is ready for use.
Adding vector entries
Vectors must exist in AVS before searches can be performed.
To insert records, use the upsert method and specify the following values when writing a record:
- namespace - Namespace in which the index exists.
- key - Primary identifier for your record.
- record_data - Map of any data you want to associate with your vector.
- set_name (optional) - Set in which to place the record.
The following code inserts ten records, each containing a two-dimensional vector:
# SET_NAME is the Aerospike set to write the records to
SET_NAME = "basic-set"
print("inserting vectors")
for i in range(10):
    key = "r" + str(i)
    client.upsert(
        namespace=NAMESPACE,
        set_name=SET_NAME,
        key=key,
        record_data={
            "url": f"http://host.com/data{i}",
            "vector": [i * 1.0, i * 1.0],
            "map": {"a": "A", "inlist": [1, 2, 3]},
            "list": ["a", 1, "c", {"a": "A"}],
        },
    )
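Each vector written to the VECTOR_FIELD bin should have as many elements as the DIMENSIONS value used when the index was created. The sketch below shows one way to check this on the client side before calling upsert. build_record is a hypothetical helper, not part of the AVS client, and the two constants repeat the values from the index creation example.

```python
VECTOR_FIELD = "vector"  # same bin name as in the index creation example
DIMENSIONS = 2           # same dimensionality as in the index creation example


def build_record(vector, **extra_bins):
    """Return a record_data dict, rejecting vectors of the wrong length.

    Hypothetical helper for illustration; not part of the AVS client.
    """
    values = [float(v) for v in vector]
    if len(values) != DIMENSIONS:
        raise ValueError(f"expected {DIMENSIONS} dimensions, got {len(values)}")
    record = dict(extra_bins)
    record[VECTOR_FIELD] = values
    return record


record = build_record([3, 3], url="http://host.com/data3")
print(record)
```

The resulting dictionary can be passed as the record_data argument of upsert.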
Waiting for index construction
After you insert vectors, AVS needs some time to build the index. Until the index is complete, vector search results may be inaccurate. If you are running a batch job and want confirmation that index construction is complete, you can do the following:
print("waiting for indexing to complete")
client.wait_for_index_completion(namespace=NAMESPACE, name=INDEX_NAME)
Waiting for the index to complete may provide more accurate search results.
Checking if a vector is indexed
Alternatively, you can check individual records to see if they have completed indexing.
status = client.is_indexed(
    namespace=NAMESPACE,
    set_name=SET_NAME,
    key=key,
    index_name=INDEX_NAME,
)
print("indexed: ", status)
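If you need to block until a specific record is indexed, a per-record check like the one above can be wrapped in a polling loop with a timeout. The following is a minimal sketch. wait_until is a hypothetical helper, not part of the AVS client, and the timeout and interval values are arbitrary.

```python
import time


def wait_until(predicate, timeout_s=30.0, interval_s=0.5):
    """Call predicate() repeatedly until it returns True or the timeout expires.

    Returns True if the predicate succeeded, False if the timeout was reached.
    Hypothetical helper for illustration; not part of the AVS client.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Intended usage with the client from this page (not executed here):
# indexed = wait_until(
#     lambda: client.is_indexed(
#         namespace=NAMESPACE, set_name=SET_NAME, key=key, index_name=INDEX_NAME
#     )
# )
```

A bounded wait like this keeps a batch job from hanging indefinitely if a record is never indexed.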
Searching
After vectors have been indexed, you can begin searching them by providing a vector for search. This generally entails running your machine learning model on user input, and then performing a search using the generated embedding.
print("querying")
for i in range(10):
    print("  query " + str(i))
    results = client.vector_search(
        namespace=NAMESPACE,
        index_name=INDEX_NAME,
        query=[i * 1.0, i * 1.0],
        limit=3,
    )
    for result in results:
        print(str(result.key.key) + " -> " + str(result.fields))
Results are returned as a list of nearest neighbors. Loop through the results to extract the properties your application needs:
for result in results:
    print(str(result.key.key) + " -> " + str(result.fields))
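On a dataset as small as the ten vectors inserted earlier, you can sanity-check search output against a brute-force scan. The sketch below runs in plain Python without AVS. It assumes squared Euclidean distance, while the metric AVS actually uses depends on how the index was created. brute_force_neighbors is a hypothetical helper, not part of the AVS client.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_neighbors(vectors, query, limit):
    """Return the `limit` closest (key, distance) pairs by exhaustive scan.

    Hypothetical helper for illustration; not part of the AVS client.
    """
    ranked = sorted(
        vectors.items(),
        key=lambda item: squared_euclidean(item[1], query),
    )
    return [(key, squared_euclidean(vec, query)) for key, vec in ranked[:limit]]


# The same ten vectors that the upsert example writes: r0 -> [0, 0], r1 -> [1, 1], ...
vectors = {f"r{i}": [i * 1.0, i * 1.0] for i in range(10)}
neighbors = brute_force_neighbors(vectors, [8.0, 8.0], limit=3)
for key, distance in neighbors:
    print(key, distance)
```

HNSW is an approximate search, so results on large datasets can differ from an exhaustive scan. On a dataset this small the two would normally agree.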
Get vector data
You can read a record from AVS using the following:
key = "r0"
result = client.get(
    namespace=NAMESPACE,
    key=key,
    set_name=SET_NAME,
)
print(str(result.key.key) + " -> " + str(result.fields))
AVS Python Client using Asyncio
The aerospike-vector-search package provides an aio module with asynchronous clients whose methods are coroutines. The asynchronous clients are initialized in the same way as the synchronous clients. To convert the code examples, add await in front of each client method call:
from aerospike_vector_search.aio import Client as asyncClient
async_client = asyncClient(
    seeds=types.HostPort(host=AVS_HOST, port=AVS_PORT),
    listener_name=LISTENER_NAME,
    is_loadbalancer=LOAD_BALANCED,
)
# Use await on client methods to await completion of the coroutine
results = await async_client.vector_search(
    namespace=NAMESPACE,
    index_name=INDEX_NAME,
    query=[8.0, 8.0],
    limit=3,
)
for result in results:
    print(str(result.key.key) + " -> " + str(result.fields))
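A bare await like the one above works in environments that already run an event loop, such as a Jupyter notebook. In a regular script, the calls need to live inside a coroutine that is started with asyncio.run. The sketch below shows that structure and issues several searches concurrently with asyncio.gather. StubAsyncClient is a hypothetical stand-in so that the example runs without an AVS deployment. In real code you would pass the asynchronous client created above instead.

```python
import asyncio


class StubAsyncClient:
    """Hypothetical stand-in for the asynchronous AVS client.

    It only mimics the shape of vector_search so this sketch runs offline.
    """

    async def vector_search(self, *, namespace, index_name, query, limit):
        await asyncio.sleep(0)  # yield control, as a network call would
        return [f"{index_name}:{query}:{n}" for n in range(limit)]


async def run_queries(client, queries):
    """Issue one search per query vector and wait for all of them."""
    tasks = [
        client.vector_search(
            namespace="test",
            index_name="basic_index",
            query=query,
            limit=3,
        )
        for query in queries
    ]
    return await asyncio.gather(*tasks)


# asyncio.run starts an event loop, runs the coroutine, and closes the loop.
all_results = asyncio.run(run_queries(StubAsyncClient(), [[1.0, 1.0], [8.0, 8.0]]))
for results in all_results:
    print(results)
```

asyncio.gather returns the results in the same order as the queries that were passed in.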
Read the Docs
For details about using the Python client, visit our Read the Docs page.