AVS search caching

To improve search performance, the neighborhoods for a search are cached on the search nodes. This means that similar searches (searches that target the same neighborhoods) can be served from the cache rather than making a round trip to the storage layer. This caching approach is based on locality: it uses the geometric distance between neighborhoods, and is referred to as "geometric caching."

In addition to geometric caching, Aerospike Vector Search (AVS) directs searches to the node most likely to have the index cached. This prevents additional network hops on queries and further improves performance.

Geometric allocation

The HNSW index is cached on AVS nodes based on the location of each neighborhood layer, using a k-means distribution algorithm known as the Voronoi technique. This establishes "centroids" for different areas of the cache, so that neighborhoods of the index that are close to one another are cached together. With this approach, the cache distribution can scale effectively both horizontally and vertically.
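The allocation idea can be illustrated with a minimal sketch of nearest-centroid (Voronoi) assignment. This is not the AVS implementation: the node names, centroid coordinates, neighborhood ids, and the use of 2-D Euclidean distance are all assumptions chosen to keep the example small.

```python
import math

# Hypothetical centroids, one per cache region (names and coordinates are made up).
CENTROIDS = {
    "node-a": (0.0, 0.0),
    "node-b": (10.0, 0.0),
    "node-c": (0.0, 10.0),
}

def nearest_centroid(point, centroids=CENTROIDS):
    """Return the name of the centroid closest to `point` (Euclidean distance)."""
    return min(centroids, key=lambda name: math.dist(point, centroids[name]))

def allocate(neighborhoods, centroids=CENTROIDS):
    """Group neighborhood ids by the Voronoi cell (nearest centroid) they fall in."""
    cells = {name: [] for name in centroids}
    for nid, position in neighborhoods.items():
        cells[nearest_centroid(position, centroids)].append(nid)
    return cells

# Hypothetical neighborhoods, each represented by a single 2-D position.
neighborhoods = {
    "n1": (1.0, 1.0),
    "n2": (9.0, 1.0),
    "n3": (0.5, 8.0),
    "n4": (2.0, 0.5),
}
cells = allocate(neighborhoods)
print(cells)
```

In this sketch, neighborhoods that sit near the same centroid end up in the same cell, which is the property that lets nearby parts of the index be cached together.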

tip

By default, AVS caches only the HNSW index records. For optimal performance, you can also configure caching of vector records.

Query steering

To take advantage of the geometric distribution of the HNSW caching layers, each AVS node is aware of the best location in the index to perform a given query. When performing an HNSW traversal, an AVS node routes the query to the node that is most likely to have the relevant part of the index cached.
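A rough sketch of the routing decision, under stated assumptions, might look like the following. The `Node` class, the `proxy_out` and `served` counters, and the two-node layout are hypothetical and only illustrate the idea of forwarding a query to the node whose centroid is nearest; they do not reflect AVS internals or its metric names.

```python
import math

# Hypothetical cluster layout: each node owns the index region around one centroid.
CENTROIDS = {"node-a": (0.0, 0.0), "node-b": (10.0, 0.0)}

class Node:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster   # shared dict of node name -> Node
        self.proxy_out = 0       # queries this node forwarded elsewhere
        self.served = 0          # queries this node answered itself
        cluster[name] = self

    def owner_of(self, query):
        """Pick the node whose centroid is closest to the query vector."""
        return min(CENTROIDS, key=lambda n: math.dist(query, CENTROIDS[n]))

    def handle(self, query):
        """Serve the query locally if this node owns it, otherwise forward it."""
        owner = self.owner_of(query)
        if owner == self.name:
            self.served += 1
            return self.name
        self.proxy_out += 1
        return self.cluster[owner].handle(query)

cluster = {}
a = Node("node-a", cluster)
b = Node("node-b", cluster)

# All three queries arrive at node-a; two of them are closer to node-b's centroid.
served_by = [a.handle(q) for q in [(1.0, 0.5), (9.0, 1.0), (8.5, -1.0)]]
print(served_by, "proxied by node-a:", a.proxy_out)
```

Each forwarded query costs one hop between nodes, but it lands on the node that is most likely to hold the relevant neighborhoods in its cache.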

note

Query steering occurs whether or not you use a load balancer. For details about using a load balancer, see our system diagram.

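Query steering allows AVS to improve performance and throughput across the nodes it already has, without having to add more nodes, because each query runs where the relevant part of the index is most likely to be cached.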

tip

You can monitor query steering by filtering for proxy-out requests on an AVS node.

All in-memory caching

For optimal performance, we recommend that you size your cache so that the cache hit rate approaches 100%. Whether this is practical depends on the nature of your dataset and data pipeline; it is best suited to smaller datasets that are relatively static. To achieve this, set a large cache size and a long or infinite expiration.
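The effect of cache size on hit rate can be seen in a small simulation. This is an illustrative sketch only: it assumes a least-recently-used eviction policy and a fixed, repeatedly scanned working set, neither of which is stated to be how AVS behaves, and it does not model expiration.

```python
from collections import OrderedDict

def hit_rate(accesses, capacity):
    """Replay `accesses` against an LRU cache of `capacity` entries; return the hit ratio."""
    cache = OrderedDict()
    hits = 0
    for key in accesses:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(accesses)

# Hypothetical static working set: 100 neighborhoods scanned in order, 10 times.
accesses = [i % 100 for i in range(1000)]

undersized = hit_rate(accesses, capacity=50)   # cache holds half the working set
full = hit_rate(accesses, capacity=100)        # cache holds the whole working set
print(undersized, full)
```

With the undersized cache, every entry is evicted before it is requested again, so nothing is ever served from the cache. Once the cache holds the whole working set, only the first pass misses, and the hit rate keeps climbing toward 100% as more passes are made over the same data.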

For details about cache configuration options, see Configuring AVS.