Secondary indexes
Secondary indexes enable efficient queries on bin values beyond the primary key.
What is a secondary index?
By default, Aerospike retrieves records by their primary key (digest). A secondary index lets you query records by the value of a specific bin.
| Query type | Behavior |
|---|---|
| Single primary key lookup | key → record (O(1), always fast) |
| Multiple primary key lookups | keys → records (done in parallel across servers, usually fast) |
| Secondary index query | bin value → records (requires index) |
| Primary index query | (scans all records in parallel, O(N), can be slow) |
Index types
| Type | Bin Data Type | Query Operations |
|---|---|---|
IndexType.STRING / IndexTypeEnum.STRING | String | Equality (==) |
IndexType.INTEGER / IndexTypeEnum.NUMERIC | Integer | Equality and range comparisons (<, <=, >, >=) |
When to use secondary indexes
✅ Good use cases:
- Filtering by status, category, or type fields
- Range queries on timestamps or numeric IDs
- Geospatial queries (find nearby)
❌ Avoid when:
- High cardinality (millions of unique values)
- Zero or one result will be returned (eg alternate id for a record)
- Frequently updated bins
- Can use primary key lookup instead
Creating an index
Note: Creating indexes in code in production environments is often an anti-pattern due to resource consumption. Use tools like asadm to manage indexes in higher environments instead.
import com.aerospike.client.sdk.DataSet;import com.aerospike.client.sdk.query.IndexCollectionType;import com.aerospike.client.sdk.query.IndexType;
DataSet users = DataSet.of("test", "users");
// Create an integer index on the "age" binsession.createIndex(users, "age_idx", "age", IndexType.INTEGER, IndexCollectionType.DEFAULT) .waitTillComplete();
// Create a string index on the "status" binsession.createIndex(users, "status_idx", "status", IndexType.STRING, IndexCollectionType.DEFAULT) .waitTillComplete();from aerospike_sdk import DataSet
users = DataSet.of("test", "users")
# Create an integer index on the "age" binawait ( session.index(dataset=users) .on_bin("age") .named("age_idx") .numeric() .create())
# Create a string index on the "status" binawait ( session.index(dataset=users) .on_bin("status") .named("status_idx") .string() .create())Querying with indexes
Once an index exists, matching AEL predicates can use it:
import com.aerospike.client.sdk.Record;import com.aerospike.client.sdk.RecordResult;import com.aerospike.client.sdk.RecordStream;
// This query uses the "age_idx" secondary indexRecordStream stream = session.query(users) .where("$.age > 21") .execute();stream.forEach((RecordResult result) -> { if (result.isOk()) { Record row = result.recordOrThrow(); if (row != null) { // Process row (for example, row.getString("name")) } }});stream.forEach(result -> { Record row = result.recordOrThrow(); // Process row (for example, row.getString("name")) // The row will not be null (empty rows not returned by default)});# This query uses the "age_idx" secondary indexstream = await session.query(users).where("$.age > 21").execute()async for row in stream: record = row.record_or_raise() pass # Process record.bins, etc.stream.close()Complete example
import com.aerospike.client.sdk.AerospikeException;import com.aerospike.client.sdk.Cluster;import com.aerospike.client.sdk.ClusterDefinition;import com.aerospike.client.sdk.DataSet;import com.aerospike.client.sdk.Record;import com.aerospike.client.sdk.RecordResult;import com.aerospike.client.sdk.RecordStream;import com.aerospike.client.sdk.Session;import com.aerospike.client.sdk.policy.Behavior;import com.aerospike.client.sdk.query.IndexCollectionType;import com.aerospike.client.sdk.query.IndexType;
public class SecondaryIndexExample { public static void main(String[] args) { try (Cluster cluster = new ClusterDefinition("localhost", 3000).connect()) { Session session = cluster.createSession(Behavior.DEFAULT); DataSet users = DataSet.of("test", "users"); String k1 = "sidx-example-1"; String k2 = "sidx-example-2"; String k3 = "sidx-example-3"; String ageIndex = "age_idx_demo"; String statusIndex = "status_idx_demo";
// Cleanup so the example is repeatable. session.delete(users.ids(k1, k2, k3)).execute().close(); try { session.dropIndex(users, ageIndex).waitTillComplete(); } catch (AerospikeException ignored) { // Index may not exist yet. } try { session.dropIndex(users, statusIndex).waitTillComplete(); } catch (AerospikeException ignored) { // Index may not exist yet. }
// Seed sample data. session.insert(users) .bins("name", "age", "status") .id(k1).values("Alice", 28, "sidx_active") .id(k2).values("Bob", 42, "sidx_active") .id(k3).values("Carol", 19, "sidx_inactive") .execute();
// Create secondary indexes. session.createIndex(users, ageIndex, "age", IndexType.INTEGER, IndexCollectionType.DEFAULT) .waitTillComplete(); session.createIndex(users, statusIndex, "status", IndexType.STRING, IndexCollectionType.DEFAULT) .waitTillComplete();
// Query using age index. System.out.println("sidx_active users age >= 30:"); RecordStream ageQuery = session.query(users) .where("$.status == 'sidx_active' and $.age >= 30") .readingOnlyBins("name", "age") .execute(); ageQuery.forEach(result -> { Record user = result.recordOrThrow(); System.out.println(" - " + user.getString("name") + " (" + user.getInt("age") + ")"); });
// Query using status index. System.out.println("\nsidx_active users:"); RecordStream statusQuery = session.query(users) .where("$.status == 'sidx_active'") .readingOnlyBins("name", "status") .execute(); statusQuery.forEach(result -> { Record user = result.recordOrThrow(); System.out.println(" - " + user.getString("name")); }); } }}import asyncio
from aerospike_sdk import Behavior, Client, DataSet
async def main(): async with Client("localhost:3000") as client: session = client.create_session(Behavior.DEFAULT) users = DataSet.of("test", "users") k1 = users.id("sidx-example-1") k2 = users.id("sidx-example-2") k3 = users.id("sidx-example-3") age_index = "age_idx_demo" status_index = "status_idx_demo"
# Cleanup so the example is repeatable. stream = await session.delete(key=k1).execute() stream.close() stream = await session.delete(key=k2).execute() stream.close() stream = await session.delete(key=k3).execute() stream.close()
# Seed sample data. await session.insert(key=k1).put({"name": "Alice", "age": 28, "status": "sidx_active"}).execute() await session.insert(key=k2).put({"name": "Bob", "age": 42, "status": "sidx_active"}).execute() await session.insert(key=k3).put({"name": "Carol", "age": 19, "status": "sidx_inactive"}).execute()
# Create secondary indexes. If an index already exists, create() may raise. try: await session.index(dataset=users).on_bin("age").named(age_index).numeric().create() except Exception: pass try: await session.index(dataset=users).on_bin("status").named(status_index).string().create() except Exception: pass
# Query using age index. print("sidx_active users age >= 30:") stream = await ( session.query(users) .where("$.status == 'sidx_active' and $.age >= 30") .bins(["name", "age"]) .execute() ) async for row in stream: record = row.record_or_raise() print(f" - {record.bins.get('name')} ({record.bins.get('age')})") stream.close()
# Query using status index. print("\nsidx_active users:") stream = await ( session.query(users) .where("$.status == 'sidx_active'") .bins(["name", "status"]) .execute() ) async for row in stream: record = row.record_or_raise() print(f" - {record.bins.get('name')}") stream.close()
if __name__ == "__main__": asyncio.run(main())Performance considerations
- Indexes consume memory on every node
- Index updates add write latency
- Queries without indexes scan entire set (slow)
- Monitor index memory usage in production