Secondary indexes

For the complete documentation index see: llms.txt

All documentation pages available in markdown.

Secondary indexes enable efficient queries on bin values beyond the primary key.

What is a secondary index?

By default, Aerospike retrieves records by their primary key (digest). A secondary index lets you query records by the value of a specific bin.

Query type	Behavior
Single primary key lookup	key → record (O(1), always fast)
Multiple primary key lookups	keys → records (done in parallel across servers, usually fast)
Secondary index query	bin value → records (requires index)
Primary index query	(scans all records in parallel, O(N), can be slow)

Index types

Type	Bin data type	Query operations
`IndexType.STRING` / `IndexTypeEnum.STRING`	String	Equality (`==`)
`IndexType.INTEGER` / `IndexTypeEnum.NUMERIC`	Integer	Equality and range comparisons (`<`, `<=`, `>`, `>=`)
`IndexType.GEO2DSPHERE` / `IndexTypeEnum.GEO2DSPHERE`	GeoJSON	Within radius, within polygon (`geoCompare` in AEL)

When to use secondary indexes

✅ Good use cases:

Filtering by status, category, or type fields
Range queries on timestamps or numeric IDs
Geospatial queries (find nearby)

❌ Avoid when:

High cardinality (millions of unique values)
Zero or one result will be returned (eg alternate id for a record)
Frequently updated bins
Can use primary key lookup instead

Creating an index

Note: Creating indexes in code in production environments is often an anti-pattern due to resource consumption. Use tools like asadm to manage indexes in higher environments instead.

Java
Python

import com.aerospike.client.sdk.DataSet;
import com.aerospike.client.sdk.query.IndexCollectionType;
import com.aerospike.client.sdk.query.IndexType;

DataSet users = DataSet.of("test", "users");

// Create an integer index on the "age" bin
session.createIndex(users, "age_idx", "age", IndexType.INTEGER, IndexCollectionType.DEFAULT)
    .waitTillComplete();

// Create a string index on the "status" bin
session.createIndex(users, "status_idx", "status", IndexType.STRING, IndexCollectionType.DEFAULT)
    .waitTillComplete();

// Create a GEO2DSPHERE index on a GeoJSON bin
DataSet places = DataSet.of("test", "places");
session.createIndex(places, "places_loc_idx", "loc", IndexType.GEO2DSPHERE, IndexCollectionType.DEFAULT)
    .waitTillComplete();

📖 API reference: DataSet.of(...) | Session.createIndex(...)

from aerospike_sdk import DataSet

users = DataSet.of("test", "users")

# Create an integer index on the "age" bin
await (
    session.index(dataset=users)
    .on_bin("age")
    .named("age_idx")
    .numeric()
    .create()
)

# Create a string index on the "status" bin
await (
    session.index(dataset=users)
    .on_bin("status")
    .named("status_idx")
    .string()
    .create()
)

# Create a GEO2DSPHERE index on a GeoJSON bin
places = DataSet.of("test", "places")
await (
    session.index(dataset=places)
    .on_bin("loc")
    .named("places_loc_idx")
    .geo2dsphere()
    .create()
)

📖 API reference: DataSet.of() | Session.index()

Querying with indexes

Once an index exists, matching AEL predicates can use it:

Java
Python

import com.aerospike.client.sdk.Record;
import com.aerospike.client.sdk.RecordResult;
import com.aerospike.client.sdk.RecordStream;

// This query uses the "age_idx" secondary index
RecordStream stream = session.query(users)
    .where("$.age > 21")
    .execute();
stream.forEach((RecordResult result) -> {
    if (result.isOk()) {
        Record row = result.recordOrThrow();
        if (row != null) {
            // Process row (for example, row.getString("name"))
        }
    }
});
stream.forEach(result -> {
        Record row = result.recordOrThrow();
        // Process row (for example, row.getString("name"))
        // The row will not be null (empty rows not returned by default)
});

📖 API reference: Session.query(DataSet) | ChainableQueryBuilder.where(...) | ChainableQueryBuilder.execute() | RecordStream.forEach(...) | RecordResult.isOk() | RecordResult.recordOrThrow() | Record.getString(...)

# This query uses the "age_idx" secondary index
stream = await session.query(users).where("$.age > 21").execute()
async for row in stream:
    record = row.record_or_raise()
    pass  # Process record.bins, etc.
stream.close()

📖 API reference: Session.query() | QueryBuilder.where() | RecordResult.record_or_raise() | RecordStream.close() | QueryBuilder.execute()

import com.aerospike.client.sdk.AerospikeException;
import com.aerospike.client.sdk.Cluster;
import com.aerospike.client.sdk.ClusterDefinition;
import com.aerospike.client.sdk.DataSet;
import com.aerospike.client.sdk.Record;
import com.aerospike.client.sdk.RecordResult;
import com.aerospike.client.sdk.RecordStream;
import com.aerospike.client.sdk.Session;
import com.aerospike.client.sdk.policy.Behavior;
import com.aerospike.client.sdk.query.IndexCollectionType;
import com.aerospike.client.sdk.query.IndexType;

public class SecondaryIndexExample {
    public static void main(String[] args) {
        try (Cluster cluster = new ClusterDefinition("localhost", 3000).connect()) {
            Session session = cluster.createSession(Behavior.DEFAULT);
            DataSet users = DataSet.of("test", "users");
            String k1 = "sidx-example-1";
            String k2 = "sidx-example-2";
            String k3 = "sidx-example-3";
            String ageIndex = "age_idx_demo";
            String statusIndex = "status_idx_demo";

            // Cleanup so the example is repeatable.
            session.delete(users.ids(k1, k2, k3)).execute().close();
            try {
                session.dropIndex(users, ageIndex).waitTillComplete();
            } catch (AerospikeException ignored) {
                // Index may not exist yet.
            }
            try {
                session.dropIndex(users, statusIndex).waitTillComplete();
            } catch (AerospikeException ignored) {
                // Index may not exist yet.
            }

            // Seed sample data.
            session.insert(users)
                .bins("name", "age", "status")
                .id(k1).values("Alice", 28, "sidx_active")
                .id(k2).values("Bob", 42, "sidx_active")
                .id(k3).values("Carol", 19, "sidx_inactive")
                .execute();

            // Create secondary indexes.
            session.createIndex(users, ageIndex, "age", IndexType.INTEGER, IndexCollectionType.DEFAULT)
                .waitTillComplete();
            session.createIndex(users, statusIndex, "status", IndexType.STRING, IndexCollectionType.DEFAULT)
                .waitTillComplete();

            // Query using age index.
            System.out.println("sidx_active users age >= 30:");
            RecordStream ageQuery = session.query(users)
                .where("$.status == 'sidx_active' and $.age >= 30")
                .readingOnlyBins("name", "age")
                .execute();
            ageQuery.forEach(result -> {
                Record user = result.recordOrThrow();
                System.out.println("  - " + user.getString("name") + " (" + user.getInt("age") + ")");
            });

            // Query using status index.
            System.out.println("\nsidx_active users:");
            RecordStream statusQuery = session.query(users)
                .where("$.status == 'sidx_active'")
                .readingOnlyBins("name", "status")
                .execute();
            statusQuery.forEach(result -> {
                Record user = result.recordOrThrow();
                System.out.println("  - " + user.getString("name"));
            });
        }
    }
}

📖 API reference: ClusterDefinition(String,int) | ClusterDefinition.connect() | Cluster.createSession(Behavior) | Cluster.close() | DataSet.of(...) | DataSet.ids(...) | Session.insert(DataSet) | Session.delete(List) | Session.delete(Key) | Session.createIndex(...) | Session.dropIndex(...) | Session.query(DataSet) | OperationObjectBuilder.bins(...) | IdValuesBuilder.id(...) | IdValuesRowBuilder.values(...) | ChainableQueryBuilder.where(...) | ChainableQueryBuilder.readingOnlyBins(...) | ChainableQueryBuilder.execute() | ChainableNoBinsBuilder.execute() | RecordStream.forEach(...) | RecordStream.close() | RecordResult.recordOrThrow() | Record.getString(...) | Record.getInt(...) | AerospikeException

import asyncio

from aerospike_sdk import Behavior, Client, DataSet


async def main():
    async with Client("localhost:3000") as client:
        session = client.create_session(Behavior.DEFAULT)
        users = DataSet.of("test", "users")
        k1 = users.id("sidx-example-1")
        k2 = users.id("sidx-example-2")
        k3 = users.id("sidx-example-3")
        age_index = "age_idx_demo"
        status_index = "status_idx_demo"

        # Cleanup so the example is repeatable.
        stream = await session.delete(key=k1).execute()
        stream.close()
        stream = await session.delete(key=k2).execute()
        stream.close()
        stream = await session.delete(key=k3).execute()
        stream.close()

        # Seed sample data.
        await session.insert(key=k1).put({"name": "Alice", "age": 28, "status": "sidx_active"}).execute()
        await session.insert(key=k2).put({"name": "Bob", "age": 42, "status": "sidx_active"}).execute()
        await session.insert(key=k3).put({"name": "Carol", "age": 19, "status": "sidx_inactive"}).execute()

        # Create secondary indexes. If an index already exists, create() may raise.
        try:
            await session.index(dataset=users).on_bin("age").named(age_index).numeric().create()
        except Exception:
            pass
        try:
            await session.index(dataset=users).on_bin("status").named(status_index).string().create()
        except Exception:
            pass

        # Query using age index.
        print("sidx_active users age >= 30:")
        stream = await (
            session.query(users)
            .where("$.status == 'sidx_active' and $.age >= 30")
            .bins(["name", "age"])
            .execute()
        )
        async for row in stream:
            record = row.record_or_raise()
            print(f"  - {record.bins.get('name')} ({record.bins.get('age')})")
        stream.close()

        # Query using status index.
        print("\nsidx_active users:")
        stream = await (
            session.query(users)
            .where("$.status == 'sidx_active'")
            .bins(["name", "status"])
            .execute()
        )
        async for row in stream:
            record = row.record_or_raise()
            print(f"  - {record.bins.get('name')}")
        stream.close()


if __name__ == "__main__":
    asyncio.run(main())

📖 API reference: Client | Client.create_session() | DataSet.of() | DataSet.id() | Behavior.DEFAULT | Session.query() | Session.insert() | Session.delete() | Session.index() | QueryBuilder.where() | QueryBuilder.bins() | WriteSegmentBuilder.put() | RecordResult.record_or_raise() | RecordStream.close() | QueryBuilder.execute() | WriteSegmentBuilder.execute()

Performance considerations

Indexes consume memory on every node
Index updates add write latency
Queries without indexes scan entire set (slow)
Monitor index memory usage in production