# Secondary indexes

Secondary indexes enable efficient queries on bin values beyond the primary key.

## What is a secondary index?

By default, Aerospike retrieves records by their primary key (digest). A **secondary index** lets you query records by the value of a specific bin.

| Query type | Behavior |
| --- | --- |
| Single primary key lookup | key → record (O(1), always fast) |
| Multiple primary key lookups | keys → records (done in parallel across servers, usually fast) |
| Secondary index query | bin value → records (requires index) |
| Primary index query | (scans all records in parallel, O(N), can be slow) |

## Index types

| Type | Bin Data Type | Query Operations |
| --- | --- | --- |
| `IndexType.STRING` / `IndexTypeEnum.STRING` | String | Equality (`==`) |
| `IndexType.INTEGER` / `IndexTypeEnum.NUMERIC` | Integer | Equality and range comparisons (`<`, `<=`, `>`, `>=`) |

## When to use secondary indexes

✅ **Good use cases**:

-   Filtering by status, category, or type fields
-   Range queries on timestamps or numeric IDs
-   Geospatial queries (find nearby)

❌ **Avoid when**:

-   High cardinality (millions of unique values)
-   Zero or one result will be returned (eg alternate id for a record)
-   Frequently updated bins
-   Can use primary key lookup instead

## Creating an index

Note: Creating indexes in code in production environments is often an anti-pattern due to resource consumption. Use tools like `asadm` to manage indexes in higher environments instead.

-   [Java](#tab-panel-3026)
-   [Python](#tab-panel-3027)

```java
import com.aerospike.client.sdk.DataSet;

import com.aerospike.client.sdk.query.IndexCollectionType;

import com.aerospike.client.sdk.query.IndexType;

DataSet users = DataSet.of("test", "users");

// Create an integer index on the "age" bin

session.createIndex(users, "age_idx", "age", IndexType.INTEGER, IndexCollectionType.DEFAULT)

    .waitTillComplete();

// Create a string index on the "status" bin

session.createIndex(users, "status_idx", "status", IndexType.STRING, IndexCollectionType.DEFAULT)

    .waitTillComplete();
```

```python
from aerospike_sdk import DataSet

users = DataSet.of("test", "users")

# Create an integer index on the "age" bin

await (

    session.index(dataset=users)

    .on_bin("age")

    .named("age_idx")

    .numeric()

    .create()

)

# Create a string index on the "status" bin

await (

    session.index(dataset=users)

    .on_bin("status")

    .named("status_idx")

    .string()

    .create()

)
```

## Querying with indexes

Once an index exists, matching AEL predicates can use it:

-   [Java](#tab-panel-3028)
-   [Python](#tab-panel-3029)

```java
import com.aerospike.client.sdk.Record;

import com.aerospike.client.sdk.RecordResult;

import com.aerospike.client.sdk.RecordStream;

// This query uses the "age_idx" secondary index

RecordStream stream = session.query(users)

    .where("$.age > 21")

    .execute();

stream.forEach((RecordResult result) -> {

    if (result.isOk()) {

        Record row = result.recordOrThrow();

        if (row != null) {

            // Process row (for example, row.getString("name"))

        }

    }

});

stream.forEach(result -> {

        Record row = result.recordOrThrow();

        // Process row (for example, row.getString("name"))

        // The row will not be null (empty rows not returned by default)

});
```

```python
# This query uses the "age_idx" secondary index

stream = await session.query(users).where("$.age > 21").execute()

async for row in stream:

    record = row.record_or_raise()

    pass  # Process record.bins, etc.

stream.close()
```

## Complete example

-   [Java](#tab-panel-3030)
-   [Python](#tab-panel-3031)

```java
import com.aerospike.client.sdk.AerospikeException;

import com.aerospike.client.sdk.Cluster;

import com.aerospike.client.sdk.ClusterDefinition;

import com.aerospike.client.sdk.DataSet;

import com.aerospike.client.sdk.Record;

import com.aerospike.client.sdk.RecordResult;

import com.aerospike.client.sdk.RecordStream;

import com.aerospike.client.sdk.Session;

import com.aerospike.client.sdk.policy.Behavior;

import com.aerospike.client.sdk.query.IndexCollectionType;

import com.aerospike.client.sdk.query.IndexType;

public class SecondaryIndexExample {

    public static void main(String[] args) {

        try (Cluster cluster = new ClusterDefinition("localhost", 3000).connect()) {

            Session session = cluster.createSession(Behavior.DEFAULT);

            DataSet users = DataSet.of("test", "users");

            String k1 = "sidx-example-1";

            String k2 = "sidx-example-2";

            String k3 = "sidx-example-3";

            String ageIndex = "age_idx_demo";

            String statusIndex = "status_idx_demo";

            // Cleanup so the example is repeatable.

            session.delete(users.ids(k1, k2, k3)).execute().close();

            try {

                session.dropIndex(users, ageIndex).waitTillComplete();

            } catch (AerospikeException ignored) {

                // Index may not exist yet.

            }

            try {

                session.dropIndex(users, statusIndex).waitTillComplete();

            } catch (AerospikeException ignored) {

                // Index may not exist yet.

            }

            // Seed sample data.

            session.insert(users)

                .bins("name", "age", "status")

                .id(k1).values("Alice", 28, "sidx_active")

                .id(k2).values("Bob", 42, "sidx_active")

                .id(k3).values("Carol", 19, "sidx_inactive")

                .execute();

            // Create secondary indexes.

            session.createIndex(users, ageIndex, "age", IndexType.INTEGER, IndexCollectionType.DEFAULT)

                .waitTillComplete();

            session.createIndex(users, statusIndex, "status", IndexType.STRING, IndexCollectionType.DEFAULT)

                .waitTillComplete();

            // Query using age index.

            System.out.println("sidx_active users age >= 30:");

            RecordStream ageQuery = session.query(users)

                .where("$.status == 'sidx_active' and $.age >= 30")

                .readingOnlyBins("name", "age")

                .execute();

            ageQuery.forEach(result -> {

                Record user = result.recordOrThrow();

                System.out.println("  - " + user.getString("name") + " (" + user.getInt("age") + ")");

            });

            // Query using status index.

            System.out.println("\nsidx_active users:");

            RecordStream statusQuery = session.query(users)

                .where("$.status == 'sidx_active'")

                .readingOnlyBins("name", "status")

                .execute();

            statusQuery.forEach(result -> {

                Record user = result.recordOrThrow();

                System.out.println("  - " + user.getString("name"));

            });

        }

    }

}
```

```python
import asyncio

from aerospike_sdk import Behavior, Client, DataSet

async def main():

    async with Client("localhost:3000") as client:

        session = client.create_session(Behavior.DEFAULT)

        users = DataSet.of("test", "users")

        k1 = users.id("sidx-example-1")

        k2 = users.id("sidx-example-2")

        k3 = users.id("sidx-example-3")

        age_index = "age_idx_demo"

        status_index = "status_idx_demo"

        # Cleanup so the example is repeatable.

        stream = await session.delete(key=k1).execute()

        stream.close()

        stream = await session.delete(key=k2).execute()

        stream.close()

        stream = await session.delete(key=k3).execute()

        stream.close()

        # Seed sample data.

        await session.insert(key=k1).put({"name": "Alice", "age": 28, "status": "sidx_active"}).execute()

        await session.insert(key=k2).put({"name": "Bob", "age": 42, "status": "sidx_active"}).execute()

        await session.insert(key=k3).put({"name": "Carol", "age": 19, "status": "sidx_inactive"}).execute()

        # Create secondary indexes. If an index already exists, create() may raise.

        try:

            await session.index(dataset=users).on_bin("age").named(age_index).numeric().create()

        except Exception:

            pass

        try:

            await session.index(dataset=users).on_bin("status").named(status_index).string().create()

        except Exception:

            pass

        # Query using age index.

        print("sidx_active users age >= 30:")

        stream = await (

            session.query(users)

            .where("$.status == 'sidx_active' and $.age >= 30")

            .bins(["name", "age"])

            .execute()

        )

        async for row in stream:

            record = row.record_or_raise()

            print(f"  - {record.bins.get('name')} ({record.bins.get('age')})")

        stream.close()

        # Query using status index.

        print("\nsidx_active users:")

        stream = await (

            session.query(users)

            .where("$.status == 'sidx_active'")

            .bins(["name", "status"])

            .execute()

        )

        async for row in stream:

            record = row.record_or_raise()

            print(f"  - {record.bins.get('name')}")

        stream.close()

if __name__ == "__main__":

    asyncio.run(main())
```

## Performance considerations

-   Indexes consume memory on every node
-   Index updates add write latency
-   Queries without indexes scan entire set (slow)
-   Monitor index memory usage in production

## Next steps

-   [Query Records](https://aerospike.com/docs/develop/client/sdk/usage/query)
-   [Aerospike Expression Language (AEL)](https://aerospike.com/docs/develop/client/sdk/concepts/ael)
-   [Data Model](https://aerospike.com/docs/develop/client/sdk/concepts/data-model)