# Performance

This page covers performance considerations for [path expressions](https://aerospike.com/docs/develop/expressions/path/), including optimization strategies for common patterns like IN-list filtering.

## Optimizing IN-list filters on map keys

Aerospike 8.1.2 introduces `CTX.mapKeysIn` for selecting a subset of map keys directly within a path expression, equivalent to `WHERE key IN (k1, k2, ...)` in SQL. This context can be combined with `CTX.andFilter` to apply additional filters to the selected entries.

In the 8.1.1 preview, there was no native IN-list context, so developers had to work around this using expression primitives and loop variables. The example below shows three progressively more efficient ways to express the same IN-list query. Each produces the same result, but they differ in API clarity and performance as the map grows.

## Booking example

This example uses a hotel booking document where each record is a map keyed by room IDs (longs). Each room entry contains rates, metadata, and status flags. The document is stored in a single map bin named `"doc"`. The JSON rendering below shows the room IDs as strings only because JSON requires string keys.

### Data structure

```json
{
  "10001": {
    "rates": [
      { "Id": 1, "a": 2, "beta": 3.0, "c": true, "d": "data" },
      { "Id": 2, "a": 2, "beta": 0.0, "c": true, "d": "data blob" }
    ],
    "e": 4,
    "isDeleted": false,
    "time": 1795647328
  },
  "10002": {
    "rates": [
      { "Id": 1, "a": 2, "beta": 3.0, "c": true, "d": "data blob" }
    ],
    "e": 4,
    "isDeleted": true,
    "time": 1795647328
  },
  "10003": {
    "rates": [
      { "Id": 1, "a": 2, "beta": 3.0, "c": true, "d": "data blob" }
    ],
    "e": 4,
    "isDeleted": false,
    "time": 1764140184
  }
}
```

### Filtering dimensions

```txt
doc (bin, map keyed by long room IDs)
 └── 10001 / 10002 / 10003
      ├── rates (list of maps)             <-- rate-level filter
      │    ├── { Id:1, a:2, beta:3.0, c:true, d:"data" }
      │    └── { Id:2, a:2, beta:0.0, c:true, d:"data blob" }
      ├── e: 4
      ├── isDeleted: false                 <-- room-level filter
      └── time: 1795647328                 <-- room-level filter
```

-   **Top level (IN-list)**: select specific room IDs from the map keys (e.g., `roomId IN (10001, 10003)`)
-   **Room level**: filter on `isDeleted` (boolean) and `time` (epoch timestamp threshold)
-   **Rate level**: filter on `beta > 0.0`, specific rate `Id`, etc.

### Query goal

Select only rooms whose ID is in a given list, that are not deleted, have a recent timestamp, and whose rates have `beta > 0`.

The SQL-like equivalent of this filter expression is:

```sql
WHERE roomId IN (10001, 10003) AND time > 1780000000
  AND isDeleted = false AND beta > 0.0
```

Expected result: only room 10001 passes all filters (10002 is deleted, 10003 has an old timestamp), with only the first rate (`beta=3.0`); the second rate (`beta=0.0`) is excluded.
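To make the intended semantics concrete, the following sketch applies the same three filter levels client-side to the sample document. It is plain Java and does not call the Aerospike client; the class `BookingFilterModel` and its helpers are hypothetical, and only the fields the filters read are modeled. It also keeps a room whose rate list ends up empty, which is an assumption of this model rather than documented server behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Plain-Java model of the query semantics; not an Aerospike API. */
final class BookingFilterModel {

    /** Builds one rate entry with only the fields the filters use. */
    static Map<String, Object> rate(long id, double beta) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("Id", id);
        r.put("beta", beta);
        return r;
    }

    /** Builds one room entry with its room-level fields and rate list. */
    static Map<String, Object> room(boolean isDeleted, long time, List<Map<String, Object>> rates) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("rates", rates);
        m.put("isDeleted", isDeleted);
        m.put("time", time);
        return m;
    }

    /** The three-room sample document from the data structure section. */
    static Map<Long, Map<String, Object>> sampleDoc() {
        Map<Long, Map<String, Object>> doc = new LinkedHashMap<>();
        doc.put(10001L, room(false, 1795647328L, List.of(rate(1, 3.0), rate(2, 0.0))));
        doc.put(10002L, room(true, 1795647328L, List.of(rate(1, 3.0))));
        doc.put(10003L, room(false, 1764140184L, List.of(rate(1, 3.0))));
        return doc;
    }

    /** Keeps requested, non-deleted, recent rooms, and within them only rates with beta > 0. */
    @SuppressWarnings("unchecked")
    static Map<Long, Map<String, Object>> select(
            Map<Long, Map<String, Object>> doc, Set<Long> roomIds, long timeThreshold) {
        Map<Long, Map<String, Object>> out = new LinkedHashMap<>();
        for (Map.Entry<Long, Map<String, Object>> e : doc.entrySet()) {
            Map<String, Object> room = e.getValue();
            boolean keep = roomIds.contains(e.getKey())            // top level: IN-list
                && !((Boolean) room.get("isDeleted"))              // room level
                && ((Long) room.get("time")) > timeThreshold;      // room level
            if (!keep) continue;
            List<Map<String, Object>> rates = new ArrayList<>();
            for (Map<String, Object> r : (List<Map<String, Object>>) room.get("rates")) {
                if ((Double) r.get("beta") > 0.0) rates.add(r);    // rate level
            }
            Map<String, Object> copy = new LinkedHashMap<>(room);
            copy.put("rates", rates);
            out.put(e.getKey(), copy);
        }
        return out;
    }
}
```

Calling `BookingFilterModel.select(BookingFilterModel.sampleDoc(), Set.of(10001L, 10003L), 1780000000L)` leaves a single entry for room 10001 whose `rates` list holds one element.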

### Shared filter expressions

```java
List<Long> roomIds = Arrays.asList(10001L, 10003L);
long timeThreshold = 1780000000L;

// Room level: time > timeThreshold
Exp exp1 = Exp.gt(
    MapExp.getByKey(MapReturnType.VALUE, Exp.Type.INT,
        Exp.val("time"), Exp.mapLoopVar(LoopVarPart.VALUE)),
    Exp.val(timeThreshold));

// Room level: isDeleted == false
Exp exp2 = Exp.eq(
    MapExp.getByKey(MapReturnType.VALUE, Exp.Type.BOOL,
        Exp.val("isDeleted"), Exp.mapLoopVar(LoopVarPart.VALUE)),
    Exp.val(false));

// Rate level: beta > 0.0
Exp exp3 = Exp.gt(
    MapExp.getByKey(MapReturnType.VALUE, Exp.Type.FLOAT,
        Exp.val("beta"), Exp.mapLoopVar(LoopVarPart.VALUE)),
    Exp.val(0.0));
```

In the analysis below, **N** is the total number of entries in the map and **M** is the number of requested keys.

### Approach 1: Filter-as-you-go (8.1.1+)

Check membership inside `allChildrenWithFilter` using `ListExp.getByValue`. This traverses every room in the map and, for each one, linearly scans the `roomIds` list.

```java
Exp roomExp = Exp.gt(
    ListExp.getByValue(ListReturnType.COUNT,
        Exp.intLoopVar(LoopVarPart.MAP_KEY),
        Exp.val(roomIds)),
    Exp.val(0));

Operation op = CdtOperation.selectByPath("doc",
    SelectFlags.MATCHING_TREE | SelectFlags.NO_FAIL,
    CTX.allChildrenWithFilter(Exp.and(exp1, exp2, roomExp)),
    CTX.allChildren(),
    CTX.allChildrenWithFilter(exp3));

Record record = client.operate(null, key, op);
```

`allChildrenWithFilter(Exp.and(exp1, exp2, roomExp))`:

1.  Evaluates **every** room entry in the map — all N rooms.
2.  For each room, evaluates the full filter including `roomExp`, which calls `ListExp.getByValue(COUNT, currentMapKey, roomIdsList)` — a linear scan of the M-element `roomIds` list.
3.  All three filter expressions (`exp1`, `exp2`, `roomExp`) are evaluated for every room, even those not in the target list.

**Complexity: O(N × M)** — N rooms visited, each with an O(M) membership scan.
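The N × M cost can be made concrete with a small counting model. The sketch below is plain Java and does not call the Aerospike client; `FilterAsYouGoCost` and its methods are hypothetical names, and the inner loop assumes that a COUNT-style membership check inspects every element of the requested list.

```java
import java.util.ArrayList;
import java.util.List;

/** Counting model of a per-entry linear membership scan; not an Aerospike API. */
final class FilterAsYouGoCost {

    /** Returns the keys from..to (inclusive) as a list, standing in for the map's keys. */
    static List<Long> keys(long from, long to) {
        List<Long> out = new ArrayList<>();
        for (long k = from; k <= to; k++) out.add(k);
        return out;
    }

    /** Returns {comparisons, matchingKeys} when every map key is checked against the list. */
    static long[] scan(List<Long> mapKeys, List<Long> requested) {
        long comparisons = 0;
        long matches = 0;
        for (Long key : mapKeys) {              // N iterations
            long count = 0;
            for (Long wanted : requested) {     // M iterations each
                comparisons++;
                if (wanted.equals(key)) count++;
            }
            if (count > 0) matches++;           // mirrors the "count > 0" test in roomExp
        }
        return new long[] {comparisons, matches};
    }
}
```

For example, `scan(keys(1, 10_000), requested)` with 10 requested IDs performs 100,000 comparisons, regardless of how many keys actually match.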

### Approach 2: Pre-filter then traverse (8.1.1+, optimized)

The bottleneck in Approach 1 is that every room is visited regardless of whether it is in the target list. Approach 2 eliminates this by extracting only the matching rooms **before** the path expression runs: `MapExp.getByKeyList` performs an index-based lookup on the map, then `CdtExp.selectByPath` applies the remaining filters to the reduced result.

This requires switching from `CdtOperation.selectByPath` to the expression-based `CdtExp.selectByPath` wrapped in `ExpOperation.read`, because the operation-based API does not accept a pre-filtered map as input.

```java
Expression readExp = Exp.build(
    CdtExp.selectByPath(Exp.Type.MAP,
        SelectFlags.MATCHING_TREE | SelectFlags.NO_FAIL,
        MapExp.getByKeyList(MapReturnType.ORDERED_MAP,
            Exp.val(roomIds),
            Exp.mapBin("doc")),
        CTX.allChildrenWithFilter(Exp.and(exp1, exp2)),
        CTX.allChildren(),
        CTX.allChildrenWithFilter(exp3)));

Operation op = ExpOperation.read("doc", readExp, ExpReadFlags.DEFAULT);

Record record = client.operate(null, key, op);
```

1.  `MapExp.getByKeyList` uses the map’s index to look up each of the M requested keys — **O(M log N)** for an ordered map.
2.  The path expression then operates on the M matching rooms, not all N.
3.  The remaining filters (`exp1`, `exp2`) are evaluated on only M rooms, and the `roomExp` membership check is gone entirely — the pre-filter already guarantees only matching rooms are present.

**Complexity: O(M log N)** for the key lookup, plus O(M) for the downstream path traversal — dominated by the lookup step.

Compared to Approach 1:

-   **Index-based lookup vs brute-force scan:** `getByKeyList` pulls M keys directly via the index. Approach 1 visits every room and linearly scans the `roomIds` list for each one.
-   **Smaller working set:** Every subsequent CTX level (`allChildren`, `allChildrenWithFilter(exp3)`) operates on M rooms instead of N. For a map with 10,000 rooms and 10 requested IDs, this means roughly 1,000× less downstream work.
-   **Eliminated membership check:** Approach 1 evaluates three filter expressions per room (`exp1`, `exp2`, `roomExp`); Approach 2 evaluates only two (`exp1`, `exp2`).

When N is large (for example, thousands of rooms in the document) but M is small (for example, 10 requested room IDs), this difference is significant.

For small maps (tens of entries), Approach 1 is simpler and the performance difference is negligible. Approach 2 pays off when the map is large relative to the number of keys you are selecting.
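As a rough illustration of the O(M log N) lookup cost, the sketch below counts key comparisons while looking up M keys in a sorted map of N entries. It uses `java.util.TreeMap` purely as a stand-in for an ordered map index; it is not how the server stores maps, and `KeyListLookupCost` is a hypothetical name.

```java
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

/** Counting model of index-based key lookups; TreeMap is a stand-in, not the server's index. */
final class KeyListLookupCost {

    /** Returns {comparisons, hits} for looking up each requested key in a sorted map of keys 1..n. */
    static long[] lookup(int n, List<Long> requested) {
        AtomicLong comparisons = new AtomicLong();
        Comparator<Long> counting = (a, b) -> {
            comparisons.incrementAndGet();
            return Long.compare(a, b);
        };
        TreeMap<Long, Boolean> map = new TreeMap<>(counting);
        for (long k = 1; k <= n; k++) map.put(k, Boolean.TRUE);
        comparisons.set(0);                      // ignore the cost of building the map
        long hits = 0;
        for (Long key : requested) {             // M lookups, each O(log N)
            if (map.containsKey(key)) hits++;
        }
        return new long[] {comparisons.get(), hits};
    }
}
```

With N = 10,000 and M = 10, the comparison count stays on the order of M log₂ N (about 10 × 13), whereas the O(N × M) scan of Approach 1 corresponds to 100,000 membership comparisons for the same inputs.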

### Approach 3: Native key selection (8.1.2+)

Use `CTX.mapKeysIn` to select room IDs directly in the path context, with `CTX.andFilter` for additional filtering. The key lookup happens natively inside the path expression, giving the cleanest API and best performance. Requires server 8.1.2+.

```java
long[] roomIdArray = roomIds.stream().mapToLong(Long::longValue).toArray();

Operation op = CdtOperation.selectByPath("doc",
    SelectFlags.MATCHING_TREE | SelectFlags.NO_FAIL,
    CTX.mapKeysIn(roomIdArray),
    CTX.andFilter(Exp.and(exp1, exp2)),
    CTX.allChildren(),
    CTX.allChildrenWithFilter(exp3));

Record record = client.operate(null, key, op);
```

Approach 3 improves on Approach 2 in two ways:

-   **No expression pre-filter step:** Approach 2 must first call `MapExp.getByKeyList` inside an expression wrapper (`CdtExp.selectByPath` + `ExpOperation.read`) to extract the M matching keys before the path traversal can begin. The server builds, evaluates, and tears down that intermediate expression on every call. Approach 3 replaces all of this with `CTX.mapKeysIn`, which performs the same key selection natively inside the path context, eliminating the expression overhead entirely.
-   **Simpler API:** Approach 3 uses `CdtOperation.selectByPath` — the same straightforward operation-based API as Approach 1 — instead of the expression-based wrapping that Approach 2 requires. The code is shorter, easier to read, and follows the same pattern developers already know.

In Approach 3 there is no reason to use `CdtExp.selectByPath`: `CTX.mapKeysIn` already handles key selection inside the path context, so the expression wrapper would add complexity and CPU cost for no benefit.

`CTX.andFilter` applies additional filters at the same level as the preceding context. Note that `CTX.andFilter` cannot be chained after another `CTX.andFilter` and cannot follow a `CTX.allChildrenWithFilter`. For full usage constraints, see [key selection and combined filtering](https://aerospike.com/docs/develop/data-types/collections/context#key-selection-and-combined-filtering-812).
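The two placement rules above can be expressed as a small pre-flight check. This is a sketch only: the `Step` enum and `AndFilterPlacement` class are hypothetical labels for context steps, not Aerospike client types, and the requirement that `andFilter` has some preceding context is an assumption drawn from the description above. The linked page remains the authority on the full constraints.

```java
import java.util.List;

/** Hypothetical validator for andFilter placement; not part of the Aerospike client. */
final class AndFilterPlacement {

    /** Hypothetical labels for path context steps. */
    enum Step { MAP_KEYS_IN, AND_FILTER, ALL_CHILDREN, ALL_CHILDREN_WITH_FILTER }

    /** Returns true when every AND_FILTER step obeys the placement rules. */
    static boolean isValid(List<Step> path) {
        for (int i = 0; i < path.size(); i++) {
            if (path.get(i) != Step.AND_FILTER) continue;
            // Assumption: andFilter needs a preceding context to attach to.
            if (i == 0) return false;
            Step previous = path.get(i - 1);
            // Stated rules: not after another andFilter, not after allChildrenWithFilter.
            if (previous == Step.AND_FILTER || previous == Step.ALL_CHILDREN_WITH_FILTER) {
                return false;
            }
        }
        return true;
    }
}
```

The path used in Approach 3 (`MAP_KEYS_IN`, `AND_FILTER`, `ALL_CHILDREN`, `ALL_CHILDREN_WITH_FILTER`) passes this check, while two consecutive `AND_FILTER` steps do not.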

### Result

All three approaches return the same result. Only room 10001 passes all room-level filters (10002 is deleted, 10003 has an old timestamp). Within that room, only the first rate (`beta=3.0`) passes the rate-level filter; the second rate (`beta=0.0`) is excluded.

```json
{
  "10001": {
    "time": 1795647328,
    "isDeleted": false,
    "e": 4,
    "rates": [
      { "a": 2, "Id": 1, "c": true, "d": "data", "beta": 3.0 }
    ]
  }
}
```

## General performance characteristics

Path expressions execute server-side, which reduces network transfer by returning only matching elements. For deeply nested structures, this can significantly reduce latency compared to fetching entire records and filtering client-side.

Performance depends on:

-   **CDT size:** Larger Maps/Lists require more processing time.
-   **Nesting depth:** Deeper structures take longer to traverse.
-   **Filter complexity:** Complex boolean expressions add overhead.
-   **Result size:** Returning `MATCHING_TREE` transfers more data than `MAP_KEY`.

For performance-critical applications, benchmark with realistic data volumes.

### Batch inlining with large records

Path expression workloads typically involve nested CDT documents well above 1 KiB per record. When you batch these operations, the default batch policy `allowInline=true` serializes the entire sub-batch on a single service thread for [in-memory namespaces](https://aerospike.com/docs/database/learn/architecture/data-storage/data-model/#namespaces). For records larger than about 1 KiB, this is slower than non-inlined execution, which distributes the sub-batch across multiple service threads.

Set `allowInline` to `false` (or the language-equivalent batch policy flag) when batching path expression operations against an in-memory namespace whose records exceed about 1 KiB. For SSD-based namespaces, the default `allowInlineSSD=false` already disables inlining.

See [Inlining batches](https://aerospike.com/docs/develop/learn/batch#inlining-batches) for full details on batch inline policies and when inlining helps versus hurts.

## Limits

**Maximum nesting depth:** Path expressions support up to 15 levels of CDT nesting, matching the Aerospike CDT context depth limit.

**Element count:** There is no hard limit on the number of elements, but very large CDTs (millions of elements) may impact latency. If you are working with extremely large collections, consider partitioning data across multiple records or using secondary indexes to narrow the scope before applying path expressions.
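One way to partition a very large room map across multiple records is to bucket room IDs and group each request by bucket, so that every path expression runs against a smaller map. The sketch below is plain Java and assumes a hypothetical record-key scheme of `<hotelId>:<bucket>`; the names `RoomPartitioner`, `hotelId`, and `bucketCount` are illustrative and do not come from the Aerospike API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical partitioning helper; the key scheme is an assumption for illustration. */
final class RoomPartitioner {

    /** Derives the record key that would hold a room: "<hotelId>:<roomId mod bucketCount>". */
    static String recordKey(String hotelId, long roomId, int bucketCount) {
        return hotelId + ":" + Math.floorMod(roomId, (long) bucketCount);
    }

    /** Groups requested room IDs by the record that would hold them. */
    static Map<String, List<Long>> groupByRecord(String hotelId, List<Long> roomIds, int bucketCount) {
        Map<String, List<Long>> byRecord = new TreeMap<>();
        for (Long id : roomIds) {
            byRecord.computeIfAbsent(recordKey(hotelId, id, bucketCount), k -> new ArrayList<>())
                    .add(id);
        }
        return byRecord;
    }
}
```

Each group can then be sent as its own IN-list request against the corresponding record, at the cost of one operation per bucket touched.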