Performance

This page covers performance considerations for path expressions, including optimization strategies for common patterns like IN-list filtering.

Optimizing IN-list filters on map keys

Aerospike 8.1.2 introduces CTX.mapKeysIn for selecting a subset of map keys directly within a path expression, equivalent to WHERE key IN (k1, k2, ...) in SQL. This context can be combined with CTX.andFilter to apply additional filters to the selected entries.

In the 8.1.1 preview, there was no native IN-list context, so developers had to work around this using expression primitives and loop variables. The example below shows three progressively more efficient ways to express the same IN-list query. Each produces the same result, but they differ in API clarity and performance as the map grows.

Booking example

This example uses a hotel booking document where each record is a map keyed by room IDs (longs). The JSON rendering below shows the keys as quoted strings only because JSON object keys must be strings. Each room entry contains rates, metadata, and status flags. The document is stored in a single map bin named "doc".

Data structure

{
  "10001": {
    "rates": [
      { "Id": 1, "a": 2, "beta": 3.0, "c": true, "d": "data" },
      { "Id": 2, "a": 2, "beta": 0.0, "c": true, "d": "data blob" }
    ],
    "e": 4,
    "isDeleted": false,
    "time": 1795647328
  },
  "10002": {
    "rates": [
      { "Id": 1, "a": 2, "beta": 3.0, "c": true, "d": "data blob" }
    ],
    "e": 4,
    "isDeleted": true,
    "time": 1795647328
  },
  "10003": {
    "rates": [
      { "Id": 1, "a": 2, "beta": 3.0, "c": true, "d": "data blob" }
    ],
    "e": 4,
    "isDeleted": false,
    "time": 1764140184
  }
}

Filtering dimensions

doc (bin, map keyed by long room IDs)
└── 10001 / 10002 / 10003
    ├── rates (list of maps)       <-- rate-level filter
    │   ├── { Id:1, a:2, beta:3.0, c:true, d:"data" }
    │   └── { Id:2, a:2, beta:0.0, c:true, d:"data blob" }
    ├── e: 4
    ├── isDeleted: false           <-- room-level filter
    └── time: 1795647328           <-- room-level filter

  • Top level (IN-list): select specific room IDs from the map keys (e.g., roomId IN (10001, 10003))
  • Room level: filter on isDeleted (boolean) and time (epoch timestamp threshold)
  • Rate level: filter on beta > 0.0, specific rate Id, etc.

Query goal

Select only rooms whose ID is in a given list, that are not deleted, and whose time is later than a threshold (1780000000 in this example). Within each selected room, return only the rates whose beta is greater than 0.

The SQL-like equivalent of this filter expression would be:

WHERE roomId IN (10001, 10003) AND time > 1780000000
AND isDeleted = false AND beta > 0.0

Expected result: only room 10001 passes all room-level filters (10002 is deleted and 10003 has an old timestamp). It is returned with only its first rate (beta=3.0); the second rate (beta=0.0) is excluded.
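The expected result can be sanity-checked with a plain-Java model of the query goal. This is only a sketch: it does not use the Aerospike client, the class and method names (BookingFilterModel, select, and so on) are hypothetical, and it says nothing about how the server treats a room that passes the room-level filters but has no matching rates (the model simply keeps such a room with an empty rates list).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Plain-Java model of the query goal; no Aerospike client involved.
// All class and method names are hypothetical.
class BookingFilterModel {

    static Map<String, Object> rate(long id, double beta, String d) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("Id", id);
        r.put("a", 2L);
        r.put("beta", beta);
        r.put("c", true);
        r.put("d", d);
        return r;
    }

    static Map<String, Object> room(boolean isDeleted, long time,
                                    List<Map<String, Object>> rates) {
        Map<String, Object> room = new LinkedHashMap<>();
        room.put("rates", rates);
        room.put("e", 4L);
        room.put("isDeleted", isDeleted);
        room.put("time", time);
        return room;
    }

    // The three rooms from the "Data structure" section.
    static Map<Long, Map<String, Object>> sampleDoc() {
        Map<Long, Map<String, Object>> doc = new TreeMap<>();
        doc.put(10001L, room(false, 1795647328L,
                List.of(rate(1, 3.0, "data"), rate(2, 0.0, "data blob"))));
        doc.put(10002L, room(true, 1795647328L, List.of(rate(1, 3.0, "data blob"))));
        doc.put(10003L, room(false, 1764140184L, List.of(rate(1, 3.0, "data blob"))));
        return doc;
    }

    @SuppressWarnings("unchecked")
    static Map<Long, Map<String, Object>> select(Map<Long, Map<String, Object>> doc,
                                                 Set<Long> roomIds, long timeThreshold) {
        Map<Long, Map<String, Object>> result = new TreeMap<>();
        for (Map.Entry<Long, Map<String, Object>> entry : doc.entrySet()) {
            Map<String, Object> room = entry.getValue();
            long time = (Long) room.get("time");
            boolean deleted = (Boolean) room.get("isDeleted");
            // Top-level IN-list plus the two room-level filters.
            if (!roomIds.contains(entry.getKey()) || time <= timeThreshold || deleted) {
                continue;
            }
            // Rate-level filter: keep only rates with beta > 0.0.
            List<Map<String, Object>> keptRates = new ArrayList<>();
            for (Map<String, Object> rate : (List<Map<String, Object>>) room.get("rates")) {
                double beta = (Double) rate.get("beta");
                if (beta > 0.0) {
                    keptRates.add(rate);
                }
            }
            Map<String, Object> copy = new LinkedHashMap<>(room);
            copy.put("rates", keptRates);
            result.put(entry.getKey(), copy);
        }
        return result;
    }
}
```

Calling BookingFilterModel.select(BookingFilterModel.sampleDoc(), Set.of(10001L, 10003L), 1780000000L) yields a map containing only room 10001 with a single rate.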

Shared filter expressions

// JDK imports for this snippet; Aerospike client imports are omitted.
import java.util.Arrays;
import java.util.List;

List<Long> roomIds = Arrays.asList(10001L, 10003L);
long timeThreshold = 1780000000L;

// exp1 (room level): time > timeThreshold
Exp exp1 = Exp.gt(
    MapExp.getByKey(MapReturnType.VALUE, Exp.Type.INT,
        Exp.val("time"), Exp.mapLoopVar(LoopVarPart.VALUE)),
    Exp.val(timeThreshold));

// exp2 (room level): isDeleted == false
Exp exp2 = Exp.eq(
    MapExp.getByKey(MapReturnType.VALUE, Exp.Type.BOOL,
        Exp.val("isDeleted"), Exp.mapLoopVar(LoopVarPart.VALUE)),
    Exp.val(false));

// exp3 (rate level): beta > 0.0
Exp exp3 = Exp.gt(
    MapExp.getByKey(MapReturnType.VALUE, Exp.Type.FLOAT,
        Exp.val("beta"), Exp.mapLoopVar(LoopVarPart.VALUE)),
    Exp.val(0.0));

In the analysis below, N is the total number of entries in the map and M is the number of requested keys.

Approach 1: Filter-as-you-go (8.1.1+)

Check membership inside allChildrenWithFilter using ListExp.getByValue. This traverses every room in the map and, for each one, linearly scans the roomIds list.

// roomExp: the current map key occurs at least once in roomIds.
Exp roomExp = Exp.gt(
    ListExp.getByValue(ListReturnType.COUNT,
        Exp.intLoopVar(LoopVarPart.MAP_KEY),
        Exp.val(roomIds)),
    Exp.val(0));

Operation op = CdtOperation.selectByPath("doc",
    SelectFlags.MATCHING_TREE | SelectFlags.NO_FAIL,
    CTX.allChildrenWithFilter(Exp.and(exp1, exp2, roomExp)),  // rooms: filters + membership
    CTX.allChildren(),                                        // fields of each room
    CTX.allChildrenWithFilter(exp3));                         // rates with beta > 0.0
Record record = client.operate(null, key, op);

allChildrenWithFilter(Exp.and(exp1, exp2, roomExp)):

  1. Evaluates every room entry in the map — all N rooms.
  2. For each room, evaluates the full filter including roomExp, which calls ListExp.getByValue(COUNT, currentMapKey, roomIdsList) — a linear scan of the M-element roomIds list.
  3. All three filter expressions (exp1, exp2, roomExp) are evaluated for every room, even those not in the target list.

Complexity: O(N × M) — N rooms visited, each with an O(M) membership scan.
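The N × M figure can be reproduced with a small plain-Java simulation of the same access pattern. This is a sketch under stated assumptions: it models the membership test as a full pass over the requested list for every map key (a COUNT needs to see every element), it does not call the Aerospike client, and the names FilterAsYouGoCost, scan, and syntheticDoc are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simulates the Approach 1 access pattern on in-memory collections.
// Hypothetical names; this is a counting model, not a benchmark.
class FilterAsYouGoCost {

    // Returns {comparisons, matchingRooms}.
    static long[] scan(Map<Long, ?> doc, List<Long> requested) {
        long comparisons = 0;
        long matchingRooms = 0;
        for (Long roomId : doc.keySet()) {        // N iterations
            int count = 0;
            for (Long wanted : requested) {       // M comparisons per room
                comparisons++;
                if (wanted.equals(roomId)) {
                    count++;
                }
            }
            if (count > 0) {                      // mirrors roomExp: COUNT > 0
                matchingRooms++;
            }
        }
        return new long[] { comparisons, matchingRooms };
    }

    // Builds a map with n rooms keyed 10001, 10002, ...
    static Map<Long, String> syntheticDoc(int n) {
        Map<Long, String> doc = new TreeMap<>();
        for (long i = 0; i < n; i++) {
            doc.put(10001L + i, "room");
        }
        return doc;
    }
}
```

With 1,000 synthetic rooms and 3 requested IDs, scan performs 3,000 comparisons regardless of how many of the requested IDs actually exist in the map.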

Approach 2: Pre-filter then traverse (8.1.1+, optimized)

The bottleneck in Approach 1 is that every room is visited regardless of whether it is in the target list. Approach 2 eliminates this by extracting only the matching rooms before the path expression runs: MapExp.getByKeyList performs an index-based lookup on the map, then CdtExp.selectByPath applies the remaining filters to the reduced result.

This requires switching from CdtOperation.selectByPath to the expression-based CdtExp.selectByPath wrapped in ExpOperation.read, because the operation-based API does not accept a pre-filtered map as input.

Expression readExp = Exp.build(
    CdtExp.selectByPath(Exp.Type.MAP,
        SelectFlags.MATCHING_TREE | SelectFlags.NO_FAIL,
        // Pre-filter: extract only the requested rooms from the bin.
        MapExp.getByKeyList(MapReturnType.ORDERED_MAP,
            Exp.val(roomIds),
            Exp.mapBin("doc")),
        CTX.allChildrenWithFilter(Exp.and(exp1, exp2)),  // room-level filters only
        CTX.allChildren(),                               // fields of each room
        CTX.allChildrenWithFilter(exp3)));               // rates with beta > 0.0
Operation op = ExpOperation.read("doc", readExp, ExpReadFlags.DEFAULT);
Record record = client.operate(null, key, op);

MapExp.getByKeyList followed by CdtExp.selectByPath:

  1. MapExp.getByKeyList uses the map’s index to look up each of the M requested keys: O(M log N) for an ordered map.
  2. The path expression then operates on the M matching rooms, not all N.
  3. The remaining filters (exp1, exp2) are evaluated on only M rooms, and the roomExp membership check is gone entirely, because the pre-filter already guarantees that only matching rooms are present.

Complexity: O(M log N) for the key lookup, plus O(M) for the downstream path traversal — dominated by the lookup step.
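The pre-filter step has a close analogue in plain Java: looking up each requested key in a sorted map. The sketch below uses java.util.TreeMap as a stand-in for an ordered map; the server's internal map representation is not described here and may differ, and the names PreFilterModel and selectByKeyList are hypothetical.

```java
import java.util.List;
import java.util.TreeMap;

// Plain-Java analogue of the pre-filter step; hypothetical names.
class PreFilterModel {

    // One O(log N) lookup per requested key; the result never holds
    // more than M entries, which is what the later stages iterate over.
    static <V> TreeMap<Long, V> selectByKeyList(TreeMap<Long, V> doc, List<Long> requested) {
        TreeMap<Long, V> selected = new TreeMap<>();
        for (Long roomId : requested) {
            V room = doc.get(roomId);
            if (room != null) {          // requested IDs that are absent are skipped
                selected.put(roomId, room);
            }
        }
        return selected;
    }
}
```

Any room-level filtering applied afterwards touches only the entries of the returned map, so its cost no longer depends on N.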

Compared to Approach 1:

  • Index-based lookup vs brute-force scan: getByKeyList pulls M keys directly via the index. Approach 1 visits every room and linearly scans the roomIds list for each one.
  • Smaller working set: Every subsequent CTX level (allChildren, allChildrenWithFilter(exp3)) operates on M rooms instead of N. For a map with 10,000 rooms and 10 requested IDs, this means roughly 1,000× less downstream work.
  • Eliminated membership check: Approach 1 evaluates three filter expressions per room (exp1, exp2, roomExp); Approach 2 evaluates only two (exp1, exp2).

When N is large (for example, thousands of rooms in the document) but M is small (for example, 10 requested room IDs), this difference is significant.

For small maps (tens of entries), Approach 1 is simpler and the performance difference is negligible. Approach 2 pays off when the map is large relative to the number of keys you are selecting.
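To get a feel for where the crossover lies, the two complexity formulas can be turned into a back-of-envelope cost model. The sketch below counts abstract "steps" only (N × M for Approach 1, M × ceil(log2 N) + M for Approach 2); the constants are assumptions, real latency depends on many other factors, and the class InListCostModel is hypothetical.

```java
// Back-of-envelope step counts for the two approaches; not a benchmark.
class InListCostModel {

    // Smallest k such that 2^k >= n (0 for n <= 1).
    static long ceilLog2(long n) {
        return n <= 1 ? 0 : 64 - Long.numberOfLeadingZeros(n - 1);
    }

    // Approach 1: every room is visited and the list is scanned each time.
    static long filterAsYouGo(long n, long m) {
        return n * m;
    }

    // Approach 2: M indexed lookups, then M rooms traversed downstream.
    static long preFilter(long n, long m) {
        return m * ceilLog2(n) + m;
    }

    public static void main(String[] args) {
        long[][] cases = { { 30, 10 }, { 1_000, 10 }, { 10_000, 10 } };
        for (long[] c : cases) {
            System.out.printf("N=%d M=%d approach1=%d approach2=%d%n",
                    c[0], c[1], filterAsYouGo(c[0], c[1]), preFilter(c[0], c[1]));
        }
    }
}
```

For N = 10,000 and M = 10 the model gives 100,000 steps for Approach 1 against 150 for Approach 2, while for a map of a few dozen entries both numbers are small enough that the simpler code is a reasonable choice.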

Approach 3: Native key selection (8.1.2+)

Use CTX.mapKeysIn to select room IDs directly in the path context, with CTX.andFilter for additional filtering. The key lookup happens natively inside the path expression, giving the cleanest API and best performance. Requires server 8.1.2+.

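// CTX.mapKeysIn takes primitive longs here, so unbox the list first.
long[] roomIdArray = roomIds.stream().mapToLong(Long::longValue).toArray();

Operation op = CdtOperation.selectByPath("doc",
    SelectFlags.MATCHING_TREE | SelectFlags.NO_FAIL,
    CTX.mapKeysIn(roomIdArray),           // select the requested rooms by key
    CTX.andFilter(Exp.and(exp1, exp2)),   // room-level filters at the same level
    CTX.allChildren(),                    // fields of each room
    CTX.allChildrenWithFilter(exp3));     // rates with beta > 0.0
Record record = client.operate(null, key, op);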

Approach 3 improves on Approach 2 in two ways:

  • No expression pre-filter step: Approach 2 must first call MapExp.getByKeyList inside an expression wrapper (CdtExp.selectByPath + ExpOperation.read) to extract the M matching keys before the path traversal can begin. The server builds, evaluates, and tears down that intermediate expression on every call. Approach 3 replaces all of this with CTX.mapKeysIn, which performs the same key selection natively inside the path context, eliminating the expression overhead entirely.
  • Simpler API: Approach 3 uses CdtOperation.selectByPath — the same straightforward operation-based API as Approach 1 — instead of the expression-based wrapping that Approach 2 requires. The code is shorter, easier to read, and follows the same pattern developers already know.

In Approach 3 there is no reason to use CdtExp.selectByPath: CTX.mapKeysIn already handles key selection inside the path context, so the expression wrapper would add complexity and CPU cost for no benefit.

CTX.andFilter applies additional filters at the same level as the preceding context. Note that CTX.andFilter cannot be chained after another CTX.andFilter and cannot follow a CTX.allChildrenWithFilter. For full usage constraints, see key selection and combined filtering.
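The two ordering rules stated above are easy to check mechanically before a request is sent. The following is a minimal sketch, not part of the Aerospike client: the Kind enum and ContextChainCheck class are hypothetical, and only the two rules named in this section are encoded, so the linked documentation remains the authority for the full set of constraints.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical pre-flight check for the two CTX.andFilter ordering rules
// described in the text. Not an Aerospike API.
class ContextChainCheck {

    enum Kind { MAP_KEYS_IN, AND_FILTER, ALL_CHILDREN, ALL_CHILDREN_WITH_FILTER }

    // Returns a description of the first violation, or empty if none is found.
    static Optional<String> firstViolation(List<Kind> chain) {
        for (int i = 1; i < chain.size(); i++) {
            if (chain.get(i) != Kind.AND_FILTER) {
                continue;
            }
            Kind previous = chain.get(i - 1);
            if (previous == Kind.AND_FILTER) {
                return Optional.of("andFilter at position " + i + " follows another andFilter");
            }
            if (previous == Kind.ALL_CHILDREN_WITH_FILTER) {
                return Optional.of("andFilter at position " + i + " follows allChildrenWithFilter");
            }
        }
        return Optional.empty();
    }
}
```

The Approach 3 chain (MAP_KEYS_IN, AND_FILTER, ALL_CHILDREN, ALL_CHILDREN_WITH_FILTER) passes this check, whereas a chain with two consecutive AND_FILTER entries does not.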

Result

All three approaches return the same result. Only room 10001 passes all room-level filters (10002 is deleted, 10003 has an old timestamp). Within that room, only the first rate (beta=3.0) passes the rate-level filter; the second rate (beta=0.0) is excluded.

{
  "10001": {
    "time": 1795647328,
    "isDeleted": false,
    "e": 4,
    "rates": [
      { "a": 2, "Id": 1, "c": true, "d": "data", "beta": 3.0 }
    ]
  }
}

General performance characteristics

Path expressions execute server-side, which reduces network transfer by returning only matching elements. For deeply nested structures, this can significantly reduce latency compared to fetching entire records and filtering client-side.

Performance depends on:

  • CDT size: Larger Maps/Lists require more processing time.
  • Nesting depth: Deeper structures take longer to traverse.
  • Filter complexity: Complex boolean expressions add overhead.
  • Result size: Returning MATCHING_TREE transfers more data than MAP_KEY.

For performance-critical applications, benchmark with realistic data volumes.
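A first-pass measurement can be as simple as timing repeated calls and looking at percentiles rather than the mean. The helper below is a minimal sketch using only the JDK; LatencyProbe and its methods are hypothetical names, the nearest-rank percentile is one of several common definitions, and a dedicated harness such as JMH is a better fit for anything beyond a rough comparison.

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Minimal latency probe; hypothetical helper, JDK only.
class LatencyProbe {

    // Runs the call `warmup` times unmeasured, then `iterations` times measured.
    // Returns the measured durations in nanoseconds, sorted ascending.
    static long[] measureNanos(Supplier<?> call, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            call.get();
        }
        long[] samples = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            call.get();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples;
    }

    // Nearest-rank percentile; `sorted` must be non-empty and ascending.
    static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }
}
```

To compare the three approaches, the supplier would wrap the corresponding client.operate(null, key, op) call, run against a record whose map size and nesting resemble production data.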

Batch inlining with large records

Path expression workloads typically involve nested CDT documents well above 1KiB per record. When you batch these operations, the default batch policy allowInline=true serializes the entire sub-batch on a single service thread for in-memory namespaces. For records larger than about 1KiB this is slower than non-inlined execution, which distributes the sub-batch across multiple service threads.

Set allowInline to false (or the language-equivalent batch policy flag) when batching path expression operations against an in-memory namespace whose records exceed about 1KiB. For SSD-based namespaces the default allowInlineSSD=false already disables inlining.
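The rule of thumb for in-memory namespaces can be captured in a tiny helper that an application consults when it builds its batch policy. This is a sketch only: BatchInlineAdvice is a hypothetical class, and the 1024-byte cut-off is a literal reading of "about 1KiB", so treat it as a starting point to tune with measurements rather than a fixed boundary.

```java
// Hypothetical helper encoding the guidance above for in-memory namespaces.
class BatchInlineAdvice {

    // "About 1KiB" taken literally; an approximate threshold, not a hard limit.
    static final long INLINE_THRESHOLD_BYTES = 1024;

    // Suggested value for the allowInline batch policy setting, given the
    // typical size of the records targeted by the batch.
    static boolean recommendedAllowInline(long typicalRecordBytes) {
        return typicalRecordBytes <= INLINE_THRESHOLD_BYTES;
    }
}
```

The returned value would be assigned to the batch policy's allowInline setting before the batch is submitted; for a typical nested booking document of several KiB the helper returns false.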

See Inlining batches for full details on batch inline policies and when inlining helps versus hurts.

Limits

Maximum nesting depth: Path expressions support up to 15 levels of CDT nesting, matching the Aerospike CDT context depth limit.
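When documents are assembled client-side, their nesting depth can be checked before they are written. The sketch below walks nested java.util.Map and java.util.List values; NestingDepth is a hypothetical helper, and the counting convention (a scalar is depth 0, each enclosing map or list adds one level) is an assumption that may not match exactly how the server counts context levels.

```java
import java.util.List;
import java.util.Map;

// Hypothetical client-side depth check for nested map/list documents.
class NestingDepth {

    static final int MAX_DEPTH = 15;  // limit stated in the text

    // Scalars have depth 0; a map or list is one level deeper than its
    // deepest child (an empty container has depth 1).
    static int depth(Object value) {
        Iterable<?> children;
        if (value instanceof Map) {
            children = ((Map<?, ?>) value).values();
        } else if (value instanceof List) {
            children = (List<?>) value;
        } else {
            return 0;
        }
        int deepest = 0;
        for (Object child : children) {
            deepest = Math.max(deepest, depth(child));
        }
        return 1 + deepest;
    }

    static boolean withinLimit(Object document) {
        return depth(document) <= MAX_DEPTH;
    }
}
```

Under this convention the booking document has depth 4 (room map, room entry, rates list, rate map), well inside the limit.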

Element count: There is no hard limit on the number of elements, but very large CDTs (millions of elements) may impact latency. If you are working with extremely large collections, consider partitioning data across multiple records or using secondary indexes to narrow the scope before applying path expressions.
