Performance
This page covers performance considerations for path expressions, including optimization strategies for common patterns like IN-list filtering.
Optimizing IN-list filters on map keys
Aerospike 8.1.2 introduces CTX.mapKeysIn for selecting a subset of map keys directly
within a path expression, equivalent to WHERE key IN (k1, k2, ...) in SQL. This context
can be combined with CTX.andFilter to apply additional filters to the selected entries.
In the 8.1.1 preview, there was no native IN-list context, so developers had to work around this using expression primitives and loop variables. The example below shows three progressively more efficient ways to express the same IN-list query. Each produces the same result, but they differ in API clarity and performance as the map grows.
Booking example
This example uses a hotel booking document where each record is a map
keyed by room IDs (longs). Each room entry contains rates, metadata,
and status flags. The document is stored in a single map bin named "doc".
Data structure
    {
      "10001": {
        "rates": [
          { "Id": 1, "a": 2, "beta": 3.0, "c": true, "d": "data" },
          { "Id": 2, "a": 2, "beta": 0.0, "c": true, "d": "data blob" }
        ],
        "e": 4,
        "isDeleted": false,
        "time": 1795647328
      },
      "10002": {
        "rates": [
          { "Id": 1, "a": 2, "beta": 3.0, "c": true, "d": "data blob" }
        ],
        "e": 4,
        "isDeleted": true,
        "time": 1795647328
      },
      "10003": {
        "rates": [
          { "Id": 1, "a": 2, "beta": 3.0, "c": true, "d": "data blob" }
        ],
        "e": 4,
        "isDeleted": false,
        "time": 1764140184
      }
    }

Filtering dimensions

    doc (bin, map keyed by long room IDs)
    └── 10001 / 10002 / 10003
        ├── rates (list of maps)      <-- rate-level filter
        │   ├── { Id:1, a:2, beta:3.0, c:true, d:"data" }
        │   └── { Id:2, a:2, beta:0.0, c:true, d:"data blob" }
        ├── e: 4
        ├── isDeleted: false          <-- room-level filter
        └── time: 1795647328          <-- room-level filter

- Top level (IN-list): select specific room IDs from the map keys (e.g., roomId IN (10001, 10003))
- Room level: filter on isDeleted (boolean) and time (epoch timestamp threshold)
- Rate level: filter on beta > 0.0, a specific rate Id, etc.
Query goal
Select only rooms whose ID is in a given list, that are not deleted,
have a recent timestamp, and whose rates have beta > 0.
The SQL-like equivalent of this filter expression would be:

    WHERE roomId IN (10001, 10003) AND time > 1780000000 AND isDeleted IS false AND beta > 0.0

Expected result: only room 10001 passes all filters (10002 is deleted, 10003 has an old timestamp),
with only the first rate (beta=3.0); the second rate (beta=0.0) is excluded.
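To make the intended semantics concrete before looking at the server-side API, here is a minimal client-side sketch of the same filter over plain Java collections. It is a reference model only, not Aerospike API code: the class `BookingFilterModel` and its helpers are hypothetical names, the sample document keeps only the fields the filters read, and keeping a room whose rates list ends up empty is an assumption of this sketch rather than documented server behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical client-side reference model of the query goal. */
final class BookingFilterModel {

    /** Builds a rate entry; only "Id" and "beta" are kept for this sketch. */
    static Map<String, Object> rate(long id, double beta) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("Id", id);
        r.put("beta", beta);
        return r;
    }

    /** Builds a room entry with the fields the filters read. */
    static Map<String, Object> room(boolean isDeleted, long time,
                                    List<Map<String, Object>> rates) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("rates", rates);
        m.put("isDeleted", isDeleted);
        m.put("time", time);
        return m;
    }

    /** Reduced copy of the booking document from the example. */
    static Map<Long, Map<String, Object>> sampleDoc() {
        Map<Long, Map<String, Object>> doc = new LinkedHashMap<>();
        doc.put(10001L, room(false, 1795647328L, List.of(rate(1, 3.0), rate(2, 0.0))));
        doc.put(10002L, room(true, 1795647328L, List.of(rate(1, 3.0))));
        doc.put(10003L, room(false, 1764140184L, List.of(rate(1, 3.0))));
        return doc;
    }

    /** Applies the IN-list, room-level, and rate-level filters in that order. */
    @SuppressWarnings("unchecked")
    static Map<Long, Map<String, Object>> select(Map<Long, Map<String, Object>> doc,
                                                 Set<Long> roomIds, long timeThreshold) {
        Map<Long, Map<String, Object>> out = new LinkedHashMap<>();
        for (Map.Entry<Long, Map<String, Object>> e : doc.entrySet()) {
            Map<String, Object> room = e.getValue();
            boolean keep = roomIds.contains(e.getKey())          // roomId IN (...)
                    && (Long) room.get("time") > timeThreshold   // time > threshold
                    && !((Boolean) room.get("isDeleted"));       // isDeleted IS false
            if (!keep) {
                continue;
            }
            List<Map<String, Object>> keptRates = new ArrayList<>();
            for (Map<String, Object> r : (List<Map<String, Object>>) room.get("rates")) {
                if ((Double) r.get("beta") > 0.0) {              // beta > 0.0
                    keptRates.add(r);
                }
            }
            Map<String, Object> copy = new LinkedHashMap<>(room);
            copy.put("rates", keptRates);
            out.put(e.getKey(), copy);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(select(sampleDoc(), Set.of(10001L, 10003L), 1780000000L));
    }
}
```

A model like this can serve as a test oracle: the server-side approaches should select the same rooms and rates as it does for the same input.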
Shared filter expressions
    List<Long> roomIds = Arrays.asList(10001L, 10003L);
    long timeThreshold = 1780000000L;

    // Room level: time > timeThreshold
    Exp exp1 = Exp.gt(
        MapExp.getByKey(MapReturnType.VALUE, Exp.Type.INT, Exp.val("time"), Exp.mapLoopVar(LoopVarPart.VALUE)),
        Exp.val(timeThreshold));

    // Room level: isDeleted == false
    Exp exp2 = Exp.eq(
        MapExp.getByKey(MapReturnType.VALUE, Exp.Type.BOOL, Exp.val("isDeleted"), Exp.mapLoopVar(LoopVarPart.VALUE)),
        Exp.val(false));

    // Rate level: beta > 0.0
    Exp exp3 = Exp.gt(
        MapExp.getByKey(MapReturnType.VALUE, Exp.Type.FLOAT, Exp.val("beta"), Exp.mapLoopVar(LoopVarPart.VALUE)),
        Exp.val(0.0));

In the analysis below, N is the total number of entries in the map and M is the number of requested keys.
Approach 1: Filter-as-you-go (8.1.1+)
Check membership inside allChildrenWithFilter using ListExp.getByValue.
This traverses every room in the map and, for each one, linearly scans the roomIds list.
    Exp roomExp = Exp.gt(
        ListExp.getByValue(ListReturnType.COUNT, Exp.intLoopVar(LoopVarPart.MAP_KEY), Exp.val(roomIds)),
        Exp.val(0));

    Operation op = CdtOperation.selectByPath("doc", SelectFlags.MATCHING_TREE | SelectFlags.NO_FAIL,
        CTX.allChildrenWithFilter(Exp.and(exp1, exp2, roomExp)),
        CTX.allChildren(),
        CTX.allChildrenWithFilter(exp3));

    Record record = client.operate(null, key, op);

allChildrenWithFilter(Exp.and(exp1, exp2, roomExp)):
- Evaluates every room entry in the map: all N rooms.
- For each room, evaluates the full filter including roomExp, which calls ListExp.getByValue(COUNT, currentMapKey, roomIdsList), a linear scan of the M-element roomIds list.
- All three filter expressions (exp1, exp2, roomExp) are evaluated for every room, even those not in the target list.
Complexity: O(N × M) — N rooms visited, each with an O(M) membership scan.
Approach 2: Pre-filter then traverse (8.1.1+, optimized)
The bottleneck in Approach 1 is that every room is visited regardless of whether it is in
the target list. Approach 2 eliminates this by extracting only the matching rooms before
the path expression runs: MapExp.getByKeyList performs an index-based lookup on the map,
then CdtExp.selectByPath applies the remaining filters to the reduced result.
This requires switching from CdtOperation.selectByPath to the expression-based
CdtExp.selectByPath wrapped in ExpOperation.read, because the operation-based API
does not accept a pre-filtered map as input.
    Expression readExp = Exp.build(
        CdtExp.selectByPath(Exp.Type.MAP, SelectFlags.MATCHING_TREE | SelectFlags.NO_FAIL,
            MapExp.getByKeyList(MapReturnType.ORDERED_MAP, Exp.val(roomIds), Exp.mapBin("doc")),
            CTX.allChildrenWithFilter(Exp.and(exp1, exp2)),
            CTX.allChildren(),
            CTX.allChildrenWithFilter(exp3)));

    Operation op = ExpOperation.read("doc", readExp, ExpReadFlags.DEFAULT);
    Record record = client.operate(null, key, op);

- MapExp.getByKeyList uses the map's index to look up each of the M requested keys: O(M log N) for an ordered map.
- The path expression then operates on the M matching rooms, not all N.
- The remaining filters (exp1, exp2) are evaluated on only M rooms, and the roomExp membership check is gone entirely, because the pre-filter already guarantees that only matching rooms are present.
Complexity: O(M log N) for the key lookup, plus O(M) for the downstream path traversal — dominated by the lookup step.
Compared to Approach 1:
- Index-based lookup vs brute-force scan: getByKeyList pulls M keys directly via the index. Approach 1 visits every room and linearly scans the roomIds list for each one.
- Smaller working set: Every subsequent CTX level (allChildren, allChildrenWithFilter(exp3)) operates on M rooms instead of N. For a map with 10,000 rooms and 10 requested IDs, this means roughly 1,000× less downstream work.
- Eliminated membership check: Approach 1 evaluates three filter expressions per room (exp1, exp2, roomExp); Approach 2 evaluates only two (exp1, exp2).
When N is large (for example, thousands of rooms in the document) but M is small (for example, 10 requested room IDs), this difference is significant.
For small maps (tens of entries), Approach 1 is simpler and the performance difference is negligible. Approach 2 pays off when the map is large relative to the number of keys you are selecting.
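The gap between O(N × M) and O(M log N) can be illustrated with a rough client-side cost model that counts key comparisons. This is a sketch under stated assumptions, not a measurement of the server: `InListCostModel` is a hypothetical class, and `java.util.TreeMap` is only a stand-in for an ordered map index. The server's internal index structure and constant factors will differ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical cost model: counts key comparisons, not server CPU time. */
final class InListCostModel {

    /** Approach 1 model: visit all N keys and scan the whole M-element list for each. */
    static long scanComparisons(List<Long> mapKeys, List<Long> wanted, List<Long> selectedOut) {
        long comparisons = 0;
        for (Long key : mapKeys) {
            int count = 0; // mirrors the COUNT-style membership check
            for (Long w : wanted) {
                comparisons++;
                if (w.equals(key)) {
                    count++;
                }
            }
            if (count > 0) {
                selectedOut.add(key);
            }
        }
        return comparisons;
    }

    /** Approach 2 model: one ordered-map lookup per wanted key. */
    static long lookupComparisons(List<Long> mapKeys, List<Long> wanted, List<Long> selectedOut) {
        AtomicLong comparisons = new AtomicLong();
        Comparator<Long> counting = (a, b) -> {
            comparisons.incrementAndGet();
            return Long.compare(a, b);
        };
        TreeMap<Long, Boolean> ordered = new TreeMap<>(counting);
        for (Long k : mapKeys) {
            ordered.put(k, Boolean.TRUE);
        }
        comparisons.set(0); // count lookups only, not construction of the stand-in index
        for (Long w : wanted) {
            if (ordered.containsKey(w)) {
                selectedOut.add(w);
            }
        }
        return comparisons.get();
    }

    /** Returns count values: start, start + step, start + 2 * step, ... */
    static List<Long> range(long start, long step, int count) {
        List<Long> out = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            out.add(start + i * step);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> keys = range(1L, 1L, 10_000);     // N = 10,000 room IDs
        List<Long> wanted = range(500L, 1_000L, 10); // M = 10 requested IDs
        long scan = scanComparisons(keys, wanted, new ArrayList<>());
        long lookup = lookupComparisons(keys, wanted, new ArrayList<>());
        System.out.println("scan comparisons:   " + scan);
        System.out.println("lookup comparisons: " + lookup);
    }
}
```

With N = 10,000 and M = 10, the scan model performs exactly N × M comparisons, while the lookup model needs only a few hundred. Both select the same keys.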
Approach 3: Native key selection (8.1.2+)
Use CTX.mapKeysIn to select room IDs directly in the path context, with CTX.andFilter
for additional filtering. The key lookup happens natively inside the path expression,
giving the cleanest API and best performance. Requires server 8.1.2+.
    long[] roomIdArray = roomIds.stream().mapToLong(Long::longValue).toArray();

    Operation op = CdtOperation.selectByPath("doc", SelectFlags.MATCHING_TREE | SelectFlags.NO_FAIL,
        CTX.mapKeysIn(roomIdArray),
        CTX.andFilter(Exp.and(exp1, exp2)),
        CTX.allChildren(),
        CTX.allChildrenWithFilter(exp3));

    Record record = client.operate(null, key, op);

Approach 3 improves on Approach 2 in two ways:
- No expression pre-filter step: Approach 2 must first call MapExp.getByKeyList inside an expression wrapper (CdtExp.selectByPath + ExpOperation.read) to extract the M matching keys before the path traversal can begin. The server builds, evaluates, and tears down that intermediate expression on every call. Approach 3 replaces all of this with CTX.mapKeysIn, which performs the same key selection natively inside the path context, eliminating the expression overhead entirely.
- Simpler API: Approach 3 uses CdtOperation.selectByPath, the same straightforward operation-based API as Approach 1, instead of the expression-based wrapping that Approach 2 requires. The code is shorter, easier to read, and follows the same pattern developers already know.
In Approach 3 there is no reason to use CdtExp.selectByPath:
CTX.mapKeysIn already handles key selection inside the path
context, so the expression wrapper would add complexity and CPU cost for no benefit.
CTX.andFilter applies additional filters at the same level as the preceding
context. Note that CTX.andFilter cannot be chained after another CTX.andFilter
and cannot follow a CTX.allChildrenWithFilter. For full usage constraints, see
key selection and combined filtering.
Result
All three approaches return the same result. Only room 10001 passes all room-level filters
(10002 is deleted, 10003 has an old timestamp). Within that room, only the first rate
(beta=3.0) passes the rate-level filter; the second rate (beta=0.0) is excluded.
    {
      "10001": {
        "time": 1795647328,
        "isDeleted": false,
        "e": 4,
        "rates": [
          { "a": 2, "Id": 1, "c": true, "d": "data", "beta": 3.0 }
        ]
      }
    }

General performance characteristics
Path expressions execute server-side, which reduces network transfer by returning only matching elements. For deeply nested structures, this can significantly reduce latency compared to fetching entire records and filtering client-side.
Performance depends on:
- CDT size: Larger Maps/Lists require more processing time.
- Nesting depth: Deeper structures take longer to traverse.
- Filter complexity: Complex boolean expressions add overhead.
- Result size: Returning MATCHING_TREE transfers more data than MAP_KEY.
For performance-critical applications, benchmark with realistic data volumes.
Batch inlining with large records
Path expression workloads typically involve nested CDT documents well above 1KiB per
record. When you batch these operations, the default batch policy allowInline=true
serializes the entire sub-batch on a single service thread for
in-memory namespaces.
For records larger than about 1KiB this is slower than non-inlined execution, which
distributes the sub-batch across multiple service threads.
Set allowInline to false (or the language-equivalent batch policy flag) when
batching path expression operations against an in-memory namespace whose records
exceed about 1KiB. For SSD-based namespaces the default allowInlineSSD=false
already disables inlining.
See Inlining batches for full details on batch inline policies and when inlining helps versus hurts.
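The rule of thumb above can be written as a small decision helper. This is a hedged sketch only: `InlineAdvice` and `suggestAllowInline` are hypothetical names, the 1 KiB cutoff is the approximate figure quoted in the text, and the record size estimate has to come from your own data model. In the Java client, the returned value would be assigned to the batch policy's `allowInline` field.

```java
/** Hypothetical helper that encodes the batch inlining guidance from the text. */
final class InlineAdvice {

    /** About 1 KiB, the approximate threshold quoted in the text. */
    static final long INLINE_SIZE_THRESHOLD_BYTES = 1024;

    /**
     * Suggests a value for the batch policy's allowInline flag.
     *
     * @param inMemoryNamespace whether the target namespace stores data in memory
     * @param approxRecordBytes caller-supplied estimate of the typical record size
     */
    static boolean suggestAllowInline(boolean inMemoryNamespace, long approxRecordBytes) {
        if (!inMemoryNamespace) {
            // allowInline concerns in-memory namespaces; SSD inlining is governed by
            // allowInlineSSD, which already defaults to false. Keep the default here.
            return true;
        }
        // Larger records are better spread across multiple service threads.
        return approxRecordBytes <= INLINE_SIZE_THRESHOLD_BYTES;
    }

    public static void main(String[] args) {
        // Example: nested CDT documents of roughly 4 KiB in an in-memory namespace.
        boolean allowInline = suggestAllowInline(true, 4096);
        System.out.println("suggested allowInline = " + allowInline);
    }
}
```

Treat the threshold as a starting point and confirm it by benchmarking with realistic record sizes.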
Limits
Maximum nesting depth: Path expressions support up to 15 levels of CDT nesting, matching the Aerospike CDT context depth limit.
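A client-side check can catch documents that approach the depth limit before they are written. The following is a minimal sketch, assuming that each nested map or list counts as one level. `NestingDepth` is a hypothetical helper, and how the server counts levels exactly should be confirmed against the CDT context documentation.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper that measures how deeply maps and lists are nested. */
final class NestingDepth {

    /** Depth limit quoted in the text. */
    static final int MAX_CDT_DEPTH = 15;

    /** Scalars have depth 0; a map or list is one level deeper than its deepest child. */
    static int depth(Object node) {
        int deepestChild = 0;
        if (node instanceof Map) {
            for (Object child : ((Map<?, ?>) node).values()) {
                deepestChild = Math.max(deepestChild, depth(child));
            }
            return 1 + deepestChild;
        }
        if (node instanceof List) {
            for (Object child : (List<?>) node) {
                deepestChild = Math.max(deepestChild, depth(child));
            }
            return 1 + deepestChild;
        }
        return 0;
    }

    /** True when the document stays within the quoted limit under this sketch's counting. */
    static boolean withinLimit(Object doc) {
        return depth(doc) <= MAX_CDT_DEPTH;
    }

    public static void main(String[] args) {
        // Same shape as the booking document: map -> room map -> rates list -> rate map.
        Map<Long, Object> doc = Map.of(
                10001L, Map.of("rates", List.of(Map.of("beta", 3.0)), "e", 4L));
        System.out.println("depth = " + depth(doc) + ", within limit = " + withinLimit(doc));
    }
}
```

Under this counting, the booking document from the earlier example has four levels (bin map, room map, rates list, rate map), well inside the limit.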
Element count: There is no hard limit on the number of elements, but very large CDTs (millions of elements) may impact latency. If you are working with extremely large collections, consider partitioning data across multiple records or using secondary indexes to narrow the scope before applying path expressions.