Leveraging collections

Lists and Maps are the building blocks you use to consolidate data into well-sized records. Instead of spreading each child entity across separate records, each costing 64 bytes of primary index overhead, you pack related data into collection data type bins and operate on individual elements in place.

This page covers the modeling patterns that Lists and Maps enable. For the full API reference of each operation, see the List operations and Map operations pages.

Map subtypes and ordering

Starting with Aerospike Database 7.0, all Maps are stored in key order on the server regardless of the order hint the client uses when creating them. Maps have three subtypes that differ in which internal indexes they maintain. All three share the same API and support key, index, value, and rank operations. You can convert between subtypes with set_type().

  • Unordered — no internal indexes. All lookups scan elements. Lowest storage overhead.
  • K-ordered — maintains a key offset index that maps each key position to a byte offset within the packed map.
  • KV-ordered — maintains both the key offset index and a value order index that maps rank to element.

Two access dimensions are available on every Map:

  • Index is the position in key order (0-based, negative from end). Index 0 is the entry with the smallest key; index -1 is the largest.
  • Rank is the position in value order (0 = smallest value). When two entries have the same value, the tie is broken by key order: the entry with the lower key-order position gets the lower rank.
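The two dimensions can be sketched in pure Python (this is a local illustration of the server-side semantics, not the Aerospike client API): index follows key order, while rank follows value order with ties broken by key-order position.

```python
# Pure-Python sketch of Map index vs. rank semantics (not the Aerospike API).
def by_index(m, i):
    """Entry at position i in key order (negative i counts from the end)."""
    return sorted(m.items())[i]

def by_rank(m, r):
    """Entry at position r in value order; ties broken by key-order position."""
    key_pos = {k: p for p, (k, _) in enumerate(sorted(m.items()))}
    return sorted(m.items(), key=lambda kv: (kv[1], key_pos[kv[0]]))[r]

m = {"b": 5, "a": 7, "c": 5}
assert by_index(m, 0) == ("a", 7)    # index 0: smallest key
assert by_index(m, -1) == ("c", 5)   # index -1: largest key
assert by_rank(m, 0) == ("b", 5)     # values tie at 5: "b" precedes "c" in key order
assert by_rank(m, -1) == ("a", 7)    # rank -1: largest value
```

On the server both dimensions address the same stored entries; the K-ordered and KV-ordered subtypes differ only in whether these positions are resolved from a maintained index or computed on the fly.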

Starting with Aerospike Database 7.1, Map keys are restricted to integer, string, or blob types.

Persisting internal indexes

By default, the server rebuilds a Map’s internal indexes for each operation and discards them afterwards. Setting PERSIST_INDEX in the map policy stores the indexes on disk inside the map particle, so subsequent operations load them directly instead of rebuilding.

There are two persistence levels:

  • Persisted offset index (PERSIST_INDEX without V_ORDERED) — eliminates the per-operation O(N) index rebuild cost. Key lookups become O(log N) and index-based access becomes O(1).
  • Persisted full index (PERSIST_INDEX with V_ORDERED) — persists both the key offset and value order indexes. Rank lookups become O(1) and value-based searches become O(log N), in addition to the offset index benefits.
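The offset index's effect can be illustrated with a toy model (pure Python, not the server's msgpack layout): once byte offsets of variable-length elements are stored, reaching element *i* is a direct slice rather than a walk over every prior element.

```python
# Toy sketch of an offset index over packed variable-length elements.
# Without the index, reaching element i means decoding every prior
# element (O(N)); with it, element i is a single slice (O(1)).
elements = [b"alpha", b"bo", b"charlie"]
packed = b"".join(elements)          # stand-in for the packed map particle

offsets = []                         # the "persisted" offset index
pos = 0
for e in elements:
    offsets.append(pos)
    pos += len(e)

def element_at(i):
    end = offsets[i + 1] if i + 1 < len(offsets) else len(packed)
    return packed[offsets[i]:end]

assert element_at(1) == b"bo"
assert element_at(2) == b"charlie"
```

PERSIST_INDEX stores the real equivalent of `offsets` inside the map particle, which is why index-based access drops to O(1) and the per-operation rebuild disappears.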

Persist-index is supported only for top-level Maps. The server silently ignores the flag on nested Maps. For Maps accessed frequently or holding thousands of entries, PERSIST_INDEX can reduce operation latency significantly.

For the full performance characteristics of each subtype and persistence level, see the Map performance page.

Composite-key technique

When you encode the desired sort dimension directly in the map key, key order becomes your display order, and index-based operations serve as rank operations without additional storage.

A leaderboard Map keyed by zero-padded score-playerId strings illustrates this:

{"00150-alice": {...}, "00320-bob": {...}, "00320-carol": {...}, "00780-dave": {...}}

The entry at index 0 is the lowest score. The entry at index -1 is the highest. The player ID portion breaks ties deterministically. To get the top 50 scores, call get_by_index_range with index -50 and count 50 — a single CDT operation on one record, with no secondary structure.

To break ties by time instead of player ID, extend the key to score-timestamp-playerId. The key format controls the sort and the tie-breaking, and you define it once at write time.
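The key construction and the resulting order can be sketched locally (a pure-Python illustration; `composite_key` and the `width` parameter are hypothetical names, and the index-range step stands in for the server-side `get_by_index_range`):

```python
# Sketch of the composite-key leaderboard: zero-padded "score-playerId"
# keys make key order equal score order, so a top-N query is an index range.
def composite_key(score, player_id, width=5):
    return f"{score:0{width}d}-{player_id}"   # pad so string order == numeric order

board = {}
for player, score in [("alice", 150), ("bob", 320), ("carol", 320), ("dave", 780)]:
    board[composite_key(score, player)] = {"player": player, "score": score}

# Locally simulate get_by_index_range(index=-2, count=2) on a key-ordered map:
top2 = [board[k] for k in sorted(board)[-2:]]
assert [e["player"] for e in top2] == ["carol", "dave"]
```

The zero-padding width matters: choose it for the maximum score you will ever store, because `"999"` sorts after `"1000"` without padding.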

For large Maps using this technique, PERSIST_INDEX makes get_by_index_range O(M) instead of O(N), where M is the number of returned entries and N is the total element count. A “top 50” query on a Map with 10,000 entries becomes constant-time relative to the map size.

Ordered lists with ADD_UNIQUE

When you need a set-like collection — a list of unique values with efficient membership checks — use an ordered List with the ADD_UNIQUE write flag.

On an ordered List, ADD_UNIQUE uses binary search to check for duplicates: O(log n). On an unordered List, the same flag falls back to a linear scan: O(n). For relationship lists with thousands of entries (follower IDs, tag collections, reference lists), the difference is significant.

Combine ADD_UNIQUE with the NO_FAIL write flag when you want the write to succeed silently if the value already exists. This lets you treat the operation as an idempotent “ensure present” without error handling for duplicates.
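The combined semantics can be sketched in pure Python with `bisect` (a local model of the server-side behavior, not the Aerospike API; `add_unique` and its `no_fail` flag are hypothetical names mirroring ADD_UNIQUE and NO_FAIL):

```python
import bisect

# Sketch of ADD_UNIQUE on an ordered List: binary search finds the
# insertion point; an equal neighbor means the value already exists.
def add_unique(sorted_list, value, no_fail=False):
    i = bisect.bisect_left(sorted_list, value)       # O(log n)
    if i < len(sorted_list) and sorted_list[i] == value:
        if no_fail:
            return False                             # NO_FAIL: silently skip
        raise ValueError("element already exists")
    sorted_list.insert(i, value)
    return True

followers = [101, 205, 307]
assert add_unique(followers, 250) is True
assert add_unique(followers, 205, no_fail=True) is False   # duplicate, no error
assert followers == [101, 205, 250, 307]
```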

For ordered Lists that grow into the thousands of entries, set persistIndex in the list policy to store the offset index on disk. Without persist, the offset index is rebuilt per operation — an O(N) walk through packed elements. With persist, ADD_UNIQUE inserts stay at O(log N) and paginated reads with list_get_by_index_range become O(M) for M returned elements instead of O(N + M). In an ordered List, index and rank are equivalent, so rank-based operations also benefit. Persist-index is supported only for top-level Lists; the server silently ignores it on nested Lists.

Value-based access with list tuples

Maps let you look up entries by key, but they do not support wildcard matching or interval queries on values. When you need value-range or wildcard-based selection, consider storing structured data as list tuples instead of map entries.

A list of [timestamp, deviceId, reading] tuples supports queries like “all readings where the first element is between T1 and T2” using get_by_value_interval with list-to-list bounds. The same query on map values would require iterating all entries.
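The interval selection relies on element-by-element tuple comparison, which can be modeled locally (a pure-Python sketch under the assumption that, as in this model, the begin bound is inclusive, the end bound exclusive, and a shorter list sorts before a longer one with the same prefix):

```python
# Sketch of get_by_value_interval on [timestamp, deviceId, reading] tuples:
# list bounds compare element-by-element, so a single-element bound [T]
# selects on the first tuple element (the timestamp) alone.
readings = [
    [1000, "dev-a", 21.5],
    [1500, "dev-b", 22.0],
    [2000, "dev-a", 20.1],
]

def by_value_interval(lst, lo, hi):
    return [t for t in lst if lo <= t < hi]          # begin inclusive, end exclusive

assert by_value_interval(readings, [1200], [2000]) == [[1500, "dev-b", 22.0]]
```

Putting the query dimension in element 0 is what makes this work: the comparison short-circuits on the first element, so the rest of the tuple never needs to match the bound.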

The trade-off is that tuples are positional: element 0 is the timestamp, element 1 is the device ID, and so on. Adding or reordering fields requires migrating existing data. For structures that change shape over time, named fields in a Map or the list-of-structs pattern (described later on this page) give safer evolution.

Nested context

You operate on elements deep inside a collection by providing a context path — a stack of selectors that navigate from the bin to the target element. Each selector drills one level deeper:

  • BY_LIST_INDEX(n) / BY_MAP_KEY(k) — select by position or key.
  • BY_LIST_RANK(r) / BY_MAP_RANK(r) — select by value order.
  • BY_LIST_VALUE(v) / BY_MAP_VALUE(v) — select by value match.
  • MAP_KEY_CREATE(k) / LIST_INDEX_CREATE(n) — select or create if missing (Aerospike Database 4.9+).

For example, to increment a counter inside a nested Map structure like {stats: {accolades: {jokes: 317}}}, you provide a context of [MAP_KEY_CREATE("stats"), MAP_KEY_CREATE("accolades")] and then call map_increment on key "jokes". The CREATE variants ensure the path exists even on the first write, eliminating the need for a separate initialization step.
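The CREATE navigation can be modeled in pure Python (a local sketch of the semantics, not the client API; `increment_at` is a hypothetical helper standing in for a context path plus `map_increment`):

```python
# Sketch of context navigation with CREATE semantics: each selector drills
# one level, creating an empty map when the key is missing, so the first
# write needs no separate initialization step.
def increment_at(doc, ctx_keys, leaf_key, amount):
    node = doc
    for k in ctx_keys:
        node = node.setdefault(k, {})        # MAP_KEY_CREATE: select or create
    node[leaf_key] = node.get(leaf_key, 0) + amount
    return node[leaf_key]

record = {}   # bin starts empty: the CREATE path is built on first write
assert increment_at(record, ["stats", "accolades"], "jokes", 1) == 1
assert increment_at(record, ["stats", "accolades"], "jokes", 316) == 317
assert record == {"stats": {"accolades": {"jokes": 317}}}
```

Without the CREATE variants, the same first write would fail (or no-op, depending on write flags) because the intermediate Maps do not exist yet.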

The maximum nesting depth for context operations is 15 levels. In practice, most models stay within 2–4 levels. If you find yourself reaching deeper, that is often a signal to flatten the structure or split it across records.

For the full context API and multi-language examples, see Context for operations on nested elements.

Multiple sort dimensions

A Map’s key order gives you one native sort dimension, and rank gives you a second (value order). When you need additional sort dimensions — for example, a leaderboard sorted by score that also supports “most recent entries” queries — you maintain auxiliary sorted structures alongside the primary one.

A common pattern: the primary Map bin is keyed by score-playerId for rank access, and a second bin holds an ordered List of [timestamp, playerId] tuples for recency queries. Both bins update atomically in a single operate() call, so they stay consistent.
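The dual-bin write can be sketched locally (pure Python; `record_score` is a hypothetical helper, and on the server both mutations would be submitted as two operations in one `operate()` call rather than two separate writes):

```python
import bisect

# Sketch of the dual-bin pattern: one logical insert updates both the
# score-keyed map (rank access) and the recency list (time access).
def record_score(score_map, recency_list, player, score, ts):
    score_map[f"{score:05d}-{player}"] = {"player": player, "ts": ts}
    bisect.insort(recency_list, [ts, player])    # ordered List of [ts, player]

scores, recent = {}, []
record_score(scores, recent, "alice", 150, 1000)
record_score(scores, recent, "bob", 320, 900)

assert sorted(scores)[-1].endswith("bob")        # top score via key order
assert recent[-1] == [1000, "alice"]             # most recent via list order
```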

The cost is write amplification: every insert or update touches multiple bins. The benefit is that each read pattern resolves to a single CDT operation on the appropriate bin, with no client-side sorting or post-processing.

List-of-structs with path expressions

Path expressions let you treat a List of Maps as a collection of typed structures with named fields. Each Map in the list represents one entity (a notification, a line item, a sensor reading), and path expression operations select or modify entries based on field values. Path expressions require Aerospike Database 8.1.2 or later. (Database 8.1.1 introduced preview support; 8.1.2 adds mapKeysIn and andFilter context types and is the production prerequisite.)

A notification timeline illustrates the pattern. Each user has one record per day, keyed notif:{userId}:{YYYY-MM-DD}, with a bin items holding a List of Maps:

[
  {"type": "like", "from": "bob", "read": false, "ts_ms": 1711929600000},
  {"type": "comment", "from": "carol", "read": true, "ts_ms": 1711933200000},
  {"type": "follow", "from": "dave", "read": false, "ts_ms": 1711936800000}
]

With path expressions you can:

  • Select all unread notifications with a single selectByPath call that filters on read == false.
  • Mark a specific notification as read with modifyByPath, targeting the entry where from == "carol" and type == "comment".
  • Remove all notifications from a blocked user with modifyByPath and a filter on the from field.
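What the first of these filters selects can be shown with a local sketch (pure Python; this models only the filter logic, while the actual `selectByPath` call evaluates it server-side without shipping the list to the client):

```python
# Local sketch of a selectByPath filter on read == false over the
# notification timeline shown above.
items = [
    {"type": "like", "from": "bob", "read": False, "ts_ms": 1711929600000},
    {"type": "comment", "from": "carol", "read": True, "ts_ms": 1711933200000},
    {"type": "follow", "from": "dave", "read": False, "ts_ms": 1711936800000},
]

unread = [n for n in items if n["read"] is False]
assert [n["from"] for n in unread] == ["bob", "dave"]
```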

These operations execute server-side in one round trip, without transferring the full list to the client. The named fields (type, from, read, ts_ms) make the structure self-describing and evolvable — adding a new field does not require migrating existing entries.

Baseline CDT operations still handle the common cases: list_append to add new notifications, list_get_by_index_range to paginate, and list_size to get the count. Path expressions extend the model with field-level filtering and mutation when you need them.

Starting with Database 8.1.2, you can also use mapKeysIn to select map entries by key — equivalent to SQL WHERE key IN (k1, k2, ...) — and andFilter to apply an additional filter at the same context level. Both use the map’s internal index for efficient lookup.

For the full path expression API, worked examples, and performance guidance, see the path expressions section.

Choosing between Lists and Maps

The choice depends on how you access the data:

| Access pattern | Preferred CDT | Why |
| --- | --- | --- |
| Look up by unique key | Map | Direct key lookup: O(log n) |
| Maintain a ranked or sorted collection | Map (composite key) | Key order = sort order |
| Deduplicated set of values | Ordered List + ADD_UNIQUE | Binary search dedup: O(log n) |
| Value-range or wildcard queries | List of tuples | get_by_value_interval with bounds |
| Key-based access with occasional value/rank queries | Map + persisted full index | O(log N) key and value access, O(1) rank |
| Append-heavy, ordered by insertion | Unordered List | Append is O(1) |
| Named-field structures with filtering | List of Maps + path expressions | Field-level select/modify (DB 8.1.2+) |

When the access pattern is primarily key-based with occasional value-range or rank queries, a Map with a persisted full index (PERSIST_INDEX + V_ORDERED) may be preferable to list tuples. The persisted full index gives you O(log N) value-based access and O(1) rank lookups — comparable to or better than an ordered List with persisted index for those operations — while retaining the self-describing key-value structure.

When neither a pure List nor a pure Map fits, combine them: a Map bin for the primary lookup path and a List bin for a secondary access pattern, updated atomically in a single operate() call.

Equality comparison caveat

Only ordered Maps (K-ordered or KV-ordered) can be reliably compared for equality. Unordered Maps have no canonical byte ordering, so two unordered Maps with the same logical content can have different wire representations. Comparisons involving unordered Maps — whether through an expression eq operator or through *_by_value operations — may return false even when the Maps contain the same elements.

This matters when you store a List of Maps and use ADD_UNIQUE for deduplication. If the Map elements are unordered, ADD_UNIQUE may fail to detect duplicates because the byte representations differ. When you use ADD_UNIQUE on a List of Maps, make sure the Map elements are K-ordered or KV-ordered so that comparison works correctly.
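The underlying problem is canonical bytes, which can be illustrated with JSON as a stand-in for the server's wire encoding (an analogy only: Aerospike uses msgpack, not JSON, but the ordering issue is the same):

```python
import json

# Sketch of the canonical-bytes problem: the same logical map serialized
# with different key orders yields different byte strings. Only a
# canonical (key-ordered) encoding makes byte-level comparison reliable.
a = json.dumps({"x": 1, "y": 2})     # insertion order: x, y
b = json.dumps({"y": 2, "x": 1})     # insertion order: y, x

assert a != b                        # same logical content, different bytes

canon = lambda s: json.dumps(json.loads(s), sort_keys=True)
assert canon(a) == canon(b)          # key-ordered form: bytes match
```

A K-ordered or KV-ordered Map plays the role of `sort_keys=True` here: it guarantees one byte representation per logical content, so equality and `*_by_value` comparisons behave as expected.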

Sizing collections

The same sizing principles from Record sizing apply to collections within a record. Estimate the element count at p99 and multiply by the average element size to get the expected bin size.

A Map bin with 1,000 entries of 50 bytes each is roughly 50 KiB — well within the 1–128 KiB sweet spot. A Map bin with 100,000 entries of 50 bytes each is roughly 5 MiB — functional but approaching the territory where the overflow strategies described in Relationships become relevant.
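The estimate is simple enough to sanity-check in a couple of lines (a back-of-the-envelope sketch; `bin_size_bytes` is a hypothetical helper, and real bins carry additional per-element encoding overhead on top of this):

```python
# Sizing sketch: p99 element count x average element size ~= bin size.
def bin_size_bytes(p99_elements, avg_element_bytes):
    return p99_elements * avg_element_bytes

assert bin_size_bytes(1_000, 50) == 50_000        # ~50 KiB: in the sweet spot
assert bin_size_bytes(100_000, 50) == 5_000_000   # ~5 MiB: plan for overflow
```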

When a collection grows without bound (relationship edges, event logs, accumulated readings), plan for the eventual size, not the launch size. Time-slice the key to cap per-record growth (see Key design), or use the shard-on-demand pattern to distribute overflow across sub-records.
