Map performance
As with any index, there is a tradeoff between performance and storage. Persisting a map index consumes additional space on disk, but eliminates the CPU cost of rebuilding it on every access. This page helps developers and SREs balance CPU impact with storage capacity. When storage is available, choosing ordered map types and persisting the map index can significantly reduce CPU usage.
The following 𝓞 run time performance analysis is worst-case.
| Operation | Unordered (1) | Ordered (2) | Persisted Offset Index (3) | Persisted Full Index (4) |
|---|---|---|---|---|
| Set Type Ops | ||||
set_type to Unordered | 1 | 1 | 1 | 1 |
set_type to K-Ordered | N log N | 1 | 1 | 1 |
set_type to KV-Ordered | N log N | N log N | N log N | 1 |
| Modify Ops | ||||
put | N | N | log N | log N |
put_items | (N+M)log M + N | M(log M + log N) + N | M(log M + log N) + N | M(log M + log N) + N |
incrementdecrement | N | N | log N | log N |
clear | 1 | 1 | 1 | 1 |
remove_by_key | N | N | log N | log N |
remove_by_index | L log N + N | N | 1 | 1 |
remove_by_rank | R log N + N | R log N + N | R log N | 1 |
remove_by_key_interval | N + M | N + M | log N + M | log N + M |
remove_by_index_range | L log N + N | N + M | M | M |
remove_by_value_intervalremove_all_by_value | N + M | N + M | N + M | log N + M |
remove_by_rank_range | R log N + N | R log N + N | R log N | M |
remove_all_by_key_list | (M+N)logM + N | M log N + N | M log N | M log N |
remove_all_by_value_list | (M+N)logM + N | (M+N)log M + N | (M+N)log M | M log N |
| Read Ops | ||||
size | 1 | 1 | 1 | 1 |
get_by_key | N | N | log N | log N |
get_by_index | L log N + N | N | 1 | 1 |
get_by_index_range | L log N + N | N + M | M | M |
get_by_rank | R log N + N | R log N + N | R log N | 1 |
get_by_key_interval | N | N | log N + M | log N + M |
get_by_value_intervalget_all_by_value | N + M | N + M | N + M | log N + M |
get_by_rank_range | R log N + N | R log N + N | R log N | M |
get_all_by_key_list | (M+N)logM + N | M log N + N | M log N | M log N |
get_all_by_value_list | (M+N)logM + N | (M+N)log M + N | (M+N)log M | M log N |
| Mode Modifiers | ||||
| Storage | +D + W | |||
Legend
1: Unordered map. Key lookup is a linear scan. The offset index is built temporarily per operation and discarded afterwards.2: Ordered map (K-Ordered or KV-Ordered) without a persisted index. The map maintains key order and uses binary search for key lookups, but the offset index is built temporarily per operation and discarded afterwards.3: Ordered map with a persisted offset index. Any map subtype with the persist-index flag (and without theV_ORDEREDflag) persists the key offset index in the map particle, eliminating the per-operation O(N) offset index build cost. This includes K-ordered + persist and even unordered + persist (which sorts the keys at write time; subsequent operations use binary search and the persisted offset index).4: Ordered map with a persisted full index. Any map subtype that includes theV_ORDEREDflag with the persist-index flag persists both the key offset index and the value order index in the map particle. The value order index enables O(1) rank lookups and O(log N) value-based searches.D: Cost of loading the record from storage.W: Cost of writing the record to storage.N: Count of elements in the Map.M: Count of elements in the range/interval or elements added to the Map.R: for rank r, R = min(r, N - r - 1).L: for index i, L = min(i, N - i - 1).C: Cost of memcpy on packed Map.
Every modify op has a +C for copy on write for allowing rollback on failure.
Notes
- Column (2) applies to maps created with an ordered map policy (K-Ordered or KV-Ordered) but without the persist-index flag. These maps maintain key order but do not persist the offset index.
- KV-ordered without a persisted index has no performance advantage over K-ordered without a persisted index for rank and value operations. Both use the same heap/scan fallback. The only advantage of KV-ordered without persist is that converting to KV-ordered with persist is O(1) if elements are already in value order.
- Without a persisted index, the key offset index is built temporarily for each operation and discarded afterwards. This adds an O(N) cost per operation, which is why columns (1) and (2) show higher complexity than columns (3) and (4) for many operations.
- With a persisted offset index (column 3), the key offset index is stored in the map particle. This eliminates the per-operation O(N) build cost, enabling O(log N) key lookups and O(1) index-based access.
- With a persisted full index (column 4), a value order index is additionally stored in the map particle. This enables O(1) rank lookups and O(log N) value-based searches, which are not available with only a persisted offset index.
- The interval-based operations have parameters [start, stop) where the start value is inclusive and stop value is exclusive. They are generally faster than range-based operations when there is no persisted index metadata.
- The range-based operations with parameters (rank, count) or (index, count) are very fast for ordered maps with a persisted index.
- Reminder: worst-case, average-case and best-case performance are different categories. You should compare like to like.
Return type performance implications
The complexities above assume the default return type. Requesting a non-default return type (such as INDEX, RANK, or ORDERED_MAP) adds extra CPU cost. See the Performance section of each map operation for specifics.
- Key-based operations (suffixes
by_index,by_index_range, and the wordkey) generally incur additional cost when returningRANKresults. They are sped up by having a persisted offset index (column 3), and further by having a persisted full index (column 4) for rank return types. - Value-based operations (suffixes
by_rank,by_rank_range, and the wordvalue) generally incur additional cost when returningINDEXresults. They are sped up by having a persisted full index (column 4). - The
ORDERED_MAPreturn type carries the extra cost of ordering the result when the source map is unordered. There is no extra cost when the map is already K-Ordered or KV-Ordered.
Performance analysis example
Consider a storage model where the map key is a name and the map value is the score associated with the name.
map: {name1: score1, name2: score2, ...}
Let’s say you want to optimize performance for getting the names with the top 10 scores. The operation for getting the top 10 would be get_by_rank_range.
KV-Ordered with persisted full index
From the complexities table, one can see choosing KV-Ordered with a persisted full index (4) for get_by_rank_range results in O(M) performance. The persisted value order index enables direct rank-to-element lookup. Since we’re getting top 10, the performance is O(10) or constant time.
There is an additional +D cost for loading the record from storage.
Performance: O(1) + D
Return type extra cost
Looking at get_by_rank_range again, if the return type was INDEX, there would be extra performance cost as noted in the operation reference.
For an ordered map with a persisted offset index (3), the base performance of get_by_rank_range is O(R log N). Requesting an INDEX return type adds O(M), giving a total of O(R log N + M).