Secondary index operations
Secondary indexes (SIs) enable efficient queries on non-primary key fields, but they come with trade-offs in memory, performance, and operational complexity.
How do secondary indexes impact capacity?
Secondary indexes consume system resources according to storage location:
- Shared Memory (shmem)
- Persistent Memory (pmem)
- Flash (SSD)
See the Capacity planning guide for full storage configuration details.
How can secondary indexes affect performance?
When a secondary index is defined:
- Updates to existing records force a read on the replica to retrieve the old bin value and maintain the index.
- Replace operations trigger extra reads on both the master and replica, increasing pressure on the storage layer.
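The two points above can be turned into a rough back-of-envelope estimate of the additional storage reads a secondary index introduces. The following is a minimal sketch, not a sizing formula from the product documentation. It assumes exactly one extra read per affected copy of the record, that every write touches an indexed bin, and that "replica" means each of the `replication_factor - 1` non-master copies. The function name and parameters are hypothetical.

```python
def extra_si_reads_per_sec(updates_per_sec, replaces_per_sec, replication_factor=2):
    """Estimate extra storage reads per second caused by secondary index upkeep.

    Assumptions (illustrative only):
      - an update adds one read on each replica copy,
      - a replace adds one read on the master and one on each replica copy.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")

    replicas = replication_factor - 1

    # Updates: the text above lists only the replica-side read as extra.
    update_reads = updates_per_sec * replicas

    # Replaces: extra reads on the master and on every replica.
    replace_reads = replaces_per_sec * (1 + replicas)

    return update_reads + replace_reads


if __name__ == "__main__":
    # 10,000 updates/s and 2,000 replaces/s with two copies of each record.
    print(extra_si_reads_per_sec(10_000, 2_000, replication_factor=2))
```

Comparing this estimate against the read headroom of the storage devices gives a first indication of whether a write-heavy namespace can absorb the index.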
Do I need to adjust garbage collection?
Secondary index entries must be cleaned up when records are deleted or removed due to data migration. Garbage collection (GC) behavior depends on the storage location and the event that triggers it.
Event Type | GC Triggered | Notes |
---|---|---|
Delete | Yes* | If primary index is on flash, record is read inline to clean secondary index |
Truncate | Yes | Secondary index entries are always checked |
Partition Drop | Yes | GC always triggered |
Can I use application-level indexing instead?
Instead of native secondary indexes, you can:
- Model with lookup records for index functionality.
- Leverage the set index feature, when relevant, along with expression filters.
- Experiment with using the record void time, which you set indirectly by choosing the TTL judiciously.
- Leverage transactions to help keep ‘manual indexing’ consistent and provide:
  - Greater control over performance
  - Lower memory/disk overhead
  - Better suitability for high-write or predictable-query workloads
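The lookup-record approach can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: `InMemoryStore` is a hypothetical in-memory stand-in for a database client, the key layout (`("user", id)` and `("idx", field, value)`) is invented for the example, and the writes in `put_user` are not atomic here. In a real deployment they would be grouped in a transaction so that the data record and its lookup records stay consistent.

```python
class InMemoryStore:
    """Hypothetical stand-in for a key-value database client."""

    def __init__(self):
        self._records = {}

    def put(self, key, bins):
        self._records[key] = dict(bins)

    def get(self, key):
        return self._records.get(key)

    def delete(self, key):
        self._records.pop(key, None)


def index_key(field, value):
    # One lookup record per distinct value of the indexed field.
    return ("idx", field, value)


def _add_to_index(store, field, value, user_id):
    rec = store.get(index_key(field, value)) or {"ids": []}
    ids = set(rec["ids"])
    ids.add(user_id)
    store.put(index_key(field, value), {"ids": sorted(ids)})


def _remove_from_index(store, field, value, user_id):
    rec = store.get(index_key(field, value))
    if rec is None:
        return
    ids = set(rec["ids"])
    ids.discard(user_id)
    if ids:
        store.put(index_key(field, value), {"ids": sorted(ids)})
    else:
        # Drop empty lookup records so they do not accumulate.
        store.delete(index_key(field, value))


def put_user(store, user_id, bins, indexed_field="city"):
    """Write a user record and keep its lookup record in step."""
    old = store.get(("user", user_id))
    old_value = old.get(indexed_field) if old else None
    new_value = bins.get(indexed_field)

    store.put(("user", user_id), bins)

    if old_value != new_value:
        if old_value is not None:
            _remove_from_index(store, indexed_field, old_value, user_id)
        if new_value is not None:
            _add_to_index(store, indexed_field, new_value, user_id)


def find_users(store, field, value):
    """Answer an equality query with a single lookup-record read."""
    rec = store.get(index_key(field, value))
    return list(rec["ids"]) if rec else []


if __name__ == "__main__":
    store = InMemoryStore()
    put_user(store, "u1", {"city": "Oslo"})
    put_user(store, "u2", {"city": "Oslo"})
    put_user(store, "u1", {"city": "Rome"})  # moves u1 between lookup records
    print(find_users(store, "city", "Oslo"))
    print(find_users(store, "city", "Rome"))
```

The application reads the old value before writing so that it can remove the stale lookup entry, which makes the cost of index maintenance explicit and keeps it under the application's control.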
Best practices summary
- Use native secondary indexes judiciously
- Prefer app-managed indexing for high-write workloads, where the extra read on the replica side affects performance, or for low-cardinality fields
- Be aware of the following:
  - Replace operation overhead
  - Garbage collection cost when using a flash-based primary index
  - Capacity planning according to index storage location (see Secondary index capacity planning)