Queries
A query is a command applied against all the records of a namespace, or a specified set within it, that are indexed by its primary index (PI) or an optional secondary index (SI). Queries are used for situations where the exact records are not known, as opposed to single-record and batched commands, which access records by their keys.
Query features
- Queries are sent by the client to each cluster node, where they are executed against the primary index or a secondary index. The resulting records are streamed back to the client.
- Queries can be filtered. An optional expression can be applied at the server to filter out records found in the primary index or matched by a secondary index filter.
- Queries are partitioned. If no partition IDs are specified, the client automatically runs the query against all 4096 data partitions. This capability can be leveraged to horizontally scale the processing of results from a large dataset in parallel between multiple clients.
- Queries can be paginated, using a client partition filter object to track progress across multiple partitions.
- Queries are partition-tolerant. Clients compatible with Database 6.0 and later ensure that queries handle the automatic partition rebalancing that happens after a permanent cluster size change.
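As a rough illustration of the partitioned-query point above, the 4096 data partitions can be divided into contiguous ranges so that several clients each process one slice. This is a minimal sketch, not client library code: the function name `split_partitions` and the `(begin, count)` tuple shape are assumptions made for the example. Each pair would be handed to whatever partition filter the client library in use provides.

```python
TOTAL_PARTITIONS = 4096  # number of data partitions stated in the text


def split_partitions(workers, total=TOTAL_PARTITIONS):
    """Return one (begin, count) range per worker, covering every partition once.

    Hypothetical helper for illustration; it only does the arithmetic and
    does not talk to a cluster.
    """
    if workers < 1 or workers > total:
        raise ValueError("workers must be between 1 and total")
    base, extra = divmod(total, workers)
    ranges = []
    begin = 0
    for i in range(workers):
        # The first `extra` workers take one additional partition each.
        count = base + (1 if i < extra else 0)
        ranges.append((begin, count))
        begin += count
    return ranges


if __name__ == "__main__":
    for worker_id, (begin, count) in enumerate(split_partitions(3)):
        print(f"worker {worker_id}: partitions {begin}..{begin + count - 1}")
```

Contiguous ranges keep the bookkeeping simple. Any other assignment works equally well as long as every partition ID from 0 to 4095 is covered exactly once.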
Query runtime optimization
A query’s runtime can be optimized by specifying its expected duration.
- A query that is expected to run for a relatively long period of time is termed a long query. Long is the default expected duration. The runtime of a query depends on the complexity of its filter expression, the size of the dataset, indexing strategies, and the IOPS (input/output operations per second) capacity of the cluster nodes. Long queries include primary index queries running against a large dataset, or secondary index queries that return a large number of records.
- A query that is expected to consistently run for a short duration and return a small number of records is termed a short query. Explicitly designating a query’s expected duration as short allows Aerospike to optimize it for lower latency and higher queries-per-second (QPS) throughput.
- The expected duration optimization works with queries running against both strong consistency (SC) and available and partition-tolerant (AP) namespaces.
Differences between short and long queries
The following table shows how queries are optimized to execute differently based on their expected duration.
Short Query | Long Query |
---|---|
Short queries have a default 1s timeout (on the socket) | Long queries do not time out once they’re running |
Short query execution cannot be throttled | Long query execution can be throttled by setting an RPS (records per second) cap |
Short queries cannot be aborted | Long queries can be aborted |
Short queries are not tracked | Active long queries are tracked, and statistics on the most recent completed queries are kept in a queue (of size query-max-done) |
Short queries are measured by query latency and record count histograms | Long queries do not provide latency and record count histograms |
A short query runs in a single query thread, enabling a higher number of QPS | A long query runs in the number of query threads defined by single-query-threads |
Short queries can be inlined to run in service threads | Long queries only run in query threads |
Long queries
In a long query there are typically many records to read in each data partition. The client begins by reserving and querying full partitions, and defers other partitions (for example, ones that are migrating) to a subsequent round, by which time the migration of these partitions to another cluster node is expected to have completed. If a deferred partition still isn’t full when the client tries to reserve it, the client retries several more times (5 by default). Setting an adequate sleep between retries in the query policy is additional insurance that the query does not exhaust its retries while partitions are still migrating and return error code 11, PARTITION_UNAVAILABLE.
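The defer-and-retry behavior described above can be sketched as a small simulation. This is an illustrative model only, not how the client is implemented: `run_long_query`, `is_full`, and `query_partition` are hypothetical names, and the error is modeled as a plain `RuntimeError` rather than the client's own exception type.

```python
import time


def run_long_query(partition_ids, is_full, query_partition,
                   max_retries=5, sleep_between_retries=0.0):
    """Query full partitions first and defer the rest to later rounds.

    is_full(pid) -> bool      hypothetical check that a partition can be reserved
    query_partition(pid)      hypothetical callback returning that partition's records
    """
    records = []
    pending = list(partition_ids)
    # One initial round, followed by up to max_retries retry rounds.
    for attempt in range(max_retries + 1):
        deferred = []
        for pid in pending:
            if is_full(pid):
                records.extend(query_partition(pid))
            else:
                deferred.append(pid)  # still migrating; try again next round
        if not deferred:
            return records
        pending = deferred
        if attempt < max_retries:
            # Give migrations time to finish before the next round.
            time.sleep(sleep_between_retries)
    raise RuntimeError(
        f"PARTITION_UNAVAILABLE (11): partitions {pending} still not full"
    )
```

In this model, a longer `sleep_between_retries` widens the window in which a migrating partition can become full before the retries run out, which is the effect the paragraph above attributes to the query policy's sleep setting.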
Long queries against an AP namespace that don’t return many records might return an error if they fail to reserve a migrating partition, because even moving it to the end of the querying sequence might still catch it while it’s migrating. For this reason, Database 7.1 moved from a boolean QueryPolicy.shortQuery to a QueryPolicy.expectedDuration with three options: short, long, and “relaxed AP long query”.
Following proper operating procedure, such as waiting for migrations to complete during a rolling upgrade, avoids both query errors and missing records. Under this approach, a relaxed long query against an AP namespace doesn’t miss any records, the same as a long query against an SC namespace. Both settings avoid partition unavailable errors during a properly executed rolling upgrade.
Short queries
Short queries don’t run for long enough to take advantage of the strategy governing long queries, where the client defers querying migrating partitions to the end. For this reason, short queries against an AP namespace are always relaxed, allowing the client to reserve a partition that is not full.
Short queries against SC namespaces will fail if they can’t reserve all the (full) partitions.
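The reservation rules from the long-query and short-query sections can be summarized in one small decision function. This is a sketch of the rules as stated in the text, not server or client code: the function name and the string values are hypothetical. One case is not covered by the text: the relaxed AP setting on an SC namespace is assumed here to behave like a plain long query.

```python
def may_reserve(namespace_mode, expected_duration, partition_is_full):
    """Return True if, per the rules in the text, the partition may be reserved.

    namespace_mode:     "AP" or "SC"
    expected_duration:  "short", "long", or "long_relaxed_ap" (hypothetical labels)
    """
    if namespace_mode not in ("AP", "SC"):
        raise ValueError(f"unknown namespace mode: {namespace_mode}")
    if expected_duration not in ("short", "long", "long_relaxed_ap"):
        raise ValueError(f"unknown expected duration: {expected_duration}")

    # A full partition can always be reserved.
    if partition_is_full:
        return True
    # Short queries against AP are always relaxed, and the relaxed long
    # setting applies to AP namespaces: both may reserve a non-full partition.
    if namespace_mode == "AP" and expected_duration in ("short", "long_relaxed_ap"):
        return True
    # Plain long queries defer the partition and retry; SC queries need full
    # partitions. Assumption: "long_relaxed_ap" on SC is treated as "long".
    return False
```

A `False` result means different things depending on the duration: a long query defers the partition and retries it, while a short query against an SC namespace fails.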