Queries

A query is a command applied against all the records of a namespace, or a specified set within it, that are indexed by its primary index (PI) or an optional secondary index (SI). Queries are used for situations where the exact records are not known, as opposed to single-record and batched commands, which access records by their keys.

Queries are sent by the client to each cluster node, where they are executed against the primary index or a secondary index.

Foreground queries

Similar to the SELECT statement in a relational database, a foreground query is read-only. As the query iterates through the namespace's data partitions, it streams the current version of each record to the client.

Characteristics

Selection of records:
- Queries, regardless of the index involved, can use filter expressions such as last_update() or set_name(). The filter acts as a WHERE clause.
- An optional secondary index filter can be applied by the query.
- Otherwise, the query targets the primary index. A PI query against a specified set name automatically leverages a set index, if one was created on this set.
Projection of record data:
- A list of named bins, also known as bin projection.
- Operation projection of read-only bin-read operations, CDT read operations, and read expressions (which return their result in a computed bin). This is available in Database 8.1.2 and later, and matches the capability of single-record and batched operate commands.
- QueryPolicy.includeBinData controls whether the query returns bin data or only record metadata (digest, generation, and TTL).
Query by data partitions: A query can target specific partition IDs. If no partition IDs are specified, the client automatically runs the query against all 4096 partitions. Targeting partitions makes it possible to scale the processing of a large result set horizontally, by dividing the partitions among multiple clients that work in parallel.
Pagination: Return a specified number of records with the ability to continue the query from that point. The client uses a partition filter object to track progress across multiple partitions.
Migration-tolerant queries: Clients compatible with Database 6.0.0 and later ensure that queries handle the automatic data rebalancing (migration) that happens after a permanent change in cluster size.
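
As an illustration of the "Query by data partitions" item above, the following minimal sketch divides the 4096 partitions into contiguous (begin, count) ranges, one per worker client. The function name `split_partitions` is hypothetical and not part of any client library. How each range is handed to a client's partition filter is client-specific and is not shown here.

```python
from __future__ import annotations

TOTAL_PARTITIONS = 4096  # number of data partitions, as stated in the text


def split_partitions(num_workers: int, total: int = TOTAL_PARTITIONS) -> list[tuple[int, int]]:
    """Return one (begin, count) partition range per worker, covering 0..total-1."""
    if num_workers < 1 or num_workers > total:
        raise ValueError("num_workers must be between 1 and total")
    base, extra = divmod(total, num_workers)
    ranges = []
    begin = 0
    for worker in range(num_workers):
        # The first `extra` workers take one additional partition each,
        # so the ranges differ in size by at most one partition.
        count = base + (1 if worker < extra else 0)
        ranges.append((begin, count))
        begin += count
    return ranges


if __name__ == "__main__":
    for worker, (begin, count) in enumerate(split_partitions(3)):
        print(f"worker {worker}: partitions {begin}..{begin + count - 1}")
```

Each worker would then run the same query restricted to its own range, so no partition is processed twice.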

Background queries

A client application can issue an asynchronous background query to modify records in place on the server. This is similar to an UPDATE statement in a relational database.

Background queries apply either multiple native bin operations or a user-defined function (UDF) written in Lua to the records matched by the query. Using bin operations, also known as background ops, is more efficient and higher performing than using a Lua UDF, also known as a background UDF.

Characteristics

Selection of records:
- Background queries, regardless of the index involved, can use filter expressions such as last_update() or set_name(). The filter acts as a WHERE clause.
- An optional secondary index filter can be applied by the query.
- Otherwise, the query targets the primary index. A PI query against a specified set name automatically leverages a set index, if one was created on this set.
Clients can poll for progress and completion of a background query.
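
Polling for completion can be sketched as a simple loop with a timeout. This is a generic illustration, not a client API: `is_done` is a hypothetical callable standing in for whatever status check the client library provides for a background query.

```python
import time
from typing import Callable


def wait_until_done(
    is_done: Callable[[], bool],
    poll_interval_s: float = 1.0,
    timeout_s: float = 60.0,
) -> bool:
    """Poll is_done() until it returns True or timeout_s elapses.

    Returns True if the job completed, False if the timeout was reached.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_done():
            return True
        if time.monotonic() >= deadline:
            return False
        # Wait before asking again, to avoid flooding the server with status requests.
        time.sleep(poll_interval_s)
```

A caller would typically choose a poll interval that is small relative to the expected runtime of the background query.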

Background queries are not migration-tolerant and might miss records when a partition is migrating during data rebalancing.

Limiting query speed

Each individual query can be capped to run at a specified records-per-second (RPS) limit.

Site reliability engineers (SREs) can also limit the total of a user’s commands, including queries, to a records-per-second rate quota.
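
The effect of an RPS cap on runtime is simple arithmetic, sketched below. The function names are hypothetical. The sketch assumes `effective_rps` is the overall rate at which the query returns records; whether a configured cap applies per node or to the query as a whole is not stated here, so adjust the rate accordingly.

```python
import math


def min_runtime_seconds(record_count: int, effective_rps: int) -> float:
    """Lower bound on runtime when a query is capped at effective_rps records per second."""
    if effective_rps <= 0:
        raise ValueError("effective_rps must be positive")
    return record_count / effective_rps


def rps_needed_for_deadline(record_count: int, deadline_s: float) -> int:
    """Smallest whole-number rate that can return record_count records within deadline_s."""
    if deadline_s <= 0:
        raise ValueError("deadline_s must be positive")
    return math.ceil(record_count / deadline_s)


if __name__ == "__main__":
    # 10 million records at 50,000 records per second take at least 200 seconds.
    print(min_runtime_seconds(10_000_000, 50_000))
```

Because the cap is a ceiling on speed, the first function gives a minimum runtime; the actual runtime can be longer.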

Query runtime optimization

A query’s runtime can be optimized by specifying its expected duration.

A query that is expected to run for a relatively long period of time is termed a long query. Long is the default expected duration. The runtime of a query depends on the complexity of its filter expression, the size of the dataset, indexing strategies, and the IOPS (input/output operations per second) capacity of the cluster nodes. Long queries include primary index queries running against a large dataset, or secondary index queries that return a large number of records.
A query that is expected to consistently run for a short duration and return a small number of records is termed a short query. Explicitly designating a query’s expected duration as short allows Aerospike to optimize it for lower latency and a higher queries-per-second (QPS) throughput.
The expected duration optimization works with queries running against both strong consistency (SC) and available and partition-tolerant (AP) namespaces.

Differences between short and long queries

The following table shows how queries are optimized to execute differently based on their expected duration.

| Short query | Long query |
| --- | --- |
| Short queries have a default 1s timeout (on the socket) | Long queries do not time out once they’re running |
| Short query execution cannot be throttled | Long query execution can be throttled by setting an RPS (records per second) cap |
| Short queries cannot be aborted | Long queries can be aborted |
| Short queries are not tracked | Active long queries are tracked and statistics on the most recent completed queries are kept in a queue (of size `query-max-done`) |
| Short queries are measured by query latency and record count histograms | Long queries do not provide latency and record count histograms |
| A short query runs in a single query thread, enabling a higher number of QPS | A long query runs in the number of query threads defined by `single-query-threads` |
| Short queries can be inlined to run in service threads | Long queries only run in query threads |

Long queries

In a long query there are typically many records to read in each data partition. The client begins by reserving and querying full partitions, and defers other partitions (for example, ones that are migrating) to a subsequent round, by which time the migration of these partitions to another cluster node is expected to have completed. If a deferred partition still isn’t full when the client tries to reserve it, the client retries several more times (5 by default). Configuring the query policy with an adequate sleep between retries gives further insurance that the query does not run out of retries while migrations are in progress and return error code 11, PARTITION_UNAVAILABLE.
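
The defer-and-retry strategy described above can be modeled in a few lines. This is a simplified simulation under stated assumptions, not the client's implementation: `is_full` and `query_partition` are hypothetical callables supplied by the caller, and `PartitionUnavailable` is an exception defined only for this sketch.

```python
import time

PARTITION_UNAVAILABLE = 11  # error code named in the text above


class PartitionUnavailable(Exception):
    """Raised by this sketch when a partition never becomes full."""
    code = PARTITION_UNAVAILABLE


def run_long_query(partition_ids, is_full, query_partition,
                   max_retries=5, sleep_between_retries_s=0.0):
    """Query full partitions first, deferring the rest to later rounds."""
    pending = list(partition_ids)
    results = []
    for attempt in range(max_retries + 1):  # the first pass plus the retries
        deferred = []
        for pid in pending:
            if is_full(pid):
                results.extend(query_partition(pid))
            else:
                deferred.append(pid)  # for example, still migrating; try again later
        if not deferred:
            return results
        pending = deferred
        if attempt < max_retries:
            # Give migrations time to complete before the next round.
            time.sleep(sleep_between_retries_s)
    raise PartitionUnavailable(f"partitions still not full: {pending}")
```

A longer sleep between rounds gives a migrating partition more time to become full before the retries are exhausted, which is the point of the retry setting in the query policy.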

Long queries against an AP namespace that don’t return many records might fail with an error if they cannot reserve a migrating partition. Because such a query finishes quickly, even deferring that partition to the end of the query sequence might still reach it while it’s migrating. For this reason, Database 7.1.0 moved from a boolean QueryPolicy.shortQuery to a QueryPolicy.expectedDuration with three options: short, long, and “relaxed AP long query”.

Following proper operating procedure, such as waiting for migrations to complete during a rolling upgrade, avoids both query errors and missed records. Under this approach, a relaxed long query against an AP namespace doesn’t miss any records, just like a long query against an SC namespace. Both settings avoid partition unavailable errors during a properly executed rolling upgrade.

Short queries

Short queries don’t run for long enough to take advantage of the strategy governing long queries, where the client defers querying migrating partitions to the end. For this reason, short queries against an AP namespace are always relaxed, allowing the client to reserve a partition that is not full.

Short queries against SC namespaces will fail if they can’t reserve all the (full) partitions.
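
The reservation rules for the three expected durations can be summarized as a small decision function. The enum and function names below are hypothetical and exist only for this sketch; they are not the client API. The sketch also assumes that the relaxed option has no effect on SC namespaces, which is implied by its name but not stated explicitly above.

```python
from enum import Enum


class Mode(Enum):
    AP = "available and partition-tolerant"
    SC = "strong consistency"


class ExpectedDuration(Enum):
    SHORT = "short"
    LONG = "long"
    LONG_RELAX_AP = "relaxed AP long query"


def may_reserve_non_full_partition(mode: Mode, duration: ExpectedDuration) -> bool:
    """Return True if the query is allowed to reserve a partition that is not full."""
    if mode is Mode.SC:
        # Assumption: SC namespaces always require full partitions.
        return False
    # In AP namespaces, short queries are always relaxed, and long queries
    # are relaxed only when that is requested explicitly.
    return duration in (ExpectedDuration.SHORT, ExpectedDuration.LONG_RELAX_AP)


if __name__ == "__main__":
    for mode in Mode:
        for duration in ExpectedDuration:
            print(mode.name, duration.name, may_reserve_non_full_partition(mode, duration))
```

When the function returns False and a partition is not full, the query either defers the partition (long queries) or fails (short queries against SC namespaces).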
