Primary index
Prior to Server 6.0, primary index (PI) queries were called scans and had policies defined through the scan policy. See Queries for more information.
Jump to the Code block for a combined complete example.
Basic PI queries have the following features:
- Filter records by set name.
- Filter records by filter expressions.
- Limit the number of records returned, useful for pagination.
- Return only record digests and metadata (generation and TTL).
- Return specified bins.
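The last two features in this list can be sketched with the Python client as follows. This is a minimal sketch, assuming the client's `Query.select()` method and the `nobins` option of `Query.results()`. The helper names `fetch_metadata_only` and `fetch_selected_bins` are hypothetical, and `client` is assumed to be an already connected client object.

```python
def fetch_metadata_only(client, namespace, set_name, max_records):
    """Return (key, meta) pairs without bin data (hypothetical helper)."""
    query = client.query(namespace, set_name)
    query.max_records = max_records
    # 'nobins' asks for record tuples without bin data.
    # key[3] is the record digest; meta holds 'gen' and 'ttl'.
    records = query.results(options={'nobins': True})
    return [(key, meta) for (key, meta, _bins) in records]


def fetch_selected_bins(client, namespace, set_name, bin_names, max_records):
    """Return only the named bins of each record (hypothetical helper)."""
    query = client.query(namespace, set_name)
    query.max_records = max_records
    # Restrict the bins returned for each record
    query.select(*bin_names)
    return [bins for (_key, _meta, bins) in query.results()]
```

For example, `fetch_selected_bins(client, 'sandbox', 'ufodata', ['occurred'], 20)` would return at most 20 dictionaries, each holding only the `occurred` bin.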
Setup
The following examples will use the setup and record structure below to illustrate primary index queries in an Aerospike database.
    import aerospike
    from aerospike_helpers import expressions as exp
    from aerospike_helpers.operations import operations

    # Define host configuration
    config = {
        'hosts': [ ('127.0.0.1', 3000) ]
    }
    # Establishes a connection to the server
    client = aerospike.client(config)

The record structure:

    Occurred: Integer
    Reported: Integer
    Posted: Integer
    Report: Map
    {
        shape: List,
        summary: String,
        city: String,
        state: String,
        duration: String
    }
    Location: GeoJSON

Policies
See Basic Queries for query policy information.
Query a set
The following example queries the sandbox namespace and ufodata set name, while limiting the record set to only 20 records.
    # Create the query
    query = client.query('sandbox', 'ufodata')

    # Set max records to return
    query.max_records = 20

    # Create callback function
    def record_set(record):
        (key, meta, bins) = record
        # Do something
        print('Key: {0} | Record: {1}'.format(key[2], bins))

    # Execute the query
    query.foreach(record_set)

    # Close the connection to the server
    client.close()

Query with a metadata filter
The following example queries the same namespace and set as the example above, but also adds a metadata filter expression that returns only records whose stored size is greater than 16 KiB.
    # Build the expression
    expr = exp.GT(exp.DeviceSize(), (1024 * 16)).compile()

    # Create the policy
    query_policy = {'expressions': expr}

    # Create the query
    query = client.query('sandbox', 'ufodata')

    # Create callback function
    def record_set(record):
        (key, meta, bins) = record
        # Do something
        print('Key: {0} | Record: {1}'.format(key[2], bins))

    # Execute the query
    query.foreach(record_set, policy=query_policy)

    # Close the connection to the server
    client.close()

Query with a data filter
The following example queries the same namespace and set as the example above, but also adds a data filter expression that
returns only records where the occurred bin value is in the inclusive range 20210101 to 20211231.
Take a look at secondary index queries to see how this same query can be run more efficiently with an index.
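To make the filter's semantics concrete, the check that the expression in this example applies to each record can be written as a plain Python predicate, using the bounds 20210101 and 20211231 from the expression. This is only an illustration: the function name `occurred_in_2021` and the sample records are hypothetical, and in a real query the server evaluates the filter expression, not client code.

```python
def occurred_in_2021(bins):
    """Mirror the filter: 20210101 <= occurred <= 20211231, bounds inclusive."""
    value = bins.get('occurred')
    # In this sketch, a record without an integer 'occurred' bin does not match
    if not isinstance(value, int):
        return False
    return 20210101 <= value <= 20211231


# Hypothetical sample records, shown as bin dictionaries
samples = [
    {'occurred': 20201231},
    {'occurred': 20210101},
    {'occurred': 20211231},
    {'occurred': 20220101},
]
matches = [bins['occurred'] for bins in samples if occurred_in_2021(bins)]
print(matches)  # [20210101, 20211231]
```

Both boundary values match because the expression combines GE and LE rather than GT and LT.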
    # Build the expression
    expr = exp.Let(
        exp.Def('bin', exp.IntBin('occurred')),
        exp.And(
            exp.GE(exp.Var('bin'), 20210101),
            exp.LE(exp.Var('bin'), 20211231)
        )
    ).compile()

    # Create the policy
    query_policy = {'expressions': expr}

    # Create the query
    query = client.query('sandbox', 'ufodata')

    # Create callback function
    def record_set(record):
        (key, meta, bins) = record
        # Do something
        print('Key: {0} | Record: {1}'.format(key[2], bins))

    # Execute the query
    query.foreach(record_set, policy=query_policy)

    # Close the connection to the server
    client.close()

Pagination
Pagination combines a partition filter with a maximum record count to return query results one page at a time. The partition filter maintains a cursor identifying the end of the current page and the beginning of the next. To move to the next page of results, execute the query again with the previously defined partition filter.
Defining a maximum number of records per page to return guarantees that no page will contain more than the maximum number, but some pages may contain fewer than the maximum number. Also, if you run the same paginated query multiple times, the number of results per page may differ, depending on the order in which they are delivered by the nodes in the cluster.
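The page-size guarantee can be illustrated without a server. The sketch below is an in-memory analogy, not the client's or the server's implementation: a Python iterator stands in for the cursor kept by the partition filter, and the function name `paginate` and the sample data are hypothetical.

```python
from itertools import islice


def paginate(records, max_records):
    """Yield pages holding at most max_records items each."""
    # The iterator plays the role of the cursor: it remembers where the
    # previous page ended, so the next islice call resumes from there.
    cursor = iter(records)
    while True:
        page = list(islice(cursor, max_records))
        if not page:
            return
        yield page


# 25 hypothetical record keys, 10 per page
page_sizes = [len(page) for page in paginate(range(25), 10)]
print(page_sizes)  # [10, 10, 5]
```

No page exceeds the limit, and the final page holds only the remainder.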
The following example executes a query with a filter expression identifying records with more than 3 shape items in the report map,
returning at most 10 records per page. The partition filter is set to query all 4096 partitions in the database.
    # Build the expression
    expr = exp.GT(
        exp.ListSize(
            None,
            exp.MapGetByKey(None, aerospike.MAP_RETURN_VALUE, exp.ResultType.LIST, 'shape', exp.MapBin('report'))
        ),
        3
    ).compile()

    # Create the policy
    query_policy = {'expressions': expr}

    # Create the query
    query = client.query('sandbox', 'ufodata')

    # Set max records
    query.max_records = 10

    # Set pagination
    query.paginate()

    # Execute the query
    page = 0
    while not query.is_done():
        count = 0
        records = query.results(query_policy)
        for record in records:
            (key, meta, bins) = record
            # Do something
            print('Key: {0} | Record: {1}'.format(key[2], bins))
            count += 1
        page += 1
        print('Page {0} | {1} records'.format(page, count))

    # Close the connection to the server
    client.close()

Code block
The following single code block executes a basic PI query with a data filter.
    import aerospike
    from aerospike_helpers import expressions as exp
    from aerospike_helpers.operations import operations

    # Define host configuration
    config = {
        'hosts': [ ('127.0.0.1', 3000) ]
    }
    # Establishes a connection to the server
    client = aerospike.client(config)

    # Build the expression
    expr = exp.Let(
        exp.Def('bin', exp.IntBin('occurred')),
        exp.And(
            exp.GE(exp.Var('bin'), 20210101),
            exp.LE(exp.Var('bin'), 20211231)
        )
    ).compile()

    # Create the policy
    query_policy = {'expressions': expr}

    # Create the query
    query = client.query('sandbox', 'ufodata')

    # Create callback function
    def record_set(record):
        (key, meta, bins) = record
        # Do something
        print('Key: {0} | Record: {1}'.format(key[2], bins))

    # Execute the query
    query.foreach(record_set, policy=query_policy)

    # Close the connection to the server
    client.close()