Aerospike Queries in Python
An introduction to Aerospike queries in Python.
This notebook requires an Aerospike database running on localhost, and that Python and the Aerospike Python client have been installed (pip install aerospike).
Visit the Aerospike notebooks repo for additional details and the Docker container.
Ensure database is running
This notebook requires that the Aerospike database is running. The first command starts the server process (asd); the second checks that it is up.
!asd > /dev/null 2>&1
!pgrep -x asd >/dev/null && echo "Aerospike database is running!" || echo "**Aerospike database is not running!**"
Output:
Aerospike database is running!
Connect to database and populate test data
The test data has ten records with user keys "id1" through "id10" in the namespace "test" and set "demo". Each record stores the bins (fields) "name" and "age", plus the "id" value it was built from.
# Import the modules
from __future__ import print_function
import sys
import aerospike
# Configure the client
config = {
    'hosts': [('127.0.0.1', 3000)],
    'policy': {'key': aerospike.POLICY_KEY_SEND}
}
# Create a client and connect it to the cluster
try:
    client = aerospike.client(config).connect()
except Exception:
    print("failed to connect to the cluster with", config['hosts'])
    sys.exit(1)
# Records are addressable via a tuple of (namespace, set, key)
people = [{'id': 1, 'name': 'John Doe', 'age': 53},
          {'id': 2, 'name': 'Brian Yu', 'age': 21},
          {'id': 3, 'name': 'Will Kim', 'age': 34},
          {'id': 4, 'name': 'Dorothy Smith', 'age': 48},
          {'id': 5, 'name': 'Sara Poe', 'age': 29},
          {'id': 6, 'name': 'Kim Knott', 'age': 56},
          {'id': 7, 'name': 'Joe Miller', 'age': 30},
          {'id': 8, 'name': 'Jeff Nye', 'age': 32},
          {'id': 9, 'name': 'Jane Doe', 'age': 44},
          {'id': 10, 'name': 'Emily Tuck', 'age': 22}]
try:
    for person in people:
        # Write each record under the user key "id<N>"
        client.put(('test', 'demo', 'id' + str(person['id'])), person)
except Exception as e:
    print("error: {0}".format(e), file=sys.stderr)
else:
    # Only report success when every write went through
    print('Test data populated.')
Output:
Test data populated.
Create secondary index
To use the query API, a secondary index must exist on the query field. We will create an integer secondary index on the "age" bin.
# Must create an index to query on a bin
from aerospike import exception as ex
try:
    client.index_integer_create("test", "demo", "age", "test_demo_number_idx")
except ex.IndexFoundError:
    # The index already exists, so there is nothing to do
    pass
print('Secondary index created.')
Output:
Secondary index created.
Querying with secondary indexes
In addition to retrieving records with the primary index using the key-value store APIs, the Aerospike Python client provides a Query API to retrieve records using secondary indexes. A secondary index must exist on the bin being queried.
Create a query
client.query() takes a namespace (required) and a set (optional). The set argument can be omitted or None, in which case records in the namespace that are outside any set are returned. The return value is a new aerospike.Query instance.
This example creates a query on the test namespace, demo set.
query = client.query('test', 'demo')
print('Query object created.')
Output:
Query object created.
Project bins
Project (or select) bins using select() on the Query class instance. The select() API accepts one or many bin names (strings).
This example selects "name" and "age" bins from the specified records.
query.select('name', 'age')
print('Bins name and age selected.')
Output:
Bins name and age selected.
Add query predicate
Define predicates using the where() API on the Query class instance. The where() API accepts a predicate created using one of the functions in aerospike.predicates including:
- equals(bin, value) — Find records containing the bin with the specified value (integer or string).
- between(bin, min, max) — Find records containing the bin with a value in the min and max range (integer only).
This example adds the between() predicate to a query.
from aerospike import predicates as p
query.where( p.between('age', 14, 25) )
print('Predicate defined.')
Output:
Predicate defined.
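The equals() predicate is used in the same way as between(). The following is a minimal sketch, not part of the original notebook: find_by_age is a hypothetical helper name, and it assumes a connected client plus the integer index on "age" created earlier. It reads the matches with Query.results(), which returns the records as a list instead of invoking a callback.

```python
def find_by_age(client, age, namespace='test', set_name='demo'):
    """Return (name, age) pairs for records whose 'age' bin equals `age`.

    Hypothetical helper for illustration; `client` is a connected
    aerospike client and 'age' must have a secondary index.
    """
    # Imported here so the helper can be defined even where the client
    # library is not installed; it is only needed when the query runs.
    from aerospike import predicates as p

    query = client.query(namespace, set_name)
    query.select('name', 'age')
    # equals() matches a single value instead of a range
    query.where(p.equals('age', age))
    # results() runs the query and returns (key, metadata, bins) tuples
    return [(bins['name'], bins['age']) for _, _, bins in query.results()]
```

With the test data populated above, find_by_age(client, 30) would be expected to return only the record for Joe Miller.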
Define foreach function
To execute the query and read the results, we use the foreach() API on the Query class instance. The foreach() API accepts a callback function that is called for each record read from the query. The callback function must accept a single argument, a tuple of three elements:
- key tuple — The tuple to identify the record.
- metadata — The dict containing the record metadata (TTL and generation).
- record — The dict containing the record bins.
If the callback returns False, the client stops reading results.
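The three elements can be unpacked directly in the callback. The sketch below uses a hand-written tuple in the same shape so that it runs without a database; the digest is a placeholder, the other values are illustrative, and the TTL is assumed to be in seconds.

```python
from datetime import timedelta

# Illustrative tuple in the shape foreach() passes to the callback
result_tuple = (
    ('test', 'demo', None, bytearray(b'\x00' * 20)),  # key tuple (placeholder digest)
    {'ttl': 2591998, 'gen': 1},                       # metadata
    {'name': 'Brian Yu', 'age': 21},                  # record bins
)

key, metadata, bins = result_tuple
# The key tuple holds the namespace, set, user key, and record digest
namespace, set_name, user_key, digest = key

print(namespace, set_name, bins['name'], bins['age'])
print('generation:', metadata['gen'])
# Assumes the TTL is expressed in seconds
print('expires in:', timedelta(seconds=metadata['ttl']))
```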
To print the records as they are read, we first define a print_result function to use as the callback.
def print_result(result_tuple):
    print(result_tuple)
print('Foreach function defined.')
Output:
Foreach function defined.
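A callback can also collect records and stop the query early by returning False. This is a minimal sketch: make_collector is a hypothetical helper name, not part of the Aerospike client, and the callback it builds would be passed to query.foreach() in place of print_result.

```python
def make_collector(limit):
    """Build a foreach() callback that keeps at most `limit` records."""
    collected = []

    def callback(result_tuple):
        key, metadata, bins = result_tuple
        collected.append(bins)
        if len(collected) >= limit:
            # Returning False tells the client to stop reading results
            return False

    return callback, collected


callback, collected = make_collector(limit=1)
# With a live query this would be run as: query.foreach(callback)
```

Keeping the list in a closure avoids a module-level global, so several collectors with different limits can be used side by side.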
Execute query and foreach
Now we are ready to execute the query by passing in print_result, which is called for each record. Based on the data we populated earlier, we expect 2 results with ages between 14 and 25.
print("Executing query and printing results:")
query.foreach(print_result)
Output:
Executing query and printing results:
(('test', 'demo', None, bytearray(b'\xb2\x13X\x1dI\xd8\xba`\xab\x96\xa2\xf0\xd9\x8b\x19\xf9DZug')), {'ttl': 2591998, 'gen': 1}, {'name': 'Brian Yu', 'age': 21})
(('test', 'demo', None, bytearray(b'\x0bR\xbc\xa1\x02`SF?\x01\xe7\xd3`\x8d[F\xcb\xd71V')), {'ttl': 2591998, 'gen': 1}, {'name': 'Emily Tuck', 'age': 22})
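The expected count can be cross-checked without the database by applying the same range to the source data. This is a sketch in plain Python: ages_between is a hypothetical helper, the list repeats the name and age values from the test data, and the range is assumed to include both end points.

```python
# Name and age values copied from the test data populated earlier
people = [
    {'name': 'John Doe', 'age': 53}, {'name': 'Brian Yu', 'age': 21},
    {'name': 'Will Kim', 'age': 34}, {'name': 'Dorothy Smith', 'age': 48},
    {'name': 'Sara Poe', 'age': 29}, {'name': 'Kim Knott', 'age': 56},
    {'name': 'Joe Miller', 'age': 30}, {'name': 'Jeff Nye', 'age': 32},
    {'name': 'Jane Doe', 'age': 44}, {'name': 'Emily Tuck', 'age': 22},
]


def ages_between(records, low, high):
    """Return the names of records whose age is within [low, high]."""
    return [r['name'] for r in records if low <= r['age'] <= high]


matches = ages_between(people, 14, 25)
print(matches)  # ['Brian Yu', 'Emily Tuck']
```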
Explore other query capabilities
Please feel free to play with the "equals" predicate, adding secondary indexes on other fields, populating more test data to the "null" set and querying those records, and so on.
Clean up
# Close the connection to the Aerospike cluster
client.close()
print('Connection closed.')
Output:
Connection closed.
Next steps
Visit the Aerospike notebooks repo to run additional Aerospike notebooks. To run a different notebook, download it from the repo to your local machine, then click File > Open and select Upload.