Querying Records
Prior to Database 6.0, primary index (PI) queries were called scans and secondary index (SI) queries were called queries.
Query All Records in a Set
The following is a primary index (PI) query that queries all records from a specific namespace and set:
SELECT * FROM <ns>[.<set>]
Where:
<ns>
is the namespace.<set>
is the set name.
Example:
aql> SELECT * FROM users.profiles
+---------------------------+-----+--------+
| name | age | gender |
+---------------------------+-----+--------+
| "Bob White" | 22 | "M" |
| "Annie Black" | 28 | "F" |
| "Sally Green" | 19 | "F" |
| "Ricky Brown" | 20 | "M" |
| "Tammy Argent" | 22 | "F" |
+---------------------------+-----+--------+
5 rows in set (0.000 secs)
Starting with Tools 3.8.2, SELECT *
prints the primary key of a record in a column 'PK' if it was sent to the server when the record was written.
Project-Specific Bins
The following is the syntax for a PI query to project bins from all records in a specific namespace and set.
SELECT <bin>[, <bin>[, ...]] FROM <ns>[.<set>]
Where
<ns>
is the namespace.<set>
is the set name.<bin>
are one or more bins to project from the records.
Example:
aql> SELECT name, age FROM users.profiles
+---------------------------+-----+
| name | age |
+---------------------------+-----+
| "Bob White" | 22 |
| "Annie Black" | 28 |
| "Sally Green" | 19 |
| "Ricky Brown" | 20 |
| "Tammy Argent" | 22 |
+---------------------------+-----+
5 rows in set (0.000 secs)
Filter on Indexed Bins
Starting with Tools 8.3.0, AQL's select command accepts a double where clause with two predicates. Starting with Tools 8.3.0, AQL's select command can limit the number of records displayed using the 'limit' specifier. Starting with Tools 3.8.2, for GeoJSON, both equality and range predicates are supported by using 'CONTAINS' and 'WITHIN'.
This is the syntax for a filtering all records in a specific namespace and set based on a predicate:
SELECT <bin>[, <bin>[, ...]] FROM <ns>[.<set>] [IN indextype] [WHERE <predicate>] [AND <predicate2>] [LIMIT <limit>]
To reduce the number of records returned use the LIMIT <limit>
specifier. This can be useful for instances where you would like to sample an entire namespace or set.
Example:
SELECT * from users.profiles LIMIT 100
The index type specifier IN indextype
is required in case the secondary index is on LIST, MAPKEYS, or MAPVALUES. If not specified, defaults to index on basic index on the bin itself.
The <predicate>
must be one of the supported predicates for the secondary index. For NUMERIC indexes, either a range predicate or equality predicate can be applied. For STRING indexes, only equality predicates are supported. When using a double where clause both predicates, <predicate>
and <predicate2>
, must be equality operations. Furthermore, one of the bins in either predicate must have a sindex defined.
Assuming age
has a NUMERIC secondary index defined, a valid SELECT
command would be:
SELECT name, age, city FROM users.profiles WHERE age 20 and city = "San Francisco"
A predicate can be in the following forms depending on the datatype:
# For Numeric and String, this states that the bin `<bin>` must be equal to `<value>`.
<bin> = <value>
# For Numeric, this states that the bin `<bin>` must fall with then range between `<lower>` and `<upper>` (inclusive).
<bin> BETWEEN <lower> AND <upper>
# For GeoJSON, this returns value containing specific location points.
<bin> CONTAINS <GeoJSONPoint>
# For GeoJSON, this returns values within a specified range of points.
<bin> WITHIN <GeoJSONPolygon>
Example of a GeoJSON range SI query:
SELECT * FROM test.demo WHERE gj CONTAINS CAST('{"type": "Point", "coordinates": [0.0, 0.0]}' AS GEOJSON)
An equality predicate is in the form which states that the bin <bin>
must equal <value>
.
<bin> = <value>
Or, if attempting to get a record for a specific primary key:
PK = <key>
In the next example, we assume we have a secondary index on the age bin, and want to run an SI query to identify all people between the age of 20 and 29:
aql> SELECT name, age FROM users.profiles WHERE age BETWEEN 20 AND 29
+---------------------------+-----+
| name | age |
+---------------------------+-----+
| "Bob White" | 22 |
| "Annie Black" | 28 |
| "Ricky Brown" | 20 |
| "Tammy Argent" | 22 |
+---------------------------+-----+
4 rows in set (0.000 secs)
This SI query assumes a secondary index was created using:
aql> CREATE INDEX user_age_idx ON users.profiles (age) NUMERIC
Aggregating on SI Query or PI query Results
This command performs an aggregation on SI query results:
AGGREGATE <module>.<function>([<arg>[,...]]) ON <ns>[.<set>] [IN indextype] WHERE <predicate>
Remove WHERE <predicate>
to run aggregation on Primary Index (PI) query results:
AGGREGATE <module>.<function>([<arg>[,...]]) ON <ns>[.<set>]
This specifies a Stream UDF to execute against the results of an SI query.
Where
<module>
is the UDF module name.<function>
is the Stream UDF function name.<arg>
are the arguments for the UDF function. Multiple arguments can be specified.<predicate>
is the same predicate used for querying.
In the following example, the SI query has an aggregation stream applied to it called "avg_age". "avg_age" has the appropriate sub-functions required to define the different stages of the stream (function "female", "name_age", and "eldest"). This set of functions are included in the UDF module named "profile_aggregator":
aql> AGGREGATE profile_aggregator.avg_age() ON users.profiles WHERE age BETWEEN 20 and 29
+--------------------------------------+
+ avg_age |
+--------------------------------------+
+ { "name": "Annie Black", "age": 28 } |
+--------------------------------------+
Assuming the UDF module named "profile_aggregator" was registered containing an "avg_age" function like:
function avg_age(stream)
local function female(rec)
return rec.gender == "F"
end
local function name_age(rec)
return map{ name=rec.name, age=rec.age }
end
local function eldest(p1, p2)
if p1.age > p2.age then
return p1
else
return p2
end
end
return stream : filter(female) : map(name_age) : reduce(eldest)
end
The stream operations defined:
- Filters female users.
- Converts each profile record into a map containing only a name and age.
- Reduces each profile to eldest female.