Aerospike Interactive Tutorial: Advanced Collection Data Types
Last updated: June 22, 2021
The goal of this tutorial is to highlight the power of working with
collection data types (CDTs) in Aerospike. It covers the following topics:
Setting contexts (CTXs) to apply operations to nested Maps and Lists.
Showing the return type options provided by CDT get/read commands.
Highlighting how policies shape commands sent by the application.
This Jupyter Notebook requires an Aerospike Database running locally, a
Java kernel, and the Aerospike Java Client. To create a Docker container
that satisfies the requirements and holds a copy of these notebooks,
visit the Aerospike Notebooks Repo.
Prerequisites
This notebook builds on earlier notebooks and uses examples based on
those from Modeling Using Lists and Working with Maps. If any of the
following is confusing, please refer to the relevant notebook.
Ask Maven to download and install the project object model (POM) of the
Aerospike Java Client.
%%loadFromPOM
<dependencies>
  <dependency>
    <groupId>com.aerospike</groupId>
    <artifactId>aerospike-client</artifactId>
    <version>5.0.0</version>
  </dependency>
</dependencies>
Start the Aerospike Java Client and Connect
Create an instance of the Aerospike Java Client, and connect to the demo
cluster.
The default cluster location for the Docker container is localhost
port 3000. If your cluster is not running on your local machine,
modify localhost and 3000 to the values for your Aerospike cluster.
The primary use case of Key-Value Stores, like Aerospike Database, is to
store document-oriented data, like a JSON map. As document-oriented data
grows organically, it is common for one CDT (list or map) to contain
another CDT. Does the application need a list in a map in a list in a
map? Aerospike fully supports nesting CDTs, so that’s no problem.
What is a Context?
A Context (CTX) is a reference to a nested CDT, a List or Map that is
stored in a List or Map somewhere in an Aerospike Bin. All List and Map
Operations accept an optional CTX argument. Any CTX argument must refer
to data of the type supported by the operation.
The most common ways to access a CTX are to look up a Map CTX directly
by its key within the Bin and to drill down within a List or Map by
index, rank or value. A CTX can also be created within a List or Map.
For more details, see the CTX APIs.
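To make the idea of a context concrete, the following is a minimal plain-Java sketch that treats a context as a path of steps, each descending by map key or by list index. It is a model of the concept only and does not use the Aerospike client; the names ContextSketch, Step, byMapKey, byListIndex, and resolve are hypothetical and chosen for illustration.

```java
import java.util.List;
import java.util.Map;

// Plain-Java model of a context path. This is NOT the Aerospike client API.
class ContextSketch {
    // One step of the path: descend by map key or by list index (hypothetical type).
    static final class Step {
        final Object mapKey;      // non-null when the step descends into a Map
        final Integer listIndex;  // non-null when the step descends into a List

        private Step(Object mapKey, Integer listIndex) {
            this.mapKey = mapKey;
            this.listIndex = listIndex;
        }

        static Step byMapKey(Object key) { return new Step(key, null); }
        static Step byListIndex(int index) { return new Step(null, index); }
    }

    // Walk the steps from the top-level value of a bin down to a nested element.
    static Object resolve(Object root, Step... path) {
        Object current = root;
        for (Step step : path) {
            if (step.listIndex != null) {
                List<?> list = (List<?>) current;
                int i = step.listIndex;
                // Negative indexes count from the end, so -1 is the last element.
                current = list.get(i < 0 ? list.size() + i : i);
            } else {
                current = ((Map<?, ?>) current).get(step.mapKey);
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // Observation map shaped like the one used in this notebook.
        Map<Integer, Map<String, Integer>> observations = Map.of(
            13456, Map.of("lat", -25, "long", -50),
            14567, Map.of("lat", 35, "long", 30));
        // One step by map key reaches the nested {lat, long} map.
        System.out.println(resolve(observations, Step.byMapKey(13456)));
    }
}
```

In this model, an operation that takes a context would first resolve the path and then act on the nested List or Map it reaches, which is why the context must point at data of the type the operation supports.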
Look up a Map CTX in a Bin by Mapkey
Use the mapKey method to look up a CTX in a Map directly by mapkey.
This works for a Map anywhere in a Bin.
The following is an example of finding a Map CTX in a Bin by Mapkey:
System.out.println("Before, the whale migration list was: "+theRecord.getValue(listWhaleBinName)+"\n");
System.out.println("After the addition, it is:"+postCreate.getValue(listWhaleBinName)+"\n\n");
System.out.println("Before, the observation map was: "+theRecord.getValue(mapObsBinName)+"\n");
System.out.println("After the addition, it is: "+postCreate.getValue(mapObsBinName));
Output:
Before, the whale migration list was: [[1420, beluga whale, Beaufort Sea, Bering Sea], [13988, gray whale, Baja California, Chukchi Sea], [1278, north pacific right whale, Japan, Sea of Okhotsk], [5100, humpback whale, Columbia, Antarctic Peninsula], [3100, southern hemisphere blue whale, Corcovado Gulf, The Galapagos]]
After the addition, it is:[[1420, beluga whale, Beaufort Sea, Bering Sea], [13988, gray whale, Baja California, Chukchi Sea], [1278, north pacific right whale, Japan, Sea of Okhotsk], [5100, humpback whale, Columbia, Antarctic Peninsula], [3100, southern hemisphere blue whale, Corcovado Gulf, The Galapagos], [1449, sei whale, Greenland, Gulf of Maine]]
After the addition, it is: {12345={lat=-85, long=-130}, 13456={lat=-25, long=-50}, 14567={lat=35, long=30}, 15678={lat=95, long=110}}
Choosing the Return Type Options for CDTs
Operations on CDTs can return different types of data, depending on the
return type value specified. A return type can be combined with the
INVERTED flag to return all data from the CDT that was not selected by
the operation. The following are the Return Types for Lists and Maps.
Standard Return Type Options for CDTs
Aerospike Lists and Maps both provide the following return type options.
COUNT: Return count of items selected.
INDEX: Return index offset order.
NONE: Do not return a result.
RANK: Return value order. If the list/map is not ordered, Aerospike
will sort it just in time (JIT).
REVERSE_INDEX: Return reverse index offset order.
REVERSE_RANK: Return value order from a version of the list/map sorted
from maximum to minimum value. If the list/map is not ordered,
Aerospike will JIT-sort it.
VALUE: Return value for single item read and list of values from a
range read.
All indexes are 0-based, with the last element accessible by index -1.
The following is an example demonstrating each possible return type from
the same operation.
The current whale migration list is: [[1420, beluga whale, Beaufort Sea, Bering Sea], [13988, gray whale, Baja California, Chukchi Sea], [1278, north pacific right whale, Japan, Sea of Okhotsk], [5100, humpback whale, Columbia, Antarctic Peninsula], [3100, southern hemisphere blue whale, Corcovado Gulf, The Galapagos], [1449, sei whale, Greenland, Gulf of Maine]]
For the whales who migrate between 1400 and 3500 miles...
Return COUNT: 3
Return INDEX: [0, 4, 5]
Return NONE: has no return value.
Return RANK: [1, 2, 3]
Return REVERSE_INDEX: [5, 1, 0]
Return REVERSE_RANK: [2, 3, 4]
Return Values: [[1420, beluga whale, Beaufort Sea, Bering Sea], [3100, southern hemisphere blue whale, Corcovado Gulf, The Galapagos], [1449, sei whale, Greenland, Gulf of Maine]]
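The index and rank figures in the output above can be reproduced with a small plain-Java model. The sketch below works on the migration distances only and does not call Aerospike; it assumes the range is inclusive at the lower bound and exclusive at the upper bound, assumes distinct values, and returns each result sorted ascending, so element order may differ from what the server returns. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Plain-Java model of list return types. This is NOT the Aerospike client API.
class ReturnTypeSketch {
    // INDEX: positions of the values v with begin <= v < end (assumed bounds).
    static List<Integer> index(List<Integer> values, int begin, int end) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            int v = values.get(i);
            if (v >= begin && v < end) out.add(i);
        }
        return out;
    }

    // RANK: for each selected value, the number of strictly smaller elements.
    static List<Integer> rank(List<Integer> values, int begin, int end) {
        List<Integer> out = new ArrayList<>();
        for (int i : index(values, begin, end)) {
            int smaller = 0;
            for (int other : values) {
                if (other < values.get(i)) smaller++;
            }
            out.add(smaller);
        }
        Collections.sort(out);
        return out;
    }

    // REVERSE_INDEX and REVERSE_RANK count from the other end: size - 1 - position.
    static List<Integer> reverse(List<Integer> positions, int size) {
        List<Integer> out = new ArrayList<>();
        for (int p : positions) out.add(size - 1 - p);
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        // Migration distances from the whale list, in list order.
        List<Integer> miles = List.of(1420, 13988, 1278, 5100, 3100, 1449);
        List<Integer> idx = index(miles, 1400, 3500);
        List<Integer> rnk = rank(miles, 1400, 3500);
        System.out.println("COUNT: " + idx.size());
        System.out.println("INDEX: " + idx);
        System.out.println("RANK: " + rnk);
        System.out.println("REVERSE_INDEX: " + reverse(idx, miles.size()));
        System.out.println("REVERSE_RANK: " + reverse(rnk, miles.size()));
    }
}
```

The sketch shows that index describes position in storage order while rank describes position in value order, which is why the two results differ for the same three whales.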
Additional Return Type Options for Maps
Because Maps have a replicable key/value structure, Aerospike provides
options to return mapkeys or key/value pairs, in addition to value.
KEY: Return key for single key read and key list for range read.
KEY_VALUE: Return key/value pairs for items.
The following is an example demonstrating returning a key or key/value
pair.
The current whale observations map is: {12345={lat=-85, long=-130}, 13456={lat=-25, long=-50}, 14567={lat=35, long=30}, 15678={lat=95, long=110}}
For the most recent observation...
Return the key: 15678
Return key/value pair: [15678={lat=95, long=110}]
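As a rough analogy for the KEY and KEY_VALUE results above, a key-sorted java.util.TreeMap can stand in for the key-ordered observation map. This is a plain-Java sketch rather than an Aerospike call; it assumes "most recent" means the highest key, and the class and helper names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java analogy for map return types. This is NOT the Aerospike client API.
class MapReturnSketch {
    // Build one {lat, long} coordinate map with sorted keys.
    static Map<String, Integer> coord(int lat, int lon) {
        Map<String, Integer> m = new TreeMap<>();
        m.put("lat", lat);
        m.put("long", lon);
        return m;
    }

    // The observation map from this notebook, kept in key order.
    static TreeMap<Integer, Map<String, Integer>> observations() {
        TreeMap<Integer, Map<String, Integer>> obs = new TreeMap<>();
        obs.put(12345, coord(-85, -130));
        obs.put(13456, coord(-25, -50));
        obs.put(14567, coord(35, 30));
        obs.put(15678, coord(95, 110));
        return obs;
    }

    public static void main(String[] args) {
        TreeMap<Integer, Map<String, Integer>> obs = observations();
        // KEY analogue: only the key of the last entry in key order.
        System.out.println("Key: " + obs.lastKey());
        // KEY_VALUE analogue: the key together with its value.
        System.out.println("Key/value pair: " + obs.lastEntry());
    }
}
```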
Invert the Operation Results for CDT Operations
Aerospike also provides the INVERTED flag for CDT operations. When
INVERTED is combined with the return type using a bitwise OR (|), the
flag instructs a List or Map operation to return the return type data
for the elements that were not selected by the operation. In effect,
the operation acts as though a logical NOT were applied to its
selection.
The following is an example demonstrating inverted return values.
The current whale migration list is: [[1420, beluga whale, Beaufort Sea, Bering Sea], [13988, gray whale, Baja California, Chukchi Sea], [1278, north pacific right whale, Japan, Sea of Okhotsk], [5100, humpback whale, Columbia, Antarctic Peninsula], [3100, southern hemisphere blue whale, Corcovado Gulf, The Galapagos], [1449, sei whale, Greenland, Gulf of Maine]]
For the whales who migrate between 1400 and 3500 miles...
Return INVERTED COUNT: 3
Return INVERTED INDEX: [1, 2, 3]
Return INVERTED NONE: has no return value.
Return INVERTED RANK: [0, 4, 5]
Return INVERTED REVERSE_INDEX: [4, 3, 2]
Return INVERTED REVERSE_RANK: [5, 0, 1]
Return INVERTED Values: [[13988, gray whale, Baja California, Chukchi Sea], [1278, north pacific right whale, Japan, Sea of Okhotsk], [5100, humpback whale, Columbia, Antarctic Peninsula]]
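The inverted results above can be modeled in plain Java by treating INVERTED as a single bit that is OR-ed into the return type and then flipping the selection test. The sketch below does not use the Aerospike client; the constant values are illustrative and are not claimed to match the client library, and only COUNT and INDEX are modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the INVERTED flag. This is NOT the Aerospike client API.
class InvertedSketch {
    // Illustrative constants for this sketch only.
    static final int INDEX = 1;
    static final int COUNT = 5;
    static final int INVERTED = 0x10000;

    // Select values v with begin <= v < end, or everything else when INVERTED is set.
    static Object getByValueRange(List<Integer> values, int begin, int end, int returnType) {
        boolean inverted = (returnType & INVERTED) != 0;
        int baseType = returnType & ~INVERTED; // strip the flag to recover the base type
        List<Integer> indexes = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            int v = values.get(i);
            boolean inRange = v >= begin && v < end;
            // Keep in-range items normally, out-of-range items when inverted.
            if (inRange != inverted) indexes.add(i);
        }
        if (baseType == COUNT) return indexes.size();
        if (baseType == INDEX) return indexes;
        throw new IllegalArgumentException("Return type not modeled: " + baseType);
    }

    public static void main(String[] args) {
        List<Integer> miles = List.of(1420, 13988, 1278, 5100, 3100, 1449);
        System.out.println("INDEX: " + getByValueRange(miles, 1400, 3500, INDEX));
        System.out.println("INVERTED INDEX: "
            + getByValueRange(miles, 1400, 3500, INDEX | INVERTED));
        System.out.println("INVERTED COUNT: "
            + getByValueRange(miles, 1400, 3500, COUNT | INVERTED));
    }
}
```

Because the flag occupies its own bit, it composes with any base return type without changing the operation's other arguments.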
Highlighting how policies shape commands sent by the application
Each CDT write/put operation accepts a write policy, which can
optionally:
Just-in-time sort the data being operated on.
Apply flags that control how Aerospike handles the write.
Create and set a MapPolicy or ListPolicy with the proper sort and write
flags to change how Aerospike processes a command.
MapOrder and ListOrder, Just-in-time Sorting for an Operation
By default, Maps and Lists are stored unordered. There are explicit
techniques to store a list or map in order. The Map data in this
notebook is key sorted. Please refer to the code snippet creating the
map data (above) for an example of this. There are examples of ordering
lists in the notebook Modeling Using Lists.
Applying a MapOrder or ListOrder affects operation performance, which
can itself be a reason to apply one when working with data. To
understand the relative worst-case time complexity of Aerospike
operations, see the operation documentation for lists and for maps.
Whether to allow duplicates in a list is a function of ListOrder.
Note: Aerospike finds that worst-case performance can be helpful in
determining how to prioritize application use cases against one
another, but it does not set realistic performance expectations for
Aerospike Database. Worst-case figures help with tough questions such
as, “The worst-case time complexity for operation A is X. Is operation
A important enough to run daily, or just monthly, in light of the other
workloads that are more time sensitive?”
Write Flags
The following are lists of write flags for Lists and Maps. Beneath each
are example commands.
A powerful use case for Aerospike is to group operations together into
single-record commands using the Operate method. This
technique is used above in this notebook. When applying commands to
data, there are common circumstances where:
All possible operations should be executed in a fault-tolerant manner.
A specific operation failure should cause all operations to fail.
Write flags can be used in any combination, as appropriate to the
application and Aerospike operation being applied.
Write Flags for all CDTs
DEFAULT
For Lists, allow duplicate values and insertions at any index.
For Maps, allow map create or updates.
NO_FAIL: Do not raise an error if a CDT item is denied due to
write flag constraints.
PARTIAL: Allow other valid CDT items to be committed if a CDT item
is denied due to write flag constraints.
These flags provide fault tolerance to commands. Apply some combination
of the above three flags (DEFAULT, NO_FAIL, and PARTIAL) to operations
by combining them with a bitwise OR (|), as demonstrated below. All
other write flags set conditions for operations.
Note: Without NO_FAIL, operations that fail due to the below policies
will throw either error code 24 or 26.
Default Examples
All of the above code snippets use a Default write flag policy. These
operations are unrestricted by write policies.
No Fail Examples
All of the examples in the following sections show both an exception
caused by a write flag, and then pair the demonstrated write flag with
No Fail to show how the same operation can fail silently.
Partial Flag Example
Partial is generally used only in a command containing operations
using the No Fail write flag. Otherwise, the command would contain
no failures to overlook. The following examples are list and map
commands combining both failing and successful map and list
operations.
// create policy to apply and data to trigger operation failure
System.out.println("Without Add Unique here, the tuple for a sei whale is there 2x: "+noAUData.getValue(listWhaleBinName));
Output:
Data after the unique add of [1449, sei whale, Greenland, Gulf of Maine]: [[1420, beluga whale, Beaufort Sea, Bering Sea], [13988, gray whale, Baja California, Chukchi Sea], [1278, north pacific right whale, Japan, Sea of Okhotsk], [5100, humpback whale, Columbia, Antarctic Peninsula], [3100, southern hemisphere blue whale, Corcovado Gulf, The Galapagos], [1449, sei whale, Greenland, Gulf of Maine]]
Non-Unique Add 1: Exception caught.
Non-Unique Add 2: No operation was executed. Error was suppressed by NO_FAIL.
Without Add Unique here, the tuple for a sei whale is there 2x: [[1420, beluga whale, Beaufort Sea, Bering Sea], [13988, gray whale, Baja California, Chukchi Sea], [1278, north pacific right whale, Japan, Sea of Okhotsk], [5100, humpback whale, Columbia, Antarctic Peninsula], [3100, southern hemisphere blue whale, Corcovado Gulf, The Galapagos], [1449, sei whale, Greenland, Gulf of Maine], [1449, sei whale, Greenland, Gulf of Maine]]
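The interplay between a uniqueness constraint, NO_FAIL, and PARTIAL shown in the output above can be summarized with a plain-Java model. The sketch below does not call Aerospike; the flag values are illustrative, and it makes the simplifying assumption that PARTIAL only matters when NO_FAIL is also set. ADD_UNIQUE here models the "Add Unique" constraint named in the output.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of list write flags. This is NOT the Aerospike client API.
class ListWriteFlagSketch {
    // Illustrative flag bits for this sketch only.
    static final int DEFAULT = 0;
    static final int ADD_UNIQUE = 1;
    static final int NO_FAIL = 4;
    static final int PARTIAL = 8;

    // Append items under the given flags and return how many were written.
    static int appendItems(List<String> list, List<String> items, int flags) {
        boolean unique = (flags & ADD_UNIQUE) != 0;
        boolean noFail = (flags & NO_FAIL) != 0;
        boolean partial = (flags & PARTIAL) != 0;

        List<String> accepted = new ArrayList<>();
        boolean denied = false;
        for (String item : items) {
            if (unique && (list.contains(item) || accepted.contains(item))) {
                denied = true; // duplicate rejected by the uniqueness constraint
            } else {
                accepted.add(item);
            }
        }
        // Without NO_FAIL, a denied item fails the whole operation.
        if (denied && !noFail) {
            throw new IllegalStateException("Item denied by ADD_UNIQUE");
        }
        // NO_FAIL alone: suppress the error and write nothing.
        if (denied && !partial) return 0;
        // No denial, or NO_FAIL with PARTIAL: write every accepted item.
        list.addAll(accepted);
        return accepted.size();
    }

    public static void main(String[] args) {
        List<String> whales = new ArrayList<>(List.of("beluga whale", "sei whale"));
        List<String> batch = List.of("sei whale", "gray whale");
        System.out.println(appendItems(whales, batch, ADD_UNIQUE | NO_FAIL));
        System.out.println(appendItems(whales, batch, ADD_UNIQUE | NO_FAIL | PARTIAL));
        System.out.println(whales);
    }
}
```

With the DEFAULT flags the same batch would be appended in full, duplicates included, which corresponds to the last line of the output above.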
Write Flags for Maps Only
CREATE_ONLY: If the key already exists, the item will be denied.
UPDATE_ONLY: If the key already exists, the item will be
overwritten. If the key does not exist, the item will be denied.
Created record and new key 15678. The data is now: {12345={lat=-85, long=-130}, 13456={lat=-25, long=-50}, 14567={lat=35, long=30}, 15678={lat=95, long=110}}
Update attempt 1: Exception caught.
Update attempt 2: No operation was executed. Error was suppressed by NO_FAIL.
Without Create Only, the observation at 15678 is overwritten: {12345={lat=-85, long=-130}, 13456={lat=-25, long=-50}, 14567={lat=35, long=30}, 15678={lat=0, long=0}}
System.out.println("Using update only, the value of an existing key "+ existingObsKey +" can be updated: "+uoSuccessData.getValue(mapObsBinName)+"\n");
Created record: {12345={lat=-85, long=-130}, 13456={lat=-25, long=-50}, 14567={lat=35, long=30}}
Create Attempt 1: Exception caught.
Create Attempt 2: No operation was executed. Error was suppressed by NO_FAIL.
Using update only, the value of an existing key 13456 can be updated: {12345={lat=-85, long=-130}, 13456={lat=0, long=0}, 14567={lat=35, long=30}, 15678={lat=95, long=110}}
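The CREATE_ONLY and UPDATE_ONLY rules, and the way NO_FAIL suppresses their errors, can be sketched as a plain-Java model of a single map put. This is not Aerospike client code; the flag values and the put helper are hypothetical and exist only to illustrate the decision logic described in this section.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of map write flags. This is NOT the Aerospike client API.
class MapWriteFlagSketch {
    // Illustrative flag bits for this sketch only.
    static final int DEFAULT = 0;
    static final int CREATE_ONLY = 1;
    static final int UPDATE_ONLY = 2;
    static final int NO_FAIL = 4;

    // Put one entry under the given flags; return true if the map was changed.
    static boolean put(Map<Integer, String> map, int key, String value, int flags) {
        boolean exists = map.containsKey(key);
        // CREATE_ONLY denies existing keys; UPDATE_ONLY denies missing keys.
        boolean denied = ((flags & CREATE_ONLY) != 0 && exists)
            || ((flags & UPDATE_ONLY) != 0 && !exists);
        if (denied) {
            if ((flags & NO_FAIL) != 0) return false; // error suppressed, nothing written
            throw new IllegalStateException("Write denied for key " + key);
        }
        map.put(key, value);
        return true;
    }

    public static void main(String[] args) {
        Map<Integer, String> obs = new TreeMap<>();
        obs.put(13456, "lat=-25, long=-50");
        // The key exists, so CREATE_ONLY is denied; NO_FAIL turns that into a no-op.
        System.out.println(put(obs, 13456, "lat=0, long=0", CREATE_ONLY | NO_FAIL));
        // The key exists, so UPDATE_ONLY overwrites the value.
        System.out.println(put(obs, 13456, "lat=0, long=0", UPDATE_ONLY));
        System.out.println(obs);
    }
}
```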