This notebook shares how Aerospike facilitates working with map data,
covering the following topics:
Ordering
Index & Rank
Nested Structures (subcontexts)
These Aerospike Map capabilities give applications easy, precise
control over and access to map data. This notebook shows how to apply
these strengths and best practices, and how to use Maps as a powerful
modeling tool.
This Jupyter Notebook requires the Aerospike Database running locally
with a Java kernel and the Aerospike Java Client. To create a Docker
container that satisfies the requirements and holds a copy of these
notebooks, visit the Aerospike Notebooks Repo.
The default cluster location for the Docker container is localhost
port 3000. If your cluster is not running on your local machine,
modify localhost and 3000 to the values for your Aerospike cluster.
Aerospike Provides Powerful Resources for Working with Document-Oriented Data
Aerospike is a real-time data platform architected to store
Document-Oriented Data efficiently at scale. Rather than a traditional
KVS approach of blindly storing blobs in the database and sorting the
data in the application, Aerospike provides rich Map and List
(Collection Data Type) APIs for operating on Aerospike Records. The
result is that, rather than spending an outsized amount of time packing,
unpacking, and transporting data to and from the database, applications
gain significant performance efficiencies by working with
Document-Oriented Data on the server side.
Apply Key-Order or Key/Value-Order to Maps
The default order for Aerospike Maps is unordered. The best practice is
to use an ordered map, either Key-ordered (K-ordered) or
Key/Value-ordered (KV-ordered):
If the application reads data only by-key, use K-ordered.
If the application frequently reads data with by-value or by-rank
operations, use KV-ordered.
Worst-case Map Operation Performance highlights that the benefits of
operating on a pre-sorted list are significant.
Ordering Example
Add the map entries (b=0, z=2, c=9, a=1, yy=1) to Bins containing
unordered, K-ordered, and KV-ordered maps.
System.out.println("The unordered map is: "+outMaps.getValue(unorderedMapBinName));
System.out.println("The k-ordered map is: "+outMaps.getValue(kOrderedMapBinName));
System.out.println("The kv-ordered map is also: "+outMaps.getValue(kvOrderedMapBinName));
Output:
The unordered map is: {yy=1, a=1, b=0, z=2, c=9}
The k-ordered map is: {a=1, b=0, c=9, yy=1, z=2}
The kv-ordered map is also: {a=1, b=0, c=9, yy=1, z=2}
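The K-ordered listing above can be mimicked locally with plain Java collections. The sketch below is an illustration only: it uses `java.util.LinkedHashMap` and `java.util.TreeMap` rather than the Aerospike client, and the class and method names are hypothetical. It adds the same five entries in the same order and shows that a key-sorted map lists them as in the K-ordered output.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class MapOrderSketch {
    // The five entries from the example, kept in the order they are added.
    static Map<String, Integer> asAdded() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("b", 0);
        m.put("z", 2);
        m.put("c", 9);
        m.put("a", 1);
        m.put("yy", 1);
        return m;
    }

    // A TreeMap keeps entries sorted by key: a local analogue of K-order.
    static Map<String, Integer> keyOrdered(Map<String, Integer> source) {
        return new TreeMap<String, Integer>(source);
    }

    public static void main(String[] args) {
        Map<String, Integer> added = asAdded();
        System.out.println("As added:   " + added);
        System.out.println("Key-sorted: " + keyOrdered(added));
    }
}
```

Because map keys are unique, sorting by key and then by value produces the same listing as sorting by key alone, which is consistent with the K-ordered and KV-ordered outputs being identical.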
Note: As demonstrated above, using unordered Maps in Aerospike will
not preserve insertion order. If insertion order is relevant to the
application, consider the following options:
Appending Maps to an Unordered List
Storing insertion order or a timestamp-like field in your Map
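The second option can be sketched in plain Java. This is an illustration under assumptions, not Aerospike client code: each value is a small map that carries a hypothetical `seq` field assigned by the application at write time, and insertion order is recovered by sorting on that field. The class, method, and field names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class InsertionOrderSketch {
    // Key-sorted map whose values hold the payload plus a hypothetical "seq" counter.
    static Map<String, Map<String, Integer>> build() {
        String[] keys = {"b", "z", "c", "a", "yy"};
        int[] values = {0, 2, 9, 1, 1};
        Map<String, Map<String, Integer>> m = new TreeMap<>();
        for (int i = 0; i < keys.length; i++) {
            Map<String, Integer> v = new TreeMap<>();
            v.put("value", values[i]);
            v.put("seq", i); // insertion sequence assigned by the application
            m.put(keys[i], v);
        }
        return m;
    }

    // Recover the keys in insertion order by sorting on the stored sequence number.
    static List<String> keysInInsertionOrder(Map<String, Map<String, Integer>> m) {
        List<String> keys = new ArrayList<>(m.keySet());
        keys.sort(Comparator.comparingInt(k -> m.get(k).get("seq")));
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> m = build();
        System.out.println("Key order:       " + m.keySet());
        System.out.println("Insertion order: " + keysInInsertionOrder(m));
    }
}
```

A timestamp could replace the counter, provided two writes never share the same timestamp value.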
Map Index and Rank
In Aerospike, Map Index operations return data in key order. Map Rank
operations return data in value order. Aerospike also defines a
methodical order among maps themselves. The following factors determine
the rank of a map:
A Map with more elements has a higher rank.
Maps with the same number of elements are ranked by comparing their
KV-sorted lists.
System.out.println("The data is: "+listOfMaps.getValue(unorderedListBinName));
Output:
The data is: [{z=26}, {a=1, b=2}, {a=1, b=2, c=3, e=5}, {b=2, c=3}, {b=2, c=3}, {a=1}]
Note: This was explicitly written long form to not hide any important
knowledge in Java code complexity. Most developers would create a Java
TreeMap and use putItems to put the map in Aerospike.
System.out.println("The first element by index in the 3rd map in the list is:"+indexAndRankResults.get(0));
System.out.println("The maps in order from highest to lowest rank is: "+indexAndRankResults.get(1));
Output:
The first element by index in the 3rd map in the list is:[a=1]
The maps in order from highest to lowest rank is: [{a=1, b=2, c=3, e=5}, {b=2, c=3}, {b=2, c=3}, {a=1, b=2}, {z=26}, {a=1}]
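The two results above can be reproduced locally with plain Java collections. The sketch below does not call the Aerospike client. It encodes one reading of the rank rules stated earlier (element count first, then an entry-by-entry comparison of the key-sorted entries, key before value), which is an assumption about how "compare the KV-sorted list" works. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapRankSketch {
    // Build a key-sorted map from alternating key, value arguments.
    static Map<String, Integer> mapOf(Object... kv) {
        Map<String, Integer> m = new TreeMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            m.put((String) kv[i], (Integer) kv[i + 1]);
        }
        return m;
    }

    // The same six maps, in the same order, as the list printed above.
    static List<Map<String, Integer>> sampleData() {
        List<Map<String, Integer>> data = new ArrayList<>();
        data.add(mapOf("z", 26));
        data.add(mapOf("a", 1, "b", 2));
        data.add(mapOf("a", 1, "b", 2, "c", 3, "e", 5));
        data.add(mapOf("b", 2, "c", 3));
        data.add(mapOf("b", 2, "c", 3));
        data.add(mapOf("a", 1));
        return data;
    }

    // Assumed rule: more elements ranks higher; ties compare key-sorted entries in turn.
    static int compareMaps(Map<String, Integer> x, Map<String, Integer> y) {
        if (x.size() != y.size()) {
            return Integer.compare(x.size(), y.size());
        }
        Iterator<Map.Entry<String, Integer>> ix = new TreeMap<String, Integer>(x).entrySet().iterator();
        Iterator<Map.Entry<String, Integer>> iy = new TreeMap<String, Integer>(y).entrySet().iterator();
        while (ix.hasNext()) {
            Map.Entry<String, Integer> ex = ix.next();
            Map.Entry<String, Integer> ey = iy.next();
            int byKey = ex.getKey().compareTo(ey.getKey());
            if (byKey != 0) {
                return byKey;
            }
            int byValue = ex.getValue().compareTo(ey.getValue());
            if (byValue != 0) {
                return byValue;
            }
        }
        return 0;
    }

    // Sort a copy of the list from highest to lowest rank.
    static List<Map<String, Integer>> highestToLowest(List<Map<String, Integer>> maps) {
        List<Map<String, Integer>> sorted = new ArrayList<>(maps);
        sorted.sort((a, b) -> compareMaps(b, a));
        return sorted;
    }

    // Index 0 of a key-sorted map is the entry with the smallest key.
    static String entryAtIndex(Map<String, Integer> m, int index) {
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<>(new TreeMap<String, Integer>(m).entrySet());
        return entries.get(index).toString();
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> data = sampleData();
        System.out.println("Index 0 of the 3rd map: " + entryAtIndex(data.get(2), 0));
        System.out.println("Highest to lowest rank: " + highestToLowest(data));
    }
}
```

Under these assumptions the sorted list matches the rank order printed above, and index 0 of the third map is the entry a=1.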
Distinguishing Maps from Bins
It is important to highlight how an Aerospike Map (in a Bin) differs
from a Bin.
Unique Properties of Aerospike Bins
Bins are architected with the following design constraints:
Starting with Aerospike Database 7.0, there is no limit on the number of unique bin
names per namespace.
In Database 5.0 to 6.4, a namespace can
contain a maximum of 64k-1 unique bin names.
In Database versions earlier than 5.0, a namespace can contain
a maximum of 32k-1 unique bin names.
A record can contain up to 32k-1 bins.
Bin names are limited to 15 characters and are stored unencoded.
Bins have higher metadata overhead than Maps.
Unique Properties of Maps
Maps were architected for the flexibility needed from the data type.
Storage Efficiency
By comparison, Aerospike Maps use MessagePack serialization to compress
and index a map’s keys and values. This makes storing and working with
large maps quite efficient.
Setting Context to Operations
Aerospike Database supports arbitrarily deep nesting within Container
Data Types (CDTs), Lists and Maps. As an application adds data to a Map
in Aerospike, the application also creates indexes and sub-indexes, which
allow operations to supply an operation with the precise context of the
data to be operated on. By understanding the nested structure of a Map,
an application can efficiently apply operations to the appropriate
context within a Map and send only the relevant parts of a Map across
the wire back to the client.
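The idea of a context can be sketched in plain Java as a path of map keys and list indexes that is walked step by step into a nested structure, so that only the addressed element is returned. This is a local illustration, not the Aerospike client: in the Java client the equivalent role is played by `CTX` arguments (for example `CTX.mapKey` and `CTX.listIndex`) evaluated on the server. The document contents, class name, and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ContextSketch {
    // Walk a path into nested data: a String step is a map key, an Integer step is a list index.
    static Object resolve(Object root, Object... path) {
        Object current = root;
        for (Object step : path) {
            if (step instanceof String) {
                current = ((Map<?, ?>) current).get(step);
            } else {
                current = ((List<?>) current).get((Integer) step);
            }
        }
        return current;
    }

    // Hypothetical nested document: a map holding a map that holds a list.
    static Map<String, Object> sampleDocument() {
        List<Object> emails = new ArrayList<>();
        emails.add("first@example.com");
        emails.add("second@example.com");
        Map<String, Object> profile = new TreeMap<>();
        profile.put("emails", emails);
        profile.put("name", "sample");
        Map<String, Object> doc = new TreeMap<>();
        doc.put("profile", profile);
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = sampleDocument();
        // Only the addressed element is returned, not the whole document.
        System.out.println(resolve(doc, "profile", "emails", 1));
    }
}
```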
Bins or Maps: Best Practice for Modeling
Based on the above constraints, the best practices for long-term
Aerospike use are:
When storing data in bins, use and reuse fewer, shorter,
consistent bin names.
Use Maps with arbitrary nesting widely.
Map Index, Rank, and Context Example
A credit card user can have multiple credit cards. This is modeled as:
System.out.println("The Credit Card data is: "+getCardMap.getValue(kOrderedMapBinName));
Output:
The Credit Card data is: {cards=[{cvv=111, default=1, expires=202201, last_six=511111, zip=95008}]}
Note: This was explicitly written long form to not hide any knowledge
in Java code complexity. Most developers would create a Java TreeMap and
use putItems to put the map in Aerospike.
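The TreeMap approach mentioned in the note can be sketched as follows. The code only builds the nested structure locally and reads the default card from it. Writing it to Aerospike is left out, and the class and method names are hypothetical. The field names and values are taken from the output above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CardModelSketch {
    // One card as a key-sorted map, using the field names from the output above.
    static Map<String, Object> card(int lastSix, int expires, int cvv, int zip, int isDefault) {
        Map<String, Object> c = new TreeMap<>();
        c.put("last_six", lastSix);
        c.put("expires", expires);
        c.put("cvv", cvv);
        c.put("zip", zip);
        c.put("default", isDefault);
        return c;
    }

    // A user document: a map with a "cards" key holding a list of card maps.
    static Map<String, Object> sampleUser() {
        List<Object> cards = new ArrayList<>();
        cards.add(card(511111, 202201, 111, 95008, 1));
        Map<String, Object> user = new TreeMap<>();
        user.put("cards", cards);
        return user;
    }

    // Return the first card flagged default=1, or null if no card is flagged.
    static Map<?, ?> defaultCard(Map<String, Object> user) {
        for (Object c : (List<?>) user.get("cards")) {
            Map<?, ?> candidate = (Map<?, ?>) c;
            if (Integer.valueOf(1).equals(candidate.get("default"))) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Object> user = sampleUser();
        System.out.println("The Credit Card data is: " + user);
        System.out.println("Default card last_six: " + defaultCard(user).get("last_six"));
    }
}
```

Because each level is a TreeMap, the keys print in sorted order, which matches the K-ordered listing shown in the output.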