How to retrieve the key in a key-value store
Learn how to retrieve user-specified keys in Aerospike using single record reads.
In this blog, you’ll learn how to retrieve the user-specified key in Aerospike, a key-value store.
Aerospike is designed from the ground up to support extremely low latency create, read, update, and delete (CRUD) operations. In Aerospike, applications store data as individual records that are uniformly distributed across all the nodes of the database cluster. A user-specified key is associated with each record along with the application data. Below, we will walk you through the steps necessary to retrieve your user-specified key in a single record read operation.
Keys in Aerospike
As mentioned above, Aerospike stores user data in individual records, which are accessible by a user-specified "key." Keys can only be a string, integer, or a byte array.
[USER SPECIFIED KEY] → [USER DATA in one or more "bins"]
Applications access the data via the Aerospike client library, which computes a 20-byte “digest” by hashing the user-supplied key information of set name, user key value, and user key type.
The client library sends the 20-byte digest to the server as the record's key. So, the record on the server looks like this:
[20 Byte Digest] → [USER DATA in one or more "bins"]
Storing keys in records
What if two keys hash to the same digest value, resulting in a hash collision?
Aerospike uses the RIPEMD-160 hashing algorithm, which has a statistically near-zero probability of hash collision. By default, Aerospike does not store the user-supplied key with the saved record. However, applications have the option to store the user-specified key along with the record via the write policy.
[20 Byte Digest ] → [USER DATA in one or more "bins"] + [User Key (string / int / blob)]
With the key stored, the server can compare the stored key against the incoming user-specified key during CRUD operations. If a subsequent operation detects a hash collision, the application receives a KEY_MISMATCH error.
Key mismatch explained
Consider the case in a namespace test, where a user-specified key of abc in set testSet1 generates a particular digest, digest1. When this record is first created, Aerospike uses digest1 as the internal key of that record but also stores the user-specified key of abc with the record.
Hypothetically, in the same namespace test, consider the user-specified key of xyz in set testSet2, which also happens to hash to the same value, digest1. (This has near zero probability, practically speaking.)
If the application tries to create or update a record with digest1 and with user-specified key xyz, Aerospike will access the record stored against digest1, compare the stored key abc against the incoming transaction's user-specified key of xyz, and discover that they are different. A KEY_MISMATCH exception will be thrown in this case.
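The check described above can be illustrated with a toy in-memory model. This is only a sketch using the JDK, not Aerospike's server code: the class KeyCheckStore, its exception type, and the pluggable digest function are all hypothetical names introduced for illustration, and the digest function is deliberately rigged so that every key collides.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

// Toy model of the stored-key check; this is NOT Aerospike's server code.
final class KeyCheckStore {
    // Hypothetical stand-in for the KEY_MISMATCH error.
    static final class KeyMismatchException extends RuntimeException {
        KeyMismatchException(String message) {
            super(message);
        }
    }

    private static final class Stored {
        final String userKey; // user key saved alongside the data
        Object value;

        Stored(String userKey, Object value) {
            this.userKey = userKey;
            this.value = value;
        }
    }

    private final ToIntFunction<String> digestFn;
    private final Map<Integer, Stored> byDigest = new HashMap<>();

    KeyCheckStore(ToIntFunction<String> digestFn) {
        this.digestFn = digestFn;
    }

    // Creates or updates a record, rejecting a different user key on the same digest.
    void put(String userKey, Object value) {
        int digest = digestFn.applyAsInt(userKey);
        Stored existing = byDigest.get(digest);
        if (existing == null) {
            byDigest.put(digest, new Stored(userKey, value));
        } else if (existing.userKey.equals(userKey)) {
            existing.value = value;
        } else {
            throw new KeyMismatchException(
                "stored key " + existing.userKey + " != incoming key " + userKey);
        }
    }

    // Returns the value for the key, or null if no record exists for its digest.
    Object get(String userKey) {
        Stored existing = byDigest.get(digestFn.applyAsInt(userKey));
        if (existing == null) {
            return null;
        }
        if (!existing.userKey.equals(userKey)) {
            throw new KeyMismatchException(
                "stored key " + existing.userKey + " != incoming key " + userKey);
        }
        return existing.value;
    }

    public static void main(String[] args) {
        // Every key maps to digest 1, forcing a collision that is near-impossible in practice.
        KeyCheckStore store = new KeyCheckStore(k -> 1);
        store.put("testSet1:abc", "first");
        try {
            store.put("testSet2:xyz", "second");
        } catch (KeyMismatchException e) {
            System.out.println("KEY_MISMATCH: " + e.getMessage());
        }
    }
}
```

Without the stored key, the second write in this model would have nothing to compare against and would silently overwrite the first record, which is the situation storing the key is meant to prevent.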
User key stored with the record
The curious may want to know: Is it possible to read this user-specified stored key back in the application?
For data modeling purposes and for queries (primary or secondary index queries that retrieve qualifying records from the entire namespace), the user stored key can be retrieved back to the client. This is useful since the user may want to then update individual records selectively, as part of their use case.
However, for single key reads the user key is not returned to the client, as you must have had the key to be able to read the record in the first place. This type of read can be done by passing either the user key and the set name or the digest.
In certain operational situations, users have asked if it is possible to read back the user stored key on an individual record read. They stored their key with the record and would now like to retrieve it. These are situations where users have the 20-byte digest of a record, but they do not know the user key. The digest was recorded in the server logs. An example of this would be when turning on rw-client logging to identify the digests of "hot-keys," records that are being updated very frequently.
The read APIs don't provide this functionality upfront via the policies since this is not a typical data modeling need. However, Aerospike does provide expressions that can access the user stored key.
What are Aerospike Expressions?
An expression is a strongly typed domain-specific language designed for manipulating and comparing bins and record metadata. There is no “programming language” per se; an expression is built with code. It runs on the nodes in the cluster, not on the client.
Aerospike supports three types of expressions:
Filter expressions (introduced in version 5.2.0)
XDR filter expressions (introduced in version 5.3.0)
Operation expressions (introduced in version 5.6.0)
Filter and operation expressions can be specified on a per-call basis.
Key retrieval code example
Below is an example code snippet demonstrating how to retrieve a key using the Java client.
Creating the record initially:
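import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;
import com.aerospike.client.Value;
import com.aerospike.client.policy.WritePolicy;
AerospikeClient client = new AerospikeClient("127.0.0.1", 3000);
Key key = new Key("test", "testset", "key1");
WritePolicy wPolicy = new WritePolicy();
wPolicy.sendKey = true; // Store our user key on the server with the record
Bin b0 = new Bin("b00", Value.get(28));
client.put(wPolicy, key, b0);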
Now, assume we just have the 20-byte digest and would like to retrieve our key:
We can use the following key constructor:
// Initialize key from namespace, digest, optional set name, and optional userKey.
Key(String namespace, byte[] digest, String setName, Value userKey)
Let's find the digest of the record we created above, for purposes of this demo.
// For this demo, compute the digest from the set name and the user key.
byte[] recDigest = Key.computeDigest("testset", Value.get("key1"));
Next, create a new key object using the digest-based constructor.
Key keyByDigest = new Key("test", recDigest, null, Value.NULL);
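In an operational setting, the digest typically arrives as hexadecimal text copied from a log rather than from Key.computeDigest. Below is a minimal JDK-only sketch of turning such text into the byte[20] that the digest-based constructor expects. The DigestHex class is a hypothetical helper, not part of the Aerospike client, and it assumes the digest is written as 40 hex characters with an optional 0x prefix; check the actual format in your logs.

```java
// Hypothetical helper, not part of the Aerospike client.
final class DigestHex {
    private DigestHex() {
    }

    // Converts 40 hex characters (optionally prefixed with "0x") into 20 bytes.
    static byte[] parse(String text) {
        String hex = text.trim();
        if (hex.startsWith("0x") || hex.startsWith("0X")) {
            hex = hex.substring(2);
        }
        if (hex.length() != 40) {
            throw new IllegalArgumentException(
                "Expected 40 hex characters, got " + hex.length());
        }
        byte[] digest = new byte[20];
        for (int i = 0; i < digest.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException(
                    "Non-hex character in byte " + i + " of the digest");
            }
            digest[i] = (byte) ((hi << 4) | lo);
        }
        return digest;
    }

    public static void main(String[] args) {
        // Made-up digest text, for illustration only.
        byte[] digest = parse("0x00112233445566778899aabbccddeeff01020304");
        System.out.println("Parsed " + digest.length + " bytes");
    }
}
```

The resulting array can be passed as the digest argument in place of recDigest when building keyByDigest.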
About Value.NULL
The constructor accepts the user-specified key as its last argument, as a Value object, for example Value.get("myKey"). This is provided for the rare case where the application wants to update a record through a digest-based key, without knowing the set name, but still wants to send the user key with sendKey=true. In our case, we don't know the key and are trying to retrieve it in an operate() call, which is like a write, although we will only perform a read via expressions. Hence, we pass Value.NULL for the user-specified key and use sendKey=false.
We should be able to read the same record using either key or keyByDigest. Let's validate.
System.out.println("Record: "+ client.get(null, key));
System.out.println("Record via Digest: "+ client.get(null, keyByDigest));
Output:
Record: (gen:12),(exp:447740032),(bins:(b00:28))
Record via Digest: (gen:12),(exp:447740032),(bins:(b00:28))
Case 1: key type known
Now, using the digest-based key, let's retrieve the stored key from the server. Since we specified the digest and Value.NULL for the userKey in the keyByDigest constructor, we must set wPolicy.sendKey = false to avoid a KEY_MISMATCH error.
wPolicy.sendKey = false;
Expression recKeyExp = Exp.build(Exp.key(Exp.Type.STRING));
Record record = client.operate( wPolicy, keyByDigest,
ExpOperation.read("reckey", recKeyExp, ExpReadFlags.DEFAULT)
);
System.out.println("Record Key: " + record.getValue("reckey"));
Recall we mentioned that user-specified keys can be of type integer, string, or byte array. Here, although we don't know our key, we know it is a string type. This is specified using Exp.Type.STRING when building the expression for the key.
Output:
Record Key: key1
Case 2: key type unknown
What if we don't know the key type? We can check for all three allowable types. In this case, however, we must use ExpReadFlags.EVAL_NO_FAIL instead of ExpReadFlags.DEFAULT so that the expressions that fail with a type mismatch don't generate an exception.
Example code is shown below.
wPolicy.sendKey = false;
Expression recKeyExpStr = Exp.build(Exp.key(Exp.Type.STRING));
Expression recKeyExpInt = Exp.build(Exp.key(Exp.Type.INT));
Expression recKeyExpBlob = Exp.build(Exp.key(Exp.Type.BLOB));
Record record = client.operate( wPolicy, keyByDigest,
ExpOperation.read("reckeyStr", recKeyExpStr, ExpReadFlags.EVAL_NO_FAIL),
ExpOperation.read("reckeyInt", recKeyExpInt, ExpReadFlags.EVAL_NO_FAIL),
ExpOperation.read("reckeyBlob", recKeyExpBlob, ExpReadFlags.EVAL_NO_FAIL)
);
System.out.println("Record Key String: " + record.getValue("reckeyStr"));
System.out.println("Record Key Integer: " + record.getValue("reckeyInt"));
System.out.println("Record Key BLOB: " + record.getValue("reckeyBlob"));
Output:
Record Key String: key1
Record Key Integer: null
Record Key BLOB: null
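Since at most one of the three result bins is non-null, the application can pick whichever one is set. The following is a small JDK-only sketch that works on a plain map of bin names to values, such as the bins map exposed by the Java client's Record. The StoredKeyPicker class and its method names are hypothetical, and treating an all-null result as "no stored key found" is an assumption rather than documented behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for choosing among the three expression results.
final class StoredKeyPicker {
    private StoredKeyPicker() {
    }

    // Returns the first non-null value among the named bins, or null if none is set.
    static Object firstNonNull(Map<String, Object> bins, String... names) {
        if (bins == null) {
            return null;
        }
        for (String name : names) {
            Object value = bins.get(name);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    // Describes the key in readable form; byte arrays are summarized by length.
    static String describe(Object key) {
        if (key == null) {
            return "no stored key found";
        }
        if (key instanceof byte[]) {
            return "blob of " + ((byte[]) key).length + " bytes";
        }
        return key.getClass().getSimpleName() + ": " + key;
    }

    public static void main(String[] args) {
        // Stand-in for the bins returned by the operate() call above.
        Map<String, Object> bins = new LinkedHashMap<>();
        bins.put("reckeyStr", "key1");
        bins.put("reckeyInt", null);
        bins.put("reckeyBlob", null);
        Object key = firstNonNull(bins, "reckeyStr", "reckeyInt", "reckeyBlob");
        System.out.println(describe(key));
    }
}
```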
Unlock the power of expressions in Aerospike for enhanced data operations
Expressions are a very powerful tool in Aerospike. They can be used as filters for selecting records to return in primary or secondary index queries, as well as to compute on a record's existing data or metadata. They can then return the result of the computation as a read or write it to a new bin. The possibilities are only limited by your creativity!
To learn more, watch our Optimizing query performance with Aerospike Expressions webinar with Chief Developer Advocate Tim Faulkes.