How to retrieve the key in a key-value store

Learn how to retrieve user-specified keys in Aerospike using single record reads.

Piyush Gupta
Director, Customer Enablement
March 12, 2024|8 min read

In this blog, you’ll learn how to retrieve the user-specified key in Aerospike, a key-value store.

Aerospike is designed from the ground up to support extremely low latency create, read, update, and delete (CRUD) operations. In Aerospike, applications store data as individual records that are uniformly distributed across all the nodes of the database cluster. A user-specified key is associated with each record along with the application data. Below, we will walk you through the steps necessary to retrieve your user-specified key in a single record read operation.

Keys in Aerospike

As mentioned above, Aerospike stores user data in individual records, which are accessible by a user-specified "key." A key can only be a string, an integer, or a byte array.

[USER SPECIFIED KEY] → [USER DATA in one or more "bins"]

Applications access the data via the Aerospike client library, which computes a 20-byte “digest” by hashing the user-supplied key information of set name, user key value, and user key type.

The client library sends the 20-byte digest to the server as the key of the record. So, the record on the server looks like this:

[20 Byte Digest] → [USER DATA in one or more "bins"]
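To make the idea of a fixed-size digest concrete, here is a minimal sketch that hashes a set name, a key type tag, and the key bytes into 20 bytes. It is not the client library's implementation: the JDK does not ship RIPEMD-160, so SHA-1 (which also produces 20 bytes) is used as a stand-in, and the type tag values, the field order, and the names `DigestSketch` and `sketchDigest` are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestSketch {
    // Hypothetical type tags, for illustration only.
    static final byte TYPE_INTEGER = 1;
    static final byte TYPE_STRING = 3;
    static final byte TYPE_BLOB = 4;

    // Hash set name + key type + key bytes into a fixed 20-byte digest.
    // SHA-1 is a stand-in: Aerospike itself uses RIPEMD-160.
    static byte[] sketchDigest(String setName, byte keyType, byte[] keyBytes) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            md.update(setName.getBytes(StandardCharsets.UTF_8));
            md.update(keyType);
            md.update(keyBytes);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 should be available on every JDK", e);
        }
    }

    // Render bytes as lowercase hex for printing.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] key = "key1".getBytes(StandardCharsets.UTF_8);
        byte[] d1 = sketchDigest("testset", TYPE_STRING, key);
        byte[] d2 = sketchDigest("otherset", TYPE_STRING, key);
        System.out.println("digest length: " + d1.length);
        System.out.println("testset/key1:  " + toHex(d1));
        System.out.println("otherset/key1: " + toHex(d2));
    }
}
```

The same user key in two different sets yields two different digests, because the set name is part of the hashed input. The namespace is not part of the input.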

Storing keys in records

What if two keys hash to the same digest value, resulting in a hash collision?

Aerospike uses the RIPEMD-160 hashing algorithm, which has a statistically near-zero probability of hash collision. By default, Aerospike does not store the user-supplied key with the saved record. However, applications have the option to store the user-specified key along with the record via the write policy.

[20 Byte Digest ] → [USER DATA in one or more "bins"] + [User Key (string / int / blob)]

The approach above ensures that the server verifies the incoming user key against the stored key during CRUD operations. If a hash collision is detected on any subsequent update, the application will receive a KEY_MISMATCH error.

Key mismatch explained

Consider the case in a namespace test, where a user-specified key of abc in set testSet1 generates a particular digest, digest1. When this record is first created, Aerospike uses digest1 as the internal key of that record but also stores the user-specified key of abc with the record.

Hypothetically, in the same namespace test, consider the user-specified key of xyz in set testSet2, which also happens to hash to the same value, digest1. (This has near zero probability, practically speaking.)

If the application tries to create or update a record with digest1 and with user-specified key xyz, Aerospike will access the record stored against digest1 and compare the stored key abc against the incoming transaction's user-specified key of xyz and discover that they are different. A KEY_MISMATCH exception will be thrown in this case.
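The check described above can be modeled with a toy in-memory map. This is only a sketch of the comparison logic, not Aerospike's server code: the class `KeyMismatchSketch`, the string digests, and the rule that the comparison only happens when both a stored key and an incoming key are present are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyMismatchSketch {
    // A toy record: the optional stored user key plus one integer value.
    static final class StoredRecord {
        final String storedUserKey; // null when the key was not sent with the write
        int value;

        StoredRecord(String storedUserKey, int value) {
            this.storedUserKey = storedUserKey;
            this.value = value;
        }
    }

    private final Map<String, StoredRecord> recordsByDigest = new HashMap<>();

    // Returns "OK" on success or "KEY_MISMATCH" when the stored and incoming keys differ.
    String put(String digest, String userKey, int value) {
        StoredRecord existing = recordsByDigest.get(digest);
        if (existing == null) {
            recordsByDigest.put(digest, new StoredRecord(userKey, value));
            return "OK";
        }
        boolean bothPresent = existing.storedUserKey != null && userKey != null;
        if (bothPresent && !existing.storedUserKey.equals(userKey)) {
            return "KEY_MISMATCH";
        }
        existing.value = value;
        return "OK";
    }

    public static void main(String[] args) {
        KeyMismatchSketch server = new KeyMismatchSketch();
        System.out.println(server.put("digest1", "abc", 1)); // prints "OK"
        System.out.println(server.put("digest1", "xyz", 2)); // prints "KEY_MISMATCH"
        System.out.println(server.put("digest1", "abc", 3)); // prints "OK"
    }
}
```

In this model, a write that carries no user key skips the comparison, which mirrors why a digest-only key is paired with sendKey=false later in this post.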

User key stored with the record

The curious may want to know: Is it possible to read this user-specified stored key back in the application?

For data modeling purposes and for queries (primary or secondary index queries that retrieve qualifying records from the entire namespace), the user stored key can be retrieved back to the client. This is useful since the user may want to then update individual records selectively, as part of their use case.

However, for single-key reads, the user key is not returned to the client, as you must have had the key to be able to read the record in the first place. This type of read can be done by passing either the user key and the set name or the digest.

In certain operational situations, users have asked if it is possible to read back the user stored key on an individual record read. They stored their key with the record and would now like to retrieve it. These are situations where users have the 20-byte digest of a record, but they do not know the user key. The digest was recorded in the server logs. An example of this would be when turning on rw-client logging to identify the digests of "hot-keys," records that are being updated very frequently.

The read APIs don't provide this functionality upfront via the policies since this is not a typical data modeling need. However, Aerospike does provide expressions that can access the user stored key.

What are Aerospike Expressions?

An expression is a strongly typed domain-specific language designed for manipulating and comparing bins and record metadata. There is no “programming language” per se; an expression is built with code. It runs on the nodes in the cluster, not on the client.

Aerospike supports three types of expressions:

  1. Filter expressions (introduced in version 5.2.0)

  2. Cross Datacenter Replication (XDR) filter expressions (introduced in version 5.3.0)

  3. Operation expressions (introduced in version 5.6.0)

Filter and operation expressions can be specified on a per-call basis.

Key retrieval code example

Below is an example code snippet demonstrating how to retrieve a key using the Java client.

Creating the record initially:

AerospikeClient client = new AerospikeClient("127.0.0.1", 3000);
Key key = new Key("test", "testset", "key1");
WritePolicy wPolicy = new WritePolicy();
wPolicy.sendKey = true;  //Store our user key on the server
Bin b0 = new Bin("b00", Value.get(28));
client.put(wPolicy, key, b0);

Now, assume we just have the 20-byte digest and would like to retrieve our key:

We can use the following key constructor:

// Initialize key from namespace, digest, optional set name, and optional userKey.
Key(String namespace, byte[] digest, String setName, Value userKey)

Let's find the digest of the record we created above, for purposes of this demo.

// computeDigest returns a new 20-byte array, so no pre-allocation is needed.
byte[] recDigest = Key.computeDigest("testset", Value.get("key1"));

Next, create a new key object using the digest-based constructor.

Key keyByDigest = new Key("test", recDigest, null, Value.NULL);
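In an operational setting, the digest would more likely arrive as text copied from a log line than from computeDigest(). Here is a minimal sketch of turning such text into the byte array that the digest-based constructor expects. It assumes the digest is available as a plain 40-character hex string. The helper name `hexToDigest` and the sample hex value are made up for illustration, and the exact log format is not covered here.

```java
public class DigestHexSketch {
    // Convert a 40-character hex string into a 20-byte digest.
    static byte[] hexToDigest(String hex) {
        String h = hex.trim();
        if (h.length() != 40) {
            throw new IllegalArgumentException("expected 40 hex characters, got " + h.length());
        }
        byte[] out = new byte[20];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(h.charAt(2 * i), 16);
            int lo = Character.digit(h.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near position " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up digest value, for illustration only.
        byte[] recDigest = hexToDigest("00112233445566778899aabbccddeeff01020304");
        System.out.println("digest length: " + recDigest.length); // prints "digest length: 20"
        // The array could then be passed to the digest-based constructor:
        // new Key("test", recDigest, null, Value.NULL)
    }
}
```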

About Value.NULL

The constructor accepts the user-specified key in the last argument as a "Value" type object, for example, Value.get("myKey"). This is provided for the rare case where the application builds the key from the digest, possibly without knowing the set name, but still wants to write with sendKey=true. In our case, we don't know the key and are trying to retrieve it in an operate() call, which is treated like a write even though we will only perform a read via expressions. Hence, we pass Value.NULL for the user-specified key and use sendKey=false.

We should be able to read the same record using either "key" or "keyByDigest." Let's validate.

System.out.println("Record: "+ client.get(null, key)); 
System.out.println("Record via Digest: "+ client.get(null, keyByDigest));

Output:

Record: (gen:12),(exp:447740032),(bins:(b00:28))
Record via Digest: (gen:12),(exp:447740032),(bins:(b00:28))

Case 1: key type known

Now, using the digest-based key, let's retrieve the stored key from the server. Since we specified the digest and Value.NULL for the userKey when constructing keyByDigest, we must set wPolicy.sendKey = false to avoid a KEY_MISMATCH error.

wPolicy.sendKey = false;  // We have no user key to send, only the digest
Expression recKeyExp = Exp.build(Exp.key(Exp.Type.STRING));
Record record = client.operate( wPolicy, keyByDigest,   
          ExpOperation.read("reckey", recKeyExp, ExpReadFlags.DEFAULT) 
         );
System.out.println("Record Key: " + record.getValue("reckey"));

Recall we mentioned that user-specified keys can be of type integer, string, or byte-array. Here, although we don't know our key, we know it is a string type. This is specified using Exp.Type.STRING when building the expression for the key.

Output:

Record Key: key1

Case 2: key type unknown

What if we don't know the key type? We can check for all three allowable types. In this case, however, we must use ExpReadFlags.EVAL_NO_FAIL instead of ExpReadFlags.DEFAULT so that expressions that fail with a type mismatch don't generate an exception.

Example code is shown below.

wPolicy.sendKey = false;
Expression recKeyExpStr = Exp.build(Exp.key(Exp.Type.STRING));
Expression recKeyExpInt = Exp.build(Exp.key(Exp.Type.INT));
Expression recKeyExpBlob = Exp.build(Exp.key(Exp.Type.BLOB));
Record record = client.operate( wPolicy, keyByDigest,   
          ExpOperation.read("reckeyStr", recKeyExpStr, ExpReadFlags.EVAL_NO_FAIL),
          ExpOperation.read("reckeyInt", recKeyExpInt, ExpReadFlags.EVAL_NO_FAIL),
          ExpOperation.read("reckeyBlob", recKeyExpBlob, ExpReadFlags.EVAL_NO_FAIL)  
         );
System.out.println("Record Key String: " + record.getValue("reckeyStr"));
System.out.println("Record Key Integer: " + record.getValue("reckeyInt"));
System.out.println("Record Key BLOB: " + record.getValue("reckeyBlob"));

Output:

Record Key String: key1
Record Key Integer: null
Record Key BLOB: null
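With three candidate bins in the result, the application still has to work out which one holds the key. One way to do that is to scan for the first non-null bin, sketched here against a plain Map so it runs without a cluster. The class `KeyTypeSketch` and the helper `firstNonNullBin` are hypothetical, and treating three nulls as "no stored key found" is an assumption rather than documented behavior.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class KeyTypeSketch {
    // Bin names match the ones used in the operate() call above.
    static final String[] KEY_BINS = {"reckeyStr", "reckeyInt", "reckeyBlob"};

    // Return the name of the first bin that holds a non-null value, if any.
    static Optional<String> firstNonNullBin(Map<String, Object> bins) {
        for (String name : KEY_BINS) {
            if (bins.get(name) != null) {
                return Optional.of(name);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for the bins returned by the operate() call.
        Map<String, Object> bins = new HashMap<>();
        bins.put("reckeyStr", "key1");
        bins.put("reckeyInt", null);
        bins.put("reckeyBlob", null);
        System.out.println(firstNonNullBin(bins).orElse("no stored key found")); // prints "reckeyStr"
    }
}
```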

Unlock the power of expressions in Aerospike for enhanced data operations

Expressions are a very powerful tool in Aerospike. They can be used as filters for selecting records to return in primary or secondary index queries, and they can also compute on a record's existing data or metadata. They can then return the result of the computation as a read or write it to a new bin. The possibilities are only limited by your creativity!

To learn more, watch our Optimizing query performance with Aerospike Expressions webinar with Chief Developer Advocate Tim Faulkes.