HyperLogLog data type
The HyperLogLog (HLL) bin data type provides estimated counts of the members of a large dataset, so that your application can quickly form reasonable approximations of the number of members in the union or intersection of multiple HyperLogLog bins. HyperLogLog estimates trade complete accuracy for savings in space and processing time when dealing with extremely large datasets.
Operations across HLL bin types are processed on the server side. Only results are returned to the client.
In this discussion, the words “set” and “dataset” are used with the meaning from mathematical set theory, not the Aerospike concept of sets.
Set theory fundamental background
The HyperLogLog data returned to your application is based on some basic ideas from set theory. The article Probabilistic: Definition, Models, and Theory Explained gives some brief background information.
Union of sets
Intersection of sets
Business use cases
The HyperLogLog data type is useful for problems where an exact count is impractical to compute or store and an approximate answer is acceptable. Some business use cases include deriving probable answers for the following needs:
Bank fraud
Count the number of suspicious indicators related to an account and its transactions to determine in real-time the probability of fraud of an incoming transaction.
Ad campaign scope
Given the user segments an ad is targeting, what is the approximate number of people who will see the ad?
Online sales conversion rate
By comparing user cohorts and their interest in adjacent items, how many possible customers who log in to the website will finally end up purchasing something in the same user session? In multiple sessions?
HyperLogLog data and data modeling
As with all Aerospike data types, your own values define the underlying HyperLogLog data. Model your data on values that have some intrinsic relationship, so that you can derive membership counts for individual datasets or for the union or intersection of multiple datasets. Colloquially, you must “compare apples to apples, not apples to oranges”.
Continuing the example of user segmentation in online ad campaigns, you might have records representing user segments, each with a HyperLogLog bin to represent the membership in the segment. With the data returned by batch reading the bins of multiple segments, you can ask the server for the count of the members in an intersection of multiple segments. For example:
- All the people who love basketball. This is one segment, represented in an HLL bin of one record.
- All the people who like the Golden State Warriors, another segment.
- All the people who also like hats as a fashion accessory.
The meaning of the HLL bins is specific to the use case of identifying users as parts of audience segments, and counting those segments and intersections and unions between them for ad campaigns.
As a baseline, compare records whose HyperLogLog bins represent the same kind of dataset for your use case. At a minimum, you need to define two HyperLogLog datasets so that their relationships can be explored. The exact meaning of the data is up to you.
- Continuing the example of bank fraud in the use cases described above, you have records that you are certain show fraud. With the data returned by HyperLogLog, your application can find the probability that other records match the pattern of the identified records.
- Continuing the example of online sales conversion rate in the use cases described above, the following need to be defined:
  - Date/time of login to the website
  - Date/time of purchase
What the HyperLogLog data type returns to your application
The HyperLogLog data type returns the following information to your application:
- The estimated size of a set.
- The estimated cardinality of the union of multiple sets.
- The estimated similarity of multiple sets.
- The estimated cardinality of the intersection of multiple sets.
- The estimated union of multiple sets. This estimate is returned as a HyperLogLog data type.
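For intuition, the exact quantities that these operations estimate can be computed with ordinary Python sets on a small example. This sketch does not use HyperLogLog or the Aerospike client; the segment names and user IDs are made up for illustration.

```python
# Exact set arithmetic on toy data. HLL operations return estimates of
# these same quantities without storing the individual members.
basketball = {"u1", "u2", "u3", "u4"}   # hypothetical segment
warriors = {"u3", "u4", "u5"}           # hypothetical segment

size = len(basketball)                        # what get_count estimates
union_count = len(basketball | warriors)      # what get_union_count estimates
intersect_count = len(basketball & warriors)  # what get_intersect_count estimates
similarity = intersect_count / union_count    # what get_similarity estimates (Jaccard index)

print(size, union_count, intersect_count, similarity)
```

With real HLL bins the server never sees the member lists at query time, only the fixed-size sketches, which is why the results are estimates.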
Calculating the size of the returned data
Because the HyperLogLog data type is for working with large datasets, the data it returns can also be large. You need to be aware of the storage cost of a HyperLogLog bin. For small sets a data type other than HyperLogLog might suffice, but for large datasets the unchanging storage size is extremely advantageous.
- Each HLL contains 11 bytes of metadata and an array of 2^n_index_bits registers.
- Each register contains 6 bits of hll_val and n_minhash_bits bits of minhash_val. The size of the registers is rounded up to the nearest byte.
sizeof(HLL) = 11 + roundUpToByte(2^n_index_bits × (6 + n_minhash_bits))
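The size formula can be evaluated directly when planning capacity. The following is a minimal sketch of that arithmetic; the function and constant names are illustrative and not part of any Aerospike client.

```python
HLL_METADATA_BYTES = 11  # fixed metadata per HLL, as stated above
HLL_VAL_BITS = 6         # bits of hll_val in each register


def hll_size_bytes(n_index_bits: int, n_minhash_bits: int = 0) -> int:
    """Size in bytes of one HLL value, following the sizeof(HLL) formula."""
    n_registers = 2 ** n_index_bits
    register_bits = n_registers * (HLL_VAL_BITS + n_minhash_bits)
    register_bytes = (register_bits + 7) // 8  # round up to a whole byte
    return HLL_METADATA_BYTES + register_bytes


for bits in (4, 10, 16):
    print(bits, hll_size_bytes(bits))
```

The size depends only on the bit settings, not on how many members have been added, which is the "unchanging storage size" mentioned above.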
For general guidelines on bin overheads, see Linux Capacity Planning.
Error Bounds
| Operation | Error type | Error formula | Notes |
|---|---|---|---|
| refresh_count, get_count, get_union_count | Relative | | Increasing bits exponentially reduces relative error. For example, 4 bits → 26%, 10 bits → 3.25%, 16 bits → 0.41%. |
| get_similarity | Absolute | | Select bits to meet target error e, and tune parameters based on desired accuracy and similarity threshold. |
| get_intersect_count | Absolute or Relative | | Use enough minhash bits for reliable intersections: stable if n_minhash_bits > 0, unstable if n_minhash_bits = 0. |
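The example percentages for count operations (26%, 3.25%, 0.41%) match the standard HyperLogLog error of 1.04 / √m, where m = 2^n_index_bits is the number of registers. The sketch below assumes that formula applies here; it is a planning aid, not a statement of the server's exact guarantees, and the helper names are hypothetical.

```python
import math


def hll_relative_error(n_index_bits: int) -> float:
    """Standard HyperLogLog relative error, 1.04 / sqrt(2**n_index_bits) (assumed)."""
    return 1.04 / math.sqrt(2 ** n_index_bits)


def min_index_bits(target_error: float, lo: int = 4, hi: int = 16):
    """Smallest n_index_bits in [lo, hi] meeting target_error, or None if none does."""
    for bits in range(lo, hi + 1):
        if hll_relative_error(bits) <= target_error:
            return bits
    return None


for bits in (4, 10, 16):
    print(bits, f"{hll_relative_error(bits):.2%}")
```

For instance, `min_index_bits(0.02)` picks the smallest setting whose estimated relative error is at most 2%, which can then be weighed against the storage cost of the extra registers.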
Performance
| Symbol | Description |
|---|---|
| C | Cost of memcpy (added to all modifies). |
| E | Number of entries passed to the operation. |
| K | Number of HLLs being operated on. |
| M | Number of minhash bits. |
| N | Number of index bits. |
| R | Cost of storage read (applies to any transaction - only once per transaction). |
| S | Size of a HLL in bytes (M + 1) × 2^N |
| W | Cost of writing to storage (applies to any modify transaction - only once per transaction). |
| Operation | HyperLogLog (n_minhash_bits = 0) | HyperMinHash (n_minhash_bits > 0) |
|---|---|---|
init | S | S |
add | E | E |
set_union | K × S | K × S |
refresh_count | S | S |
fold | K × S | K × S |
get_count | S | S |
get_union | K × S | K × S |
get_union_count | K × S | K × S |
get_intersect_count | K! × S 0 ≤ K ≤ 2 | K × S + S |
get_similarity | K! × S 0 ≤ K ≤ 2 | K × S + S |
describe | 1 | 1 |
Operations
Client applications work with HyperLogLog through the following APIs. The policy of a modify operation accepts these flags:
| Name | Value | Description |
|---|---|---|
create_only | 0x01 | Disallow updating an existing value of this bin. |
update_only | 0x02 | Disallow creation of a new Blob bin. |
no_fail | 0x04 | Allow a set of operations to proceed if an individual operation would fail due to a policy violation. |
allow_fold | 0x08 | Allow the resulting set to be the minimum of provided n_index_bits. For intersect_counts and similarity, allow the usage of less precise HLL algorithms when n_minhash_bits of all participating sets do not match. |
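Because each flag value is a distinct power of two, the flags can be treated as a bit mask. The sketch below assumes that flags are combined with bitwise OR. The `HLLWriteFlags` class is a hypothetical stand-in defined here for illustration, not a type from the Aerospike Python client.

```python
from enum import IntFlag


class HLLWriteFlags(IntFlag):
    """Hypothetical mirror of the flag table above."""
    CREATE_ONLY = 0x01
    UPDATE_ONLY = 0x02
    NO_FAIL = 0x04
    ALLOW_FOLD = 0x08


# Create the bin only if it does not exist, and do not fail the whole
# set of operations if this one violates the policy.
flags = HLLWriteFlags.CREATE_ONLY | HLLWriteFlags.NO_FAIL

print(int(flags), HLLWriteFlags.NO_FAIL in flags, HLLWriteFlags.ALLOW_FOLD in flags)
```

Each client library exposes its own constants for these values, so check the library's policy documentation for the actual names.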
Modify operations
Adds values to HLL set. If HLL bin does not exist, use n_index_bits to create HLL bin.
add(policy, bin_name, items, n_index_bits)
Adds values to HLL set. If HLL bin does not exist, use n_index_bits and n_minhash_bits to create HLL bin.
add_mh(policy, bin_name, items, n_index_bits, n_minhash_bits)
Adds values to HLL set. The HLL bin must already exist.
update(policy, bin_name, items)
| Name | Type | Description |
|---|---|---|
policy | library_specific | HLL modify policy. |
bin_name | string | Name of bin. |
n_index_bits | integer | Number of index bits. Must be between 4 and 16 inclusive. |
n_minhash_bits | integer | Number of minhash bits. Must be between 4 and 51 inclusive. |
Returns: integer

Add user IDs to an audience-segment HLL. If the bin does not yet exist it is created with 10 index bits. The return value is the estimated number of elements that caused the internal count to change (new unique members). Adding the same user ID again does not increase the count.

Java

    // Add user IDs to a "visitors" HLL, creating it with 10 index bits if needed
    List<Value> users = Arrays.asList(
        Value.get("user-1001"),
        Value.get("user-1002"),
        Value.get("user-1003"));

    Record record = client.operate(null, key,
        HLLOperation.add(HLLPolicy.Default, "visitors", users, 10));
    // record.getInt("visitors") → estimated new unique items added

Python

    # Add user IDs to a "visitors" HLL, creating it with 10 index bits if needed
    _, _, bins = client.operate(key, [
        hll_operations.hll_add(
            "visitors",
            ["user-1001", "user-1002", "user-1003"],
            index_bit_count=10,
        )
    ])
    # bins["visitors"] → estimated new unique items added

Go

    users := []as.Value{
        as.NewStringValue("user-1001"),
        as.NewStringValue("user-1002"),
        as.NewStringValue("user-1003"),
    }

    record, err := client.Operate(nil, key,
        as.HLLAddOp(as.DefaultHLLPolicy(), "visitors", users, 10, -1),
    )
    // record.Bins["visitors"] → estimated new unique items added

C

    as_arraylist items;
    as_arraylist_inita(&items, 3);
    as_arraylist_append_str(&items, "user-1001");
    as_arraylist_append_str(&items, "user-1002");
    as_arraylist_append_str(&items, "user-1003");

    as_operations ops;
    as_operations_inita(&ops, 1);
    as_operations_hll_add(&ops, "visitors", NULL, NULL, (as_list*)&items, 10);

    as_record* rec = NULL;
    aerospike_key_operate(&as, &err, NULL, &key, &ops, &rec);

C#

    // Add user IDs to a "visitors" HLL, creating it with 10 index bits if needed
    IList users = new List<Value> {
        Value.Get("user-1001"),
        Value.Get("user-1002"),
        Value.Get("user-1003")
    };

    Record record = client.Operate(null, key,
        HLLOperation.Add(HLLPolicy.Default, "visitors", users, 10));
    // record.GetInt("visitors") → estimated new unique items added

Node.js

    const Aerospike = require('aerospike')
    const hll = Aerospike.hll

    const result = await client.operate(key, [
      hll.add('visitors', ['user-1001', 'user-1002', 'user-1003'], 10)
    ])
    // result.bins.visitors → estimated new unique items added

fold(bin_name, n_index_bits)
Folds the HLL bin to the specified n_index_bits.
Fails if existing HLL has n_minhash_bits set to non-zero.
| Name | Type | Description |
|---|---|---|
bin_name | string | Name of bin. |
n_index_bits | integer | Number of index bits. Must be between 4 and 16 inclusive. |
Returns: none

Reduce the precision of an HLL bin from its current index bit count down to a lower value. This shrinks storage at the cost of wider error bounds. Folding is useful when combining HLLs that were created with different index bit counts: fold the higher one down before calling set_union. Folding is irreversible and fails if the HLL has minhash bits set.

Java

    // Fold a 12-bit HLL down to 8 index bits
    Record record = client.operate(null, key,
        HLLOperation.fold("visitors", 8));

Python

    # Fold a 12-bit HLL down to 8 index bits
    _, _, bins = client.operate(key, [
        hll_operations.hll_fold("visitors", index_bit_count=8)
    ])

Go

    record, err := client.Operate(nil, key,
        as.HLLFoldOp("visitors", 8),
    )

C

    as_operations ops;
    as_operations_inita(&ops, 1);
    as_operations_hll_fold(&ops, "visitors", NULL, 8);

    as_record* rec = NULL;
    aerospike_key_operate(&as, &err, NULL, &key, &ops, &rec);

C#

    // Fold a 12-bit HLL down to 8 index bits
    Record record = client.Operate(null, key,
        HLLOperation.Fold("visitors", 8));

Node.js

    const Aerospike = require('aerospike')
    const hll = Aerospike.hll

    const result = await client.operate(key, [
      hll.fold('visitors', 8)
    ])

Initializes or resets a standard HyperLogLog.
init(policy, bin_name, n_index_bits)
Initializes or resets a HyperLogLog with minhash bits (see HyperMinHash) to improve accuracy of intersection and similarity estimates.
init(policy, bin_name, n_index_bits, n_minhash_bits)
| Name | Type | Description |
|---|---|---|
policy | library_specific | HLL modify policy. |
bin_name | string | Name of bin. |
n_index_bits | integer | Number of index bits. It must be between 4 and 16 inclusive. |
n_minhash_bits | integer | Number of minhash bits. It must be between 4 and 51 inclusive. |
hll_bits + minhash_bits | integer | The sum of index and minhash bits. It must be less than or equal to 64. |
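The constraints in the table above can be checked on the client before sending an init or add request. This is a minimal sketch; `validate_hll_bits` is a hypothetical helper, and it assumes that n_minhash_bits = 0 is also accepted (meaning a plain HyperLogLog with no minhash), as the Error Bounds and Performance sections suggest.

```python
def validate_hll_bits(n_index_bits: int, n_minhash_bits: int = 0) -> None:
    """Raise ValueError if the bit settings violate the documented ranges."""
    if not 4 <= n_index_bits <= 16:
        raise ValueError("n_index_bits must be between 4 and 16 inclusive")
    # Assumption: 0 means "no minhash"; otherwise the 4..51 range applies.
    if n_minhash_bits != 0 and not 4 <= n_minhash_bits <= 51:
        raise ValueError("n_minhash_bits must be 0 or between 4 and 51 inclusive")
    if n_index_bits + n_minhash_bits > 64:
        raise ValueError("n_index_bits + n_minhash_bits must not exceed 64")


validate_hll_bits(10)       # plain HyperLogLog
validate_hll_bits(14, 20)   # HyperMinHash-style settings
```

The server enforces these limits regardless; a local check only gives an earlier and clearer error.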
Returns: none

Create an HLL bin on a record that represents an ad-campaign audience segment. Use 10 index bits for roughly 3% relative error on cardinality estimates. After initialization the bin is empty (count 0) and ready to receive members via add.

Java

    // Initialize an HLL bin with 10 index bits
    Record record = client.operate(null, key,
        HLLOperation.init(HLLPolicy.Default, "visitors", 10));

Python

    # Initialize an HLL bin with 10 index bits
    _, _, bins = client.operate(key, [
        hll_operations.hll_init("visitors", index_bit_count=10)
    ])

Go

    record, err := client.Operate(nil, key,
        as.HLLInitOp(as.DefaultHLLPolicy(), "visitors", 10, -1),
    )

C

    as_operations ops;
    as_operations_inita(&ops, 1);
    as_operations_hll_init(&ops, "visitors", NULL, NULL, 10);

    as_record* rec = NULL;
    aerospike_key_operate(&as, &err, NULL, &key, &ops, &rec);

C#

    // Initialize an HLL bin with 10 index bits
    Record record = client.Operate(null, key,
        HLLOperation.Init(HLLPolicy.Default, "visitors", 10));

Node.js

    const Aerospike = require('aerospike')
    const hll = Aerospike.hll

    const result = await client.operate(key, [
      hll.init('visitors', 10)
    ])

refresh_count(bin_name)
Updates the cached count (if stale) and returns the count. For relative error see Error Bounds.
| Name | Type | Description |
|---|---|---|
bin_name | string | Name of bin. |
Returns: integer

Java

    Record record = client.operate(null, key,
        HLLOperation.refreshCount("visitors"));
    long count = record.getLong("visitors");

Python

    _, _, bins = client.operate(key, [
        hll_operations.hll_refresh_count("visitors")
    ])
    count = bins["visitors"]

Go

    record, err := client.Operate(nil, key,
        as.HLLRefreshCountOp("visitors"),
    )
    count := record.Bins["visitors"]

C

    as_operations ops;
    as_operations_inita(&ops, 1);
    as_operations_hll_refresh_count(&ops, "visitors", NULL);

    as_record* rec = NULL;
    aerospike_key_operate(&as, &err, NULL, &key, &ops, &rec);
    int64_t count = as_record_get_int64(rec, "visitors", 0);

C#

    Record record = client.Operate(null, key,
        HLLOperation.RefreshCount("visitors"));
    long count = record.GetLong("visitors");

Node.js

    const Aerospike = require('aerospike')
    const hll = Aerospike.hll

    const result = await client.operate(key, [
      hll.refreshCount('visitors')
    ])
    const count = result.bins.visitors

set_union(policy, bin_name, hlls)
Sets union of specified list of HLLs with HLL bin.
| Name | Type | Description |
|---|---|---|
policy | library_specific | HLL modify policy. |
bin_name | string | Name of bin. |
hlls | list | List of HLL objects. |
Returns: none

Merge one or more external HLL values into a bin. A common pattern is to read the raw HLL bytes from another record (for example, a different audience segment) and merge them into the current record’s HLL. After set_union, the bin contains every member from all contributing HLLs, and get_count returns the estimated cardinality of the combined set.
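To illustrate why a union of HLLs behaves like an HLL built from all members directly, here is a toy textbook HyperLogLog in pure Python. It is a teaching sketch under stated assumptions (SHA-256 as the hash, 6-bit style rank registers, the standard bias constant), not Aerospike's implementation or storage format, and `ToyHLL` is a hypothetical name.

```python
import hashlib
import math


class ToyHLL:
    """Textbook HyperLogLog; assumes n_index_bits >= 7 for the alpha constant."""

    def __init__(self, n_index_bits: int = 10):
        self.p = n_index_bits
        self.m = 1 << n_index_bits
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        # Take 64 hash bits: the top p bits pick a register, the rest give the rank.
        h = int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)
        rest = h & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1  # position of first 1 bit
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)  # small-range correction
        return raw

    def union(self, other: "ToyHLL") -> "ToyHLL":
        # A union is the register-wise maximum of two same-sized sketches.
        merged = ToyHLL(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged


a, b, direct = ToyHLL(), ToyHLL(), ToyHLL()
for i in range(10000):
    a.add(f"user-{i}")
for i in range(5000, 15000):
    b.add(f"user-{i}")
for i in range(15000):
    direct.add(f"user-{i}")

merged = a.union(b)
print(round(merged.count()), merged.registers == direct.registers)
```

The merged sketch has exactly the same registers as the one built from all 15,000 members, so overlapping members are not double-counted. This is also why both inputs need the same number of index bits, or need to be folded to a common size first.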
Java

    // Read the HLL from a second segment record
    Record other = client.get(null, otherKey, "visitors");
    Value.HLLValue otherHll = other.getHLLValue("visitors");

    // Merge it into the current record's HLL
    Record record = client.operate(null, key,
        HLLOperation.setUnion(HLLPolicy.Default, "visitors",
            Arrays.asList(otherHll)));

Python

    # Read the HLL from a second segment record
    _, _, other = client.get(other_key)
    other_hll = other["visitors"]

    # Merge it into the current record's HLL
    _, _, bins = client.operate(key, [
        hll_operations.hll_set_union("visitors", [other_hll])
    ])

Go

    otherRec, err := client.Get(nil, otherKey, "visitors")
    otherHll := otherRec.Bins["visitors"].(as.HLLValue)

    record, err := client.Operate(nil, key,
        as.HLLSetUnionOp(as.DefaultHLLPolicy(), "visitors",
            []as.HLLValue{otherHll}),
    )

C

    // Read HLL bytes from another record
    as_record* other = NULL;
    aerospike_key_get(&as, &err, NULL, &other_key, &other);
    as_bytes* other_hll = as_record_get_bytes(other, "visitors");

    as_arraylist hll_list;
    as_arraylist_inita(&hll_list, 1);
    as_arraylist_append_bytes(&hll_list, other_hll);

    as_operations ops;
    as_operations_inita(&ops, 1);
    as_operations_hll_set_union(&ops, "visitors", NULL, NULL, (as_list*)&hll_list);

    as_record* rec = NULL;
    aerospike_key_operate(&as, &err, NULL, &key, &ops, &rec);

C#

    // Read the HLL from a second segment record
    Record other = client.Get(null, otherKey, "visitors");
    Value.HLLValue otherHll = new Value.HLLValue(
        (byte[])other.GetValue("visitors"));

    // Merge it into the current record's HLL
    Record record = client.Operate(null, key,
        HLLOperation.SetUnion(HLLPolicy.Default, "visitors",
            new List<Value.HLLValue> { otherHll }));

Node.js

    const Aerospike = require('aerospike')
    const hll_ops = Aerospike.hll

    // Read the HLL from a second segment record
    const other = await client.get(otherKey, ['visitors'])
    const otherHll = other.bins.visitors

    // Merge it into the current record's HLL
    const result = await client.operate(key, [
      hll_ops.setUnion('visitors', [otherHll])
    ])

Read operations
describe(bin_name)
List containing the HLL bin’s configured n_index_bits and n_minhash_bits.
| Name | Type | Description |
|---|---|---|
bin_name | string | Name of bin. |
Returns: list

Java

    Record record = client.operate(null, key,
        HLLOperation.describe("visitors"));
    List<?> desc = record.getList("visitors");
    long indexBits = (long)desc.get(0);
    long minhashBits = (long)desc.get(1);

Python

    _, _, bins = client.operate(key, [
        hll_operations.hll_describe("visitors")
    ])
    index_bits, minhash_bits = bins["visitors"]

Go

    record, err := client.Operate(nil, key,
        as.HLLDescribeOp("visitors"),
    )
    desc := record.Bins["visitors"].([]interface{})
    indexBits := desc[0]   // n_index_bits
    minhashBits := desc[1] // n_minhash_bits

C

    as_operations ops;
    as_operations_inita(&ops, 1);
    as_operations_hll_describe(&ops, "visitors", NULL);

    as_record* rec = NULL;
    aerospike_key_operate(&as, &err, NULL, &key, &ops, &rec);
    as_list* desc = as_record_get_list(rec, "visitors");
    int64_t index_bits = as_list_get_int64(desc, 0);
    int64_t minhash_bits = as_list_get_int64(desc, 1);

C#

    Record record = client.Operate(null, key,
        HLLOperation.Describe("visitors"));
    IList desc = record.GetList("visitors");
    long indexBits = (long)desc[0];
    long minhashBits = (long)desc[1];

Node.js

    const Aerospike = require('aerospike')
    const hll = Aerospike.hll

    const result = await client.operate(key, [
      hll.describe('visitors')
    ])
    const [indexBits, minhashBits] = result.bins.visitors

get_count(bin_name)
Estimate of the number of unique entries in the HLL set. For relative error see Error Bounds.
| Name | Type | Description |
|---|---|---|
bin_name | string | Name of bin. |
Returns: integer

After adding user IDs with add, read the estimated cardinality. The relative error depends on n_index_bits. For example, 10 bits gives roughly 3.25% relative error. get_count uses the cached count and does not recompute it; call refresh_count first if the bin has been modified since the last count.

Java

    Record record = client.operate(null, key,
        HLLOperation.getCount("visitors"));
    long estimated = record.getLong("visitors");

Python

    _, _, bins = client.operate(key, [
        hll_operations.hll_get_count("visitors")
    ])
    estimated = bins["visitors"]

Go

    record, err := client.Operate(nil, key,
        as.HLLGetCountOp("visitors"),
    )
    estimated := record.Bins["visitors"]

C

    as_operations ops;
    as_operations_inita(&ops, 1);
    as_operations_hll_get_count(&ops, "visitors", NULL);

    as_record* rec = NULL;
    aerospike_key_operate(&as, &err, NULL, &key, &ops, &rec);
    int64_t estimated = as_record_get_int64(rec, "visitors", 0);

C#

    Record record = client.Operate(null, key,
        HLLOperation.GetCount("visitors"));
    long estimated = record.GetLong("visitors");

Node.js

    const Aerospike = require('aerospike')
    const hll = Aerospike.hll

    const result = await client.operate(key, [
      hll.getCount('visitors')
    ])
    const estimated = result.bins.visitors

get_intersect_count(bin_name, hlls)
Estimate of the number of elements that would be contained by the intersection of these HLL objects and the HLL bin. For relative error see Error Bounds.
| Name | Type | Description |
|---|---|---|
bin_name | string | Name of bin containing an HLL value. |
hlls | list | List of HLL objects. If HLL minhash bits are 0, maximum 2 objects in list, otherwise may be greater than 2. |
Returns: integer

Given two audience-segment records, “basketball fans” and “Warriors fans”, estimate how many users appear in both. Read the raw HLL bytes from the second record and pass them to get_intersect_count on the first. The result is the estimated cardinality of the intersection. When n_minhash_bits is 0, the list may contain at most 2 HLL objects.
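One generic way to derive an intersection size from count estimates alone is the inclusion-exclusion identity |A ∩ B| = |A| + |B| − |A ∪ B|. The sketch below shows that identity and why it becomes unreliable for small overlaps. It is not a description of the server's internal algorithm, and the function name and numbers are hypothetical.

```python
def intersect_estimate(count_a: float, count_b: float, union_count: float) -> float:
    """Inclusion-exclusion: |A ∩ B| = |A| + |B| - |A ∪ B|, clamped at zero."""
    return max(0.0, count_a + count_b - union_count)


# Two segments of one million users each, with a true overlap of 10,000.
true_union = 1_990_000
exact = intersect_estimate(1_000_000, 1_000_000, true_union)

# A 1% overestimate of the union is larger than the overlap itself,
# so the derived intersection collapses to zero.
skewed = intersect_estimate(1_000_000, 1_000_000, true_union * 1.01)

print(exact, skewed)
```

Small relative errors in the large counts become large relative errors in a small intersection, which is consistent with the Error Bounds note that intersection estimates are unstable when n_minhash_bits = 0.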
Java

    // Read the Warriors-fans HLL
    Record warriors = client.get(null, warriorsKey, "visitors");
    Value.HLLValue warriorsHll = warriors.getHLLValue("visitors");

    // Estimate overlap with basketball-fans HLL
    Record record = client.operate(null, basketballKey,
        HLLOperation.getIntersectCount("visitors",
            Arrays.asList(warriorsHll)));
    long overlap = record.getLong("visitors");

Python

    # Read the Warriors-fans HLL
    _, _, warriors = client.get(warriors_key)
    warriors_hll = warriors["visitors"]

    # Estimate overlap with basketball-fans HLL
    _, _, bins = client.operate(basketball_key, [
        hll_operations.hll_get_intersect_count("visitors", [warriors_hll])
    ])
    overlap = bins["visitors"]

Go

    warriorsRec, err := client.Get(nil, warriorsKey, "visitors")
    warriorsHll := warriorsRec.Bins["visitors"].(as.HLLValue)

    record, err := client.Operate(nil, basketballKey,
        as.HLLGetIntersectCountOp("visitors", []as.HLLValue{warriorsHll}),
    )
    overlap := record.Bins["visitors"]

C

    // Read the Warriors-fans HLL
    as_record* warriors = NULL;
    aerospike_key_get(&as, &err, NULL, &warriors_key, &warriors);
    as_bytes* warriors_hll = as_record_get_bytes(warriors, "visitors");

    as_arraylist hll_list;
    as_arraylist_inita(&hll_list, 1);
    as_arraylist_append_bytes(&hll_list, warriors_hll);

    as_operations ops;
    as_operations_inita(&ops, 1);
    as_operations_hll_get_intersect_count(&ops, "visitors", NULL, (as_list*)&hll_list);

    as_record* rec = NULL;
    aerospike_key_operate(&as, &err, NULL, &basketball_key, &ops, &rec);
    int64_t overlap = as_record_get_int64(rec, "visitors", 0);

C#

    // Read the Warriors-fans HLL
    Record warriors = client.Get(null, warriorsKey, "visitors");
    Value.HLLValue warriorsHll = new Value.HLLValue(
        (byte[])warriors.GetValue("visitors"));

    // Estimate overlap with basketball-fans HLL
    Record record = client.Operate(null, basketballKey,
        HLLOperation.GetIntersectCount("visitors",
            new List<Value.HLLValue> { warriorsHll }));
    long overlap = record.GetLong("visitors");

Node.js

    const Aerospike = require('aerospike')
    const hll = Aerospike.hll

    // Read the Warriors-fans HLL
    const warriors = await client.get(warriorsKey, ['visitors'])
    const warriorsHll = warriors.bins.visitors

    // Estimate overlap with basketball-fans HLL
    const result = await client.operate(basketballKey, [
      hll.getIntersectCount('visitors', [warriorsHll])
    ])
    const overlap = result.bins.visitors

get_similarity(bin_name, hlls)
Estimate of the similarity (or Jaccard Index) of these HLL objects and the HLL bin. For absolute error see Error Bounds.
| Name | Type | Description |
|---|---|---|
bin_name | string | Name of bin containing an HLL value. |
hlls | list | List of HLL objects. If HLL minhash bits are 0, maximum 2 objects in the list, otherwise may be greater than 2. |
Returns: float

The Jaccard similarity index is |A ∩ B| / |A ∪ B|, ranging from 0 (disjoint) to 1 (identical). Use get_similarity to estimate how much two audience segments overlap relative to their combined size. Higher accuracy requires non-zero n_minhash_bits at initialization time; see Error Bounds.

Java

    Record other = client.get(null, otherKey, "visitors");
    Value.HLLValue otherHll = other.getHLLValue("visitors");

    Record record = client.operate(null, key,
        HLLOperation.getSimilarity("visitors", Arrays.asList(otherHll)));
    double similarity = record.getDouble("visitors");

Python

    _, _, other = client.get(other_key)
    other_hll = other["visitors"]

    _, _, bins = client.operate(key, [
        hll_operations.hll_get_similarity("visitors", [other_hll])
    ])
    similarity = bins["visitors"]

Go

    otherRec, err := client.Get(nil, otherKey, "visitors")
    otherHll := otherRec.Bins["visitors"].(as.HLLValue)

    record, err := client.Operate(nil, key,
        as.HLLGetSimilarityOp("visitors", []as.HLLValue{otherHll}),
    )
    similarity := record.Bins["visitors"]

C

    as_record* other = NULL;
    aerospike_key_get(&as, &err, NULL, &other_key, &other);
    as_bytes* other_hll = as_record_get_bytes(other, "visitors");

    as_arraylist hll_list;
    as_arraylist_inita(&hll_list, 1);
    as_arraylist_append_bytes(&hll_list, other_hll);

    as_operations ops;
    as_operations_inita(&ops, 1);
    as_operations_hll_get_similarity(&ops, "visitors", NULL, (as_list*)&hll_list);

    as_record* rec = NULL;
    aerospike_key_operate(&as, &err, NULL, &key, &ops, &rec);

C#

    Record other = client.Get(null, otherKey, "visitors");
    Value.HLLValue otherHll = new Value.HLLValue(
        (byte[])other.GetValue("visitors"));

    Record record = client.Operate(null, key,
        HLLOperation.GetSimilarity("visitors",
            new List<Value.HLLValue> { otherHll }));
    double similarity = record.GetDouble("visitors");

Node.js

    const Aerospike = require('aerospike')
    const hll = Aerospike.hll

    const other = await client.get(otherKey, ['visitors'])
    const otherHll = other.bins.visitors

    const result = await client.operate(key, [
      hll.getSimilarity('visitors', [otherHll])
    ])
    const similarity = result.bins.visitors

get_union(bin_name, hlls)
Returns an HLL object that is the union of all specified HLL objects in the hlls list with the HLL bin.
| Name | Type | Description |
|---|---|---|
bin_name | string | Name of bin. |
hlls | list | List of HLL objects. |
Returns: HLL

Unlike set_union, get_union does not modify the bin. It returns a new HLL value representing the union of the bin and the provided HLL list. The returned bytes can be used client-side, for example passed to another record’s set_union, or used locally with get_union_count in the same operate call.

Java

    Record other = client.get(null, otherKey, "visitors");
    Value.HLLValue otherHll = other.getHLLValue("visitors");

    Record record = client.operate(null, key,
        HLLOperation.getUnion("visitors", Arrays.asList(otherHll)));
    Value.HLLValue unionHll = record.getHLLValue("visitors");

Python

    _, _, other = client.get(other_key)
    other_hll = other["visitors"]

    _, _, bins = client.operate(key, [
        hll_operations.hll_get_union("visitors", [other_hll])
    ])
    union_hll = bins["visitors"]  # raw HLL bytes

Go

    otherRec, err := client.Get(nil, otherKey, "visitors")
    otherHll := otherRec.Bins["visitors"].(as.HLLValue)

    record, err := client.Operate(nil, key,
        as.HLLGetUnionOp("visitors", []as.HLLValue{otherHll}),
    )
    unionHll := record.Bins["visitors"].(as.HLLValue)

C

    as_record* other = NULL;
    aerospike_key_get(&as, &err, NULL, &other_key, &other);
    as_bytes* other_hll = as_record_get_bytes(other, "visitors");

    as_arraylist hll_list;
    as_arraylist_inita(&hll_list, 1);
    as_arraylist_append_bytes(&hll_list, other_hll);

    as_operations ops;
    as_operations_inita(&ops, 1);
    as_operations_hll_get_union(&ops, "visitors", NULL, (as_list*)&hll_list);

    as_record* rec = NULL;
    aerospike_key_operate(&as, &err, NULL, &key, &ops, &rec);
    as_bytes* union_hll = as_record_get_bytes(rec, "visitors");

C#

    Record other = client.Get(null, otherKey, "visitors");
    Value.HLLValue otherHll = new Value.HLLValue(
        (byte[])other.GetValue("visitors"));

    Record record = client.Operate(null, key,
        HLLOperation.GetUnion("visitors",
            new List<Value.HLLValue> { otherHll }));
    byte[] unionHll = (byte[])record.GetValue("visitors");

Node.js

    const Aerospike = require('aerospike')
    const hll = Aerospike.hll

    const other = await client.get(otherKey, ['visitors'])
    const otherHll = other.bins.visitors

    const result = await client.operate(key, [
      hll.getUnion('visitors', [otherHll])
    ])
    const unionHll = result.bins.visitors // raw HLL Buffer

get_union_count(bin_name, hlls)
Estimate of the number of elements that would be contained by the union of these HLL objects and the HLL bin. For relative error see Error Bounds.
| Name | Type | Description |
|---|---|---|
bin_name | string | Name of bin. |
hlls | list | List of HLL objects. |
Returns: integer

Given two audience-segment HLLs, for example “basketball fans” and “hat enthusiasts”, estimate the total number of distinct users across both segments. This is the estimated cardinality of the union, useful for ad-campaign reach projections. The bin itself is not modified.

Java

    Record other = client.get(null, otherKey, "visitors");
    Value.HLLValue otherHll = other.getHLLValue("visitors");

    Record record = client.operate(null, key,
        HLLOperation.getUnionCount("visitors", Arrays.asList(otherHll)));
    long totalReach = record.getLong("visitors");

Python

    _, _, other = client.get(other_key)
    other_hll = other["visitors"]

    _, _, bins = client.operate(key, [
        hll_operations.hll_get_union_count("visitors", [other_hll])
    ])
    total_reach = bins["visitors"]

Go

    otherRec, err := client.Get(nil, otherKey, "visitors")
    otherHll := otherRec.Bins["visitors"].(as.HLLValue)

    record, err := client.Operate(nil, key,
        as.HLLGetUnionCountOp("visitors", []as.HLLValue{otherHll}),
    )
    totalReach := record.Bins["visitors"]

C

    as_record* other = NULL;
    aerospike_key_get(&as, &err, NULL, &other_key, &other);
    as_bytes* other_hll = as_record_get_bytes(other, "visitors");

    as_arraylist hll_list;
    as_arraylist_inita(&hll_list, 1);
    as_arraylist_append_bytes(&hll_list, other_hll);

    as_operations ops;
    as_operations_inita(&ops, 1);
    as_operations_hll_get_union_count(&ops, "visitors", NULL, (as_list*)&hll_list);

    as_record* rec = NULL;
    aerospike_key_operate(&as, &err, NULL, &key, &ops, &rec);
    int64_t total_reach = as_record_get_int64(rec, "visitors", 0);

C#

    Record other = client.Get(null, otherKey, "visitors");
    Value.HLLValue otherHll = new Value.HLLValue(
        (byte[])other.GetValue("visitors"));

    Record record = client.Operate(null, key,
        HLLOperation.GetUnionCount("visitors",
            new List<Value.HLLValue> { otherHll }));
    long totalReach = record.GetLong("visitors");

Node.js

    const Aerospike = require('aerospike')
    const hll = Aerospike.hll

    const other = await client.get(otherKey, ['visitors'])
    const otherHll = other.bins.visitors

    const result = await client.operate(key, [
      hll.getUnionCount('visitors', [otherHll])
    ])
    const totalReach = result.bins.visitors