Naming conventions

A data model is an application-level contract. Aerospike has no server-enforced schema, so the team agrees on namespace, set, key format, bin names, and bin types so that every client reads and writes the same logical structure. This page covers the conventions that keep that contract clear and maintainable.

Bin naming

Bin names in Aerospike have a hard limit of 15 characters. The limit is counted in bytes, so keep bin names to single-byte ASCII characters. This is a storage constraint, not a configuration option: names that exceed it are rejected.

Start with a descriptive name and abbreviate only when you hit the limit. The goal is for a bin name to be recognizable in logs, tooling output, and ad-hoc queries without requiring a lookup table.

Common abbreviations:

| Full name | Abbreviated | Notes |
|---|---|---|
| created_at_ms | created_at_ms | 13 characters, fits as-is |
| updated_at_ms | updated_at_ms | 13 characters, fits as-is |
| publish_date_ms | pub_date_ms | Abbreviate the least ambiguous word |
| last_modified_ms | last_mod_ms | 11 characters |
| follower_count | follower_cnt | 12 characters |
| notification_type | notif_type | 10 characters |

When a name does not fit, abbreviate systematically: drop vowels from the longest word first, then truncate. Keep the suffix (such as _ms or _cnt) intact because it carries type information.

Avoid single-character or highly ambiguous names like t, ts, d, or val. These save a few characters but make the data model opaque to anyone reading it for the first time.
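The naming rules above can be checked mechanically before a bin name reaches the contract. The following is a minimal sketch, not an Aerospike API: the function name `check_bin_name` and the set of rejected names are hypothetical, and the 15-character limit is measured in UTF-8 bytes as a conservative assumption.

```python
MAX_BIN_NAME_BYTES = 15  # limit stated above, measured in bytes (assumption)
AMBIGUOUS_NAMES = {"t", "ts", "d", "val"}  # names the text above warns against


def check_bin_name(name: str) -> list[str]:
    """Return the convention violations for a proposed bin name (empty if none)."""
    problems = []
    size = len(name.encode("utf-8"))
    if size == 0:
        problems.append("name is empty")
    if size > MAX_BIN_NAME_BYTES:
        problems.append(f"{size} bytes exceeds the {MAX_BIN_NAME_BYTES}-byte limit")
    if name in AMBIGUOUS_NAMES or len(name) == 1:
        problems.append("name is too ambiguous to read without a lookup table")
    return problems


for candidate in ("created_at_ms", "last_modified_ms", "last_mod_ms", "ts"):
    print(candidate, check_bin_name(candidate))
```

A check like this could run in a unit test or a code review hook so that over-long or opaque names are caught before any client writes them.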

Identifier formats

Choose the identifier format per entity based on two factors: the access pattern and how heavily the identifier is repeated in the data model.

Cleartext composite identifiers (for example, alice-1742468400000) keep their components visible. You can construct them from known values, inspect them in logs, and debug with command-line tools. The trade-off is variable length.

Hashed identifiers (for example, a 16-character hex digest) are compact and fixed-size. When an identifier appears as a map key or list element in thousands of records, fixed-size identifiers simplify capacity planning and reduce storage.

The decision driver is repetition pressure:

  • If an identifier mostly serves as a record key and appears in few other places, cleartext composites give better operational visibility at negligible cost.
  • If an identifier appears as a nested value across many records (for example, a comment_id embedded in list bins across thousands of post records), a fixed-size hash saves measurable space at scale.

Choose the format per entity, not globally. A user_id that serves primarily as a record key can be cleartext, while a comment_id that appears in CDT structures across many records might benefit from a compact hash.
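Repetition pressure can be estimated with simple arithmetic. The sketch below counts only the raw UTF-8 bytes of each identifier and ignores any storage encoding overhead, so it is a rough comparison rather than a capacity plan. The helper names, the second cleartext identifier, and the placeholder digest are illustrative assumptions.

```python
def id_bytes(identifier: str) -> int:
    """Raw UTF-8 size of an identifier, ignoring storage encoding overhead."""
    return len(identifier.encode("utf-8"))


def repeated_cost(identifier_size: int, references: int) -> int:
    """Total bytes spent storing one identifier `references` times."""
    return identifier_size * references


hashed = "0123456789abcdef"  # placeholder 16-character hex digest
cleartext_ids = (
    "alice-1742468400000",                 # example composite from the text above
    "alexandria.montgomery-1742468400000", # hypothetical longer user name
)

for cleartext in cleartext_ids:
    for references in (1, 1_000, 1_000_000):
        saved = repeated_cost(id_bytes(cleartext), references) - repeated_cost(
            id_bytes(hashed), references
        )
        print(f"{cleartext}: {references:>9} references, about {saved} bytes saved")
```

The per-reference saving depends on how long the cleartext composite is, which is why the choice is made per entity: a short composite repeated rarely gains little from hashing, while a long one repeated across many records gains more.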

Deterministic hashing

When you hash identifiers, make the inputs and algorithm explicit in the data model contract:

  • Algorithm: for example, xxHash64 producing a 16-character hex string.
  • Canonical input: the exact string that gets hashed, with field order, delimiter, and encoding specified. For example, xxHash64("alice" + "-" + "1742468400000").
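  • Collision policy: what happens if two inputs produce the same hash. For 16-character hex (64-bit), the birthday bound puts the chance of any collision among one million identifiers at roughly 3 in 100 million, and it reaches about 50% only near five billion identifiers. Collisions are therefore extremely unlikely at moderate scale, but the contract must still state how one is detected and handled.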
  • Cross-client parity: all clients (application servers, batch jobs, migration tools) must use the same algorithm and canonical input format.

Document these decisions in the data model contract so that any new client or service can produce the same identifiers independently.
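The contract items above (algorithm, canonical input, collision policy) can be sketched in a few lines. This example uses only the Python standard library, so it substitutes BLAKE2b with an 8-byte digest for xxHash64; both produce a 64-bit value rendered as 16 hex characters, but the digests are not interchangeable. The function names, the delimiter rule, and the field order are assumptions for illustration.

```python
import hashlib
import math

DELIMITER = "-"  # assumed delimiter, matching the alice-1742468400000 example


def canonical_input(user: str, created_at_ms: int) -> str:
    """Build the exact string that gets hashed: fixed field order and delimiter."""
    if DELIMITER in user:
        # Assumed rule: reject the delimiter so the input stays unambiguous.
        raise ValueError("user must not contain the delimiter")
    return f"{user}{DELIMITER}{created_at_ms}"


def hashed_id(user: str, created_at_ms: int) -> str:
    """Return a 16-character hex digest (64 bits) of the canonical input.

    BLAKE2b with digest_size=8 is a stand-in here, not xxHash64.
    """
    data = canonical_input(user, created_at_ms).encode("utf-8")
    return hashlib.blake2b(data, digest_size=8).hexdigest()


def collision_probability(n_ids: int, bits: int = 64) -> float:
    """Birthday-bound approximation of at least one collision among n_ids hashes."""
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * 2.0 ** bits))


print(hashed_id("alice", 1742468400000))
print(f"{collision_probability(1_000_000):.2e}")
```

Every client would need the same three pieces (the canonical string builder, the encoding, and the hash function) to reach cross-client parity. The collision estimate gives the contract a number to cite when stating its collision policy.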

Timestamp contracts

Timestamps appear throughout Aerospike data models — as bin values, map keys, ID components, and TTL inputs. Inconsistent timestamp formats cause subtle bugs that are difficult to diagnose.

Adopt one canonical unit per field and encode it in the bin name.

Epoch milliseconds as an integer with the _ms suffix is the recommended default. Millisecond precision is sufficient for most application timestamps, and integer storage is compact and sortable.

created_at_ms → 1742468400000 (epoch milliseconds, immutable)
updated_at_ms → 1742472000000 (epoch milliseconds, mutable)
expires_at_ms → 1742554800000 (epoch milliseconds, TTL input)

For time ranges and windows, use paired suffixes:

valid_from_ms → 1742468400000
valid_to_ms → 1742554800000

Or use _until_ms for a single upper bound:

locked_until_ms → 1742472000000
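A few small helpers keep every client on the same unit. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and how the resulting TTL is handed to a client library is deliberately left out.

```python
import time
from datetime import datetime, timezone
from typing import Optional


def now_ms() -> int:
    """Current time as integer epoch milliseconds."""
    return time.time_ns() // 1_000_000


def ms_to_iso(epoch_ms: int) -> str:
    """Render epoch milliseconds as a UTC ISO-8601 string for logs or debugging."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc).isoformat()


def ttl_seconds(expires_at_ms: int, current_ms: int) -> Optional[int]:
    """Whole seconds until expires_at_ms, rounded up; None if already past."""
    remaining_ms = expires_at_ms - current_ms
    if remaining_ms <= 0:
        return None  # the caller decides how to treat an already-expired value
    return -(-remaining_ms // 1000)  # ceiling division


created_at_ms = 1742468400000
expires_at_ms = 1742554800000
print(ms_to_iso(created_at_ms))
print(ttl_seconds(expires_at_ms, created_at_ms))
```

Returning `None` instead of `0` for an expired value is a design choice in this sketch: it forces the caller to handle the expired case explicitly rather than passing along a number that might carry a different meaning downstream.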

What to avoid

  • Mixed formats. Do not store epoch milliseconds in one bin and an ISO-8601 string in another bin for the same semantic concept. If a created_at_ms bin holds an integer and a created_at bin elsewhere holds a string, every consumer must know which format to expect.
  • Ambiguous units. A bin named timestamp or ts does not tell the reader whether the value is seconds, milliseconds, or microseconds. Use the _ms suffix (or _us, _ns) to make the unit explicit.
  • String timestamps for sortable data. Strings sort chronologically only if they are in a fixed-width format with a single time zone offset (for example, ISO-8601 in UTC with zero-padded fields). Integer epoch values are numerically sortable without format constraints.
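The sorting pitfall is easy to reproduce. The sketch below, using made-up sample values, compares lexicographic and numeric ordering for variable-width numeric strings, then shows the fixed-width ISO-8601 case where string ordering does hold.

```python
# Variable-width numeric strings: lexicographic order differs from numeric order.
epoch_strings = ["999", "1000", "1742468400000"]
epoch_integers = [int(s) for s in epoch_strings]

lexicographic = sorted(epoch_strings)
numeric = [str(v) for v in sorted(epoch_integers)]
print(lexicographic == numeric)  # the two orderings disagree for this input

# Fixed-width, zero-padded ISO-8601 strings in a single offset (UTC here)
# do sort chronologically, which is the exception noted above.
iso_strings = [
    "2025-03-21T11:00:00Z",
    "2025-03-20T11:00:00Z",
    "2025-03-20T12:00:00Z",
]
print(sorted(iso_strings))
```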

Record metadata versus bin values

Aerospike records carry server-managed metadata including last-update time (LUT) and time to live (TTL). These are server-controlled and have specific semantics:

  • LUT updates automatically on every write. You cannot set it to an arbitrary value.
  • TTL controls automatic record expiration. It applies to the entire record, not to individual bins.

If the application needs an immutable “created at” timestamp, a user-controlled “last modified” timestamp, or a per-field expiry value, store those as bin values. Do not rely on LUT or TTL as substitutes for application-managed timestamps, because their semantics differ from what most applications expect.
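One way to keep application-managed timestamps separate from record metadata is to build them into the bin map before every write. The sketch below works on plain dictionaries and does not call any client library; the function names and the `title` bin are hypothetical.

```python
def new_record_bins(payload: dict, now_ms: int) -> dict:
    """Bins for a first write: both timestamps start at the creation time."""
    return {**payload, "created_at_ms": now_ms, "updated_at_ms": now_ms}


def updated_record_bins(existing: dict, changes: dict, now_ms: int) -> dict:
    """Bins for a later write: created_at_ms is preserved, updated_at_ms advances."""
    if "created_at_ms" in changes:
        raise ValueError("created_at_ms is immutable")
    return {**existing, **changes, "updated_at_ms": now_ms}


bins = new_record_bins({"title": "hello"}, now_ms=1742468400000)
bins = updated_record_bins(bins, {"title": "hello, world"}, now_ms=1742472000000)
print(bins)
```

Because the timestamps live in bins, they survive operations that change the server-managed LUT, and the immutability rule is enforced by application code rather than assumed.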

Putting it together

A well-documented data model contract includes:

  • Namespace and set names for each entity type.
  • Key format with delimiters, component order, and type (string or integer). See Key design for patterns.
  • Bin inventory with name, type, mutability (immutable / slowly-changing / frequently-changing), and unit for any numeric value.
  • Identifier specification per entity: cleartext or hashed, algorithm if hashed, canonical input format.
  • Timestamp unit for every time-valued bin, explicit in the name.
  • Collection schemas for CDT bins: element structure, expected cardinality range, and ordering. See Collections for patterns.
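The checklist above can also live next to the application code as data, which makes it possible to lint. The following sketch is one possible shape, not a prescribed format: the class names, field names, and the example namespace, set, and bins are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinSpec:
    name: str
    type: str         # for example "integer", "string", "list", "map"
    mutability: str   # "immutable", "slowly-changing", or "frequently-changing"
    unit: str = ""    # for example "ms" for epoch milliseconds


@dataclass(frozen=True)
class EntityContract:
    namespace: str
    set_name: str
    key_format: str   # delimiters and component order, as free text
    identifier: str   # "cleartext" or "hashed"
    bins: tuple = ()

    def violations(self) -> list[str]:
        """Check bin names against the length limit and the unit-suffix rule."""
        problems = []
        for spec in self.bins:
            if len(spec.name.encode("utf-8")) > 15:
                problems.append(f"{spec.name}: longer than 15 bytes")
            if spec.unit and not spec.name.endswith("_" + spec.unit):
                problems.append(f"{spec.name}: unit '{spec.unit}' missing from name")
        return problems


posts = EntityContract(
    namespace="app",
    set_name="posts",
    key_format="<user>-<created_at_ms>",
    identifier="cleartext",
    bins=(
        BinSpec("created_at_ms", "integer", "immutable", unit="ms"),
        BinSpec("last_modified_ms", "integer", "frequently-changing", unit="ms"),
        BinSpec("expires_at", "integer", "slowly-changing", unit="ms"),
    ),
)
print(posts.violations())
```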

This contract is not an Aerospike feature — there is no server-side enforcement. It is a team agreement, documented alongside the application code, that prevents drift as the system evolves.
