# Naming conventions

A data model is an application-level contract. Aerospike has no server-enforced schema, so the team agrees on namespace, set, key format, bin names, and bin types so that every client reads and writes the same logical structure. This page covers the conventions that keep that contract clear and maintainable.

## Bin naming

Bin names in Aerospike have a hard limit of **15 characters**. This is a storage constraint, not a configuration option — names that exceed it are rejected.

Start with a descriptive name and abbreviate only when you hit the limit. The goal is for a bin name to be recognizable in logs, tooling output, and ad-hoc queries without requiring a lookup table.

Common abbreviations:

| Full name | Abbreviated | Notes |
| --- | --- | --- |
| `created_at_ms` | `created_at_ms` | 13 characters, fits as-is |
| `updated_at_ms` | `updated_at_ms` | 13 characters, fits as-is |
| `publish_date_ms` | `pub_date_ms` | 11 characters; abbreviate the word whose short form is least ambiguous (the full name sits exactly at the 15-character limit) |
| `last_modified_ms` | `last_mod_ms` | 11 characters |
| `follower_count` | `follower_cnt` | 12 characters |
| `notification_type` | `notif_type` | 10 characters |

When a name does not fit, abbreviate systematically: drop vowels from the longest word first, then truncate. Keep the suffix (such as `_ms` or `_cnt`) intact because it carries type information.

Avoid single-character or highly ambiguous names like `t`, `ts`, `d`, or `val`. These save a few characters but make the data model opaque to anyone reading it for the first time.
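
The rules above can be checked mechanically, for example in a unit test or a pre-commit hook. The following is a minimal sketch, not an Aerospike API: `check_bin_name` and the `AMBIGUOUS_NAMES` deny-list are hypothetical names, and the check assumes ASCII bin names so that one character equals one byte.

```python
MAX_BIN_NAME_LEN = 15  # limit stated in this section

# Hypothetical deny-list; extend it to match your team's conventions.
AMBIGUOUS_NAMES = {"t", "ts", "d", "val"}


def check_bin_name(name: str) -> list[str]:
    """Return a list of problems with a bin name (empty if it passes)."""
    problems = []
    # Assumes ASCII names, so len() counts both characters and bytes.
    if len(name) > MAX_BIN_NAME_LEN:
        problems.append(
            f"{name!r} is {len(name)} characters; limit is {MAX_BIN_NAME_LEN}"
        )
    if len(name) == 1 or name in AMBIGUOUS_NAMES:
        problems.append(f"{name!r} is too ambiguous")
    return problems
```

Running a check like this against the bin inventory catches over-long names before the first write is rejected.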

## Identifier formats

Choose the identifier format per entity based on two factors: the access pattern and how heavily the identifier is repeated in the data model.

**Cleartext composite** identifiers (for example, `alice-1742468400000`) keep their components visible. You can construct them from known values, inspect them in logs, and debug with command-line tools. The trade-off is variable length.

**Hashed** identifiers (for example, a 16-character hex digest) are compact and fixed-size. When an identifier appears as a map key or list element in thousands of records, fixed-size identifiers simplify capacity planning and reduce storage.

The decision driver is **repetition pressure**:

-   If an identifier mostly serves as a record key and appears in few other places, cleartext composites give better operational visibility at negligible cost.
-   If an identifier appears as a nested value across many records (for example, a `comment_id` embedded in list bins across thousands of post records), a fixed-size hash saves measurable space at scale.

Choose the format per entity, not globally. A `user_id` that serves primarily as a record key can be cleartext, while a `comment_id` that appears in CDT structures across many records might benefit from a compact hash.
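
As an illustration of the cleartext composite format, here is a minimal sketch that builds and parses identifiers shaped like `alice-1742468400000`. The function names and the choice to split from the right are assumptions made for this example, not part of any Aerospike client.

```python
DELIMITER = "-"


def make_composite_id(user: str, created_at_ms: int) -> str:
    """Join the components in a fixed order with a fixed delimiter."""
    return f"{user}{DELIMITER}{created_at_ms}"


def parse_composite_id(composite: str) -> tuple[str, int]:
    """Recover the components from a composite identifier."""
    # Split from the right so a delimiter inside the user part is tolerated.
    user, sep, timestamp = composite.rpartition(DELIMITER)
    if not sep or not user:
        raise ValueError(f"not a composite identifier: {composite!r}")
    return user, int(timestamp)
```

Because the components stay visible, the same identifier can be rebuilt from known values during debugging without a lookup.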

### Deterministic hashing

When you hash identifiers, make the inputs and algorithm explicit in the data model contract:

-   **Algorithm**: for example, xxHash64 producing a 16-character hex string.
-   **Canonical input**: the exact string that gets hashed, with field order, delimiter, and encoding specified. For example, `xxHash64("alice" + "-" + "1742468400000")`.
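-   **Collision policy**: what happens if two inputs produce the same hash. For 16-character hex (64-bit), the birthday bound puts a 50% chance of at least one collision at roughly 5 billion identifiers, so collisions are extremely unlikely at moderate scale, but the contract must still state how they are detected and handled.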
-   **Cross-client parity**: all clients (application servers, batch jobs, migration tools) must use the same algorithm and canonical input format.
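
To put the collision policy in numbers, the standard birthday approximation can be evaluated for the expected identifier count. This is a rough sketch that assumes the hash output is uniformly distributed; `collision_probability` is a hypothetical helper name.

```python
import math


def collision_probability(n: int, bits: int = 64) -> float:
    """Approximate chance of at least one collision among n hashed identifiers.

    Uses the birthday approximation p = 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    space = 2.0 ** bits
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-n * (n - 1) / (2.0 * space))
```

Under this approximation, a million 64-bit identifiers give a probability on the order of 10^-8, while a billion give a few percent.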

Document these decisions in the data model contract so that any new client or service can produce the same identifiers independently.
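
A minimal sketch of a deterministic identifier function follows. Note the substitution: xxHash64 is not in the Python standard library, so this example uses BLAKE2b truncated to 8 bytes as a stand-in that also yields a 16-character hex string. The function name and the UTF-8 encoding choice are assumptions for illustration; the output will not match xxHash64.

```python
import hashlib


def hashed_id(user: str, created_at_ms: int) -> str:
    """Return a 16-character hex identifier for the canonical input."""
    # Canonical input: fixed field order, "-" delimiter, UTF-8 encoding.
    canonical = f"{user}-{created_at_ms}"
    # Stand-in for xxHash64: BLAKE2b with an 8-byte (64-bit) digest.
    digest = hashlib.blake2b(canonical.encode("utf-8"), digest_size=8)
    return digest.hexdigest()
```

Whatever algorithm the contract names, every client needs to reproduce the same canonical string byte for byte, which is why the delimiter and encoding are spelled out in the code.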

## Timestamp contracts

Timestamps appear throughout Aerospike data models — as bin values, map keys, ID components, and TTL inputs. Inconsistent timestamp formats cause subtle bugs that are difficult to diagnose.

Adopt one canonical unit per field and encode it in the bin name:

**Epoch milliseconds as an integer** with the `_ms` suffix is the recommended default. Millisecond precision is sufficient for most application timestamps, and integer storage is compact and sortable.

```plaintext
created_at_ms   →  1742468400000   (epoch milliseconds, immutable)
updated_at_ms   →  1742472000000   (epoch milliseconds, mutable)
expires_at_ms   →  1742554800000   (epoch milliseconds, TTL input)
```

For time ranges and windows, use paired suffixes:

```plaintext
valid_from_ms   →  1742468400000
valid_to_ms     →  1742554800000
```

Or use `_until_ms` for a single upper bound:

```plaintext
locked_until_ms →  1742472000000
```
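
The following sketch shows one way to produce epoch-millisecond values and to derive a record TTL from an `expires_at_ms` value. Record TTL is expressed in seconds, so the conversion rounds up. The helper names are hypothetical, and the sketch refuses non-positive results because some TTL values (such as 0 and -1) carry special meaning in Aerospike clients.

```python
import time


def now_ms() -> int:
    """Current time as integer epoch milliseconds."""
    return time.time_ns() // 1_000_000


def ttl_seconds(expires_at_ms: int, current_ms: int) -> int:
    """Convert an absolute expiry in epoch ms to a relative TTL in seconds."""
    remaining_ms = expires_at_ms - current_ms
    if remaining_ms <= 0:
        raise ValueError("expires_at_ms is not in the future")
    # Ceiling division so the record never expires earlier than requested.
    return -(-remaining_ms // 1000)
```

With the example values above, `ttl_seconds(1742554800000, 1742468400000)` corresponds to exactly one day.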

### What to avoid

-   **Mixed formats.** Do not store epoch milliseconds in one bin and an ISO-8601 string in another bin for the same semantic concept. If a `created_at_ms` bin holds an integer and a `created_at` bin elsewhere holds a string, every consumer must know which format to expect.
-   **Ambiguous units.** A bin named `timestamp` or `ts` does not tell the reader whether the value is seconds, milliseconds, or microseconds. Use the `_ms` suffix (or `_us`, `_ns`) to make the unit explicit.
-   **String timestamps for sortable data.** Strings are lexicographically sortable only if they are in a fixed-width format (for example, ISO-8601 with zero-padded fields). Integer epoch values are numerically sortable without format constraints.
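
When upstream systems supply ISO-8601 strings, one option is to normalize them to epoch milliseconds at the edge so that only integers reach the bins. This is a sketch under the assumption that inputs always carry an explicit UTC offset; `iso_to_epoch_ms` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iso_to_epoch_ms(text: str) -> int:
    """Convert an ISO-8601 string with a UTC offset to integer epoch ms."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # A naive timestamp has no defined instant, so reject it.
        raise ValueError("timestamp must carry a UTC offset")
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)
```

The sorting pitfall is easy to reproduce: `sorted(["9", "10"])` places `"10"` first because strings compare character by character, while the integers 9 and 10 sort as expected.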

### Record metadata versus bin values

Aerospike records carry server-managed metadata, including last-update time (LUT) and time to live (TTL). Each has specific semantics:

-   **LUT** updates automatically on every write. You cannot set it to an arbitrary value.
-   **TTL** controls automatic record expiration. It applies to the entire record, not to individual bins.

If the application needs an immutable “created at” timestamp, a user-controlled “last modified” timestamp, or a per-field expiry value, store those as bin values. Do not rely on LUT or TTL as substitutes for application-managed timestamps, because their semantics differ from what most applications expect.
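
A sketch of application-managed timestamps: the helper below prepares the bins for a write, keeping `created_at_ms` fixed once it exists and refreshing `updated_at_ms` each time. `stamp_bins` is a hypothetical function that works on plain dictionaries; how the bins are written, and how the read-modify-write is made atomic, depends on your client and is not shown.

```python
from typing import Optional


def stamp_bins(bins: dict, created_at_ms: Optional[int], now_ms: int) -> dict:
    """Return a copy of bins with application-managed timestamps applied.

    created_at_ms is the value already stored on the record, or None
    when the record is being created.
    """
    stamped = dict(bins)
    # Immutable: set once on creation, then carried forward unchanged.
    stamped["created_at_ms"] = now_ms if created_at_ms is None else created_at_ms
    # Mutable: refreshed on every application-level modification.
    stamped["updated_at_ms"] = now_ms
    return stamped
```

Unlike LUT, `updated_at_ms` changes only when the application decides a modification is meaningful, for example not on a maintenance write that touches TTL alone.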

## Putting it together

A well-documented data model contract includes:

-   **Namespace and set names** for each entity type.
-   **Key format** with delimiters, component order, and type (string or integer). See [Key design](https://aerospike.com/docs/develop/data-modeling/key-design) for patterns.
-   **Bin inventory** with name, type, mutability (immutable / slowly-changing / frequently-changing), and unit for any numeric value.
-   **Identifier specification** per entity: cleartext or hashed, algorithm if hashed, canonical input format.
-   **Timestamp unit** for every time-valued bin, explicit in the name.
-   **Collection schemas** for CDT bins: element structure, expected cardinality range, and ordering. See [Collections](https://aerospike.com/docs/develop/data-modeling/collections) for patterns.

This contract is not an Aerospike feature — there is no server-side enforcement. It is a team agreement, documented alongside the application code, that prevents drift as the system evolves.
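
One way to keep the contract next to the application code is to express it as data. The sketch below is an assumption about how a team might do this, not an Aerospike feature: the class names, the `app` namespace, the `posts` set, and the listed bins are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinSpec:
    name: str
    type: str
    mutability: str  # immutable / slowly-changing / frequently-changing
    unit: str = ""


@dataclass(frozen=True)
class EntityContract:
    namespace: str
    set_name: str
    key_format: str
    bins: tuple[BinSpec, ...]

    def violations(self) -> list[str]:
        """Bin names that exceed the 15-character limit."""
        return [b.name for b in self.bins if len(b.name) > 15]


# Hypothetical contract for a "post" entity.
POST = EntityContract(
    namespace="app",
    set_name="posts",
    key_format="<author>-<created_at_ms> (string)",
    bins=(
        BinSpec("created_at_ms", "integer", "immutable", "epoch ms"),
        BinSpec("updated_at_ms", "integer", "frequently-changing", "epoch ms"),
        BinSpec("follower_cnt", "integer", "frequently-changing", "count"),
    ),
)
```

A test that asserts `POST.violations() == []` turns part of the team agreement into something continuous integration can check.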