Data masking

This page describes Aerospike’s data masking feature, available in Database 8.1.1.

Data masking overview

Data masking obfuscates sensitive data, such as Personally Identifiable Information (PII), by applying dynamic transformations. Because this is dynamic data masking, the underlying data stored on disk remains unaltered and the data is obscured in real time as queries read it. To protect the data on the storage devices themselves, you can use encryption at rest.

Users without appropriate privileges are served the masked value, while authorized users maintain access to the unmodified original data. This provides a critical layer of protection against accidental data exposure within application results.

Administrators define a masking rule by selecting a data masking function. This function applies a dynamic transformation to a specific bin of a dataset located within a designated namespace.

Defining a data masking rule automatically enables it for all users except for those granted permission to unmask it.
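
The behavior described above can be pictured with a minimal Python sketch. It is not Aerospike's implementation or client API: `MaskingRule`, `read_bin`, and the `can_unmask` flag are hypothetical names chosen for illustration, and the namespace, dataset, and bin names are made up. The point it shows is that the stored value is never modified; only the value returned to the reader changes.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple

# Hypothetical model of a masking rule: one masking function bound to one bin
# of one dataset in one namespace. These names are illustrative assumptions.
@dataclass(frozen=True)
class MaskingRule:
    namespace: str
    dataset: str
    bin_name: str
    mask: Callable[[Any], Any]

RuleKey = Tuple[str, str, str]

def read_bin(rules: Dict[RuleKey, MaskingRule], namespace: str, dataset: str,
             bin_name: str, stored_value: Any, can_unmask: bool) -> Any:
    """Return the value a reader sees; the stored value itself is untouched."""
    rule = rules.get((namespace, dataset, bin_name))
    if rule is None or can_unmask:
        # No rule on this bin, or the reader was granted permission to unmask.
        return stored_value
    # A defined rule applies to every reader without the unmask permission.
    return rule.mask(stored_value)

rules = {
    ("test", "customers", "ssn"): MaskingRule(
        "test", "customers", "ssn", mask=lambda v: "*" * len(v)
    )
}
stored = "123-45-6789"
masked_view = read_bin(rules, "test", "customers", "ssn", stored, can_unmask=False)
clear_view = read_bin(rules, "test", "customers", "ssn", stored, can_unmask=True)
```

In this sketch the rule is active as soon as it exists in `rules`, which mirrors the default-on behavior: a reader has to be explicitly allowed to see the original value.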

The critical role of data masking

Data masking addresses several critical issues in database security and management.

  • Regulatory compliance and security

    Data masking is essential for adhering to various data privacy regulations and industry standards, such as:

    • GDPR (General Data Protection Regulation): Requires protecting the personal data of EU residents.

    • Reserve Bank of India’s Cyber Security Framework: Mandates unauthorized access prevention for Indian financial institutions.

    • HIPAA (Health Insurance Portability and Accountability Act): Mandates the protection of patient health information (PHI).

    • PCI DSS (Payment Card Industry Data Security Standard): Requires protecting cardholder data.

    Applying a dynamic transformation to specific bins of PII, such as credit card details, Social Security numbers, or Aadhaar numbers, mitigates the risk of unauthorized data exposure and the legal penalties that can follow. Data masking ensures that sensitive data within a namespace is protected during real-time access.

  • Risk mitigation

    Masking minimizes the risk of a data breach when dealing with non-production environments.

    • In development or testing environments, developers, testers, and third-party vendors often need realistic data to ensure applications function correctly.

    • If they use copies of the real production data, any security oversight, accidental exposure, or malicious intent in these non-production systems could lead to a catastrophic data leak.

    • Data masking makes the data useless to an attacker even if the non-production environment is compromised, because the masked data is not the actual sensitive information.

  • Maintaining data utility and integrity

    Unlike simply deleting or scrambling data randomly, effective data masking techniques ensure the masked data remains structurally and contextually similar to the original. This is vital because:

    • Application testing

      The application still receives the correct data type and length, for example a properly formatted email address or a valid-looking 16-digit credit card number, so features, performance, and logic can be tested accurately without using real customer data.

    • Enabling outsourcing and collaboration

      When working with external vendors, contractors, or offshore teams for development, testing, or analytics, data masking lets you share the data they need for their tasks without granting them access to actual customer secrets. It enables secure collaboration and ensures proprietary or private customer data stays within the organization’s control.

When to use data masking

  • You need to copy data to a less secure environment (dev, test, vendor).
  • You need to prevent PII from being visible to human users and unauthorized applications.

Masking methods

Aerospike offers redactions and constant replacements for masking.

Redactions

Redaction replaces sensitive data elements with a placeholder character, such as an asterisk ‘*’ or an ‘X’, or with a defined, irreversible pattern. It can be applied fully or partially.

You would use redactions in the following cases:

  • Verification

    Used when an application needs to display a partial identifier for a user to confirm their identity, such as “Is this your credit card ending in 1234?”.

  • Compliance

    Used to meet regulatory requirements (like PCI DSS) that mandate obscuring the majority of a card number from display, while showing only the last few digits as shown in the following example.
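
As a rough illustration of partial and full redaction, the following Python sketch masks a card number while keeping its last four digits visible. The `redact` helper here is hypothetical: the parameter names follow the built-in function's `position`, `length`, and `value` parameters, but the zero-based indexing and the clamping behavior are assumptions made for this sketch, not documented semantics.

```python
def redact(text: str, position: int, length: int, value: str = "*") -> str:
    """Replace `length` characters of `text`, starting at index `position`,
    with the placeholder character `value`.

    Assumed semantics for illustration only: `position` is zero-based and
    the redacted range is clamped to the end of the string.
    """
    if len(value) != 1:
        raise ValueError("value must be a single placeholder character")
    if position < 0 or length < 0:
        raise ValueError("position and length must be non-negative")
    start = min(position, len(text))
    end = min(start + length, len(text))
    # The output keeps the original length, so downstream formatting still works.
    return text[:start] + value * (end - start) + text[end:]

card = "4111111111111111"  # a well-known test card number, not real data

# Partial redaction: hide the first 12 digits, keep the last 4 for verification.
partially_redacted = redact(card, position=0, length=12)

# Full redaction: hide every digit, using 'X' as the placeholder.
fully_redacted = redact(card, position=0, length=len(card), value="X")
```

Because the placeholder replaces characters one for one, the redacted string cannot be reversed to the original but still has the shape an application expects.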

Constants (nulling out / fixed value)

The constant masking method replaces every original value in a sensitive bin with the exact same predefined fixed value or string across all records.

You would use constants in the following cases:

  • Complete removal of PII

    Used when a sensitive field, such as the customer’s name in the example of constants, provides no utility for the testing or development environment, and its complete, irreversible removal is the goal.

  • Eliminating variance

    Used to ensure that a field is ignored by downstream analysis. For instance, replacing all “State of Origin” values with “UNKNOWN” ensures testing won’t be skewed by regional data variance.

  • Unneeded Data

    Ideal for scenarios where data is mandatory for the database schema but is irrelevant or dangerous in the non-production environment, such as replacing the LastLoginIPAddress with 127.0.0.1 or 0.0.0.0.

Example of constants
---
value: Peter Griffin
masking rule: Replace with John Doe
result: John Doe
---
value: Male
masking rule: Replace with ""
result: ""
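
The constant method above can be sketched in a few lines of Python. The `constant` factory is a hypothetical helper, not an Aerospike API, and the second record is made-up sample data; the sketch only shows that every original value maps to the same fixed replacement regardless of its content.

```python
def constant(replacement):
    """Build a masking function that ignores its input and always returns
    `replacement`. Hypothetical helper for illustration only."""
    def mask(_original):
        return replacement
    return mask

mask_name = constant("John Doe")   # replace every name with the same string
mask_gender = constant("")         # null out the field with an empty string

# Illustrative records; the stored records are left unchanged.
records = [
    {"name": "Peter Griffin", "gender": "Male"},
    {"name": "Lois Griffin", "gender": "Female"},
]

# What a reader without unmask permission would see under these two rules.
masked = [
    {"name": mask_name(r["name"]), "gender": mask_gender(r["gender"])}
    for r in records
]
```

Since all records collapse to one value, the masked bin carries no information about the original, which is the intent when the field has no use outside production.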

Masking events, such as creating or modifying rules, are logged by the masking info command to the audit trail under the action masking.

When an operation violates a masking rule, the detail string records information about the attempt.

Built-in masking functions

Function | Parameter               | Type
constant | value                   | string
constant | value                   | integer
constant | value                   | float
constant | value                   | boolean
redact   | position, length, value | string