Redis vs Aerospike
The table below outlines key technology differences between Aerospike 7.1 and Redis Enterprise 7.2.
Architecture
Valkey (an open source Redis fork) is an in-memory data structure store.
Redis Enterprise is a commercially supported implementation of Redis that adds distributed database capabilities.
Redis is a single instance, single-threaded in-memory data structure store that can be used as a database, cache, message broker, and streaming engine. This approach delivers good performance on a single in-memory instance, but clustering is only supported with an add-on process called the Redis cluster proxy (from Redis Enterprise).
Redis 6.0 added multi-threaded network I/O, providing some parallel processing of data ingest, but it is turned off by default. This limited multi-threading forgoes many of the advantages of multi-core processors.
A distributed NoSQL database designed for high-scale, high-throughput, low-latency transaction processing through its patented Hybrid Memory Architecture.
Aerospike is a distributed, multi-threaded database. It is engineered to get the most out of compute, network, and I/O resources.
Aerospike focuses on the minute details of CPU, shared memory, processor cache, and NVMe.
Its Hybrid Memory Architecture™ (HMA) enables the use of flash storage (SSD, PCIe, NVMe) in parallel to perform reads at sub-millisecond latencies at very high throughput (100K to 1M+ TPS), even under heavy write loads. This enables enormous vertical scaleup at a 5x lower total cost of ownership (TCO) than pure RAM.
Aerospike bypasses the operating system’s file system and directly utilizes a flash device as a block device using a custom data layout.
Aerospike uses multi-threading extensively to achieve maximum parallelism for all major functions and exploits the power of modern multi-core processors.
Implications
Aerospike’s natively distributed architecture, memory efficiency, storage flexibility and optimization, and extensive multi-threading combine to deliver maximum performance, scale, and throughput. Redis can function well as a single instance but has rudimentary multi-threading and high overhead for clustering.
Data models
Key-value plus document, time-series, vector
Redis is a key-value store where data is stored as key-value pairs. There is no notion of records, logical records, or bins. That means that any data operation must be performed on individual key-value pairs. Redis Enterprise supports additional data models, such as document, spatial, search, time series, and vector.
However, the limitations of Redis’ key-value implementation of different data models pose ongoing scalability and latency challenges.
Multi-model (key-value, document, graph)
Aerospike distributes and stores sets of records contained in namespaces (akin to “databases”). Each record has a key and named fields (“bins”). A bin can contain different types of data from the simple (e.g., integer, string) to the complex (e.g., nested data in a list of maps of sets).
This provides considerable schema flexibility.
Aerospike supports fast processing of Collection Data Types (CDTs) which contain any number of scalar data type elements and nesting elements such as lists and maps. Nested data types can be treated exactly as a document.
Aerospike’s structure enables users to model, store, and manage key-value data, JSON documents, and graph data with high performance at scale.
Implications
Aerospike’s support for bins enables developers to process multiple items within a bin (e.g., first name, last name, account balance) with a single transaction. In contrast, in Redis, this is three separate transactions, which hampers Redis’ performance and scalability in any real-world transactional scenario. Aerospike’s CDTs are the high-performance foundation for supporting document and graph data types.
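To make this concrete, here is a minimal sketch using the Aerospike and Redis Python clients; the hosts, namespace ("test"), and set name ("accounts") are illustrative assumptions, not values from this document.

```python
# Illustrative sketch only: hosts, namespace ("test"), and set ("accounts") are assumptions.
import aerospike
import redis

# Aerospike: one record with several bins (including a nested CDT) written in a single operation.
as_client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()
key = ("test", "accounts", "user42")
as_client.put(key, {
    "first": "Ada",
    "last": "Lovelace",
    "balance": 1250,
    "profile": {"tier": "gold", "tags": ["vip", "eu"]},  # nested map/list (CDT) bin
})
as_client.close()

# Redis: the same flat fields stored as plain key-value pairs take three separate commands.
r = redis.Redis(host="127.0.0.1", port=6379)
r.set("user42:first", "Ada")
r.set("user42:last", "Lovelace")
r.set("user42:balance", 1250)
```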
Caching
Flexible caching in the cloud or on-prem
Redis works well as a cache because of in-memory performance. It is offered solely as a caching product on Azure (Azure Cache for Redis). AWS offers Amazon ElastiCache for Redis as a fully managed caching service.
LRU eviction relies on sampling-based approximation, which is less precise.
Easily configured as a high-speed cache (in-memory only)
Flexible configuration options enable Aerospike to act as (1) a high-speed cache to an existing relational or non-relational data store to promote real-time data access and offload work from the back end, or (2) an ultra-fast real-time data management platform with persistence.
Aerospike can store all data and indexes in DRAM, all data and indexes on SSDs (Flash), or a combination of the two (data on SSDs and indexes in DRAM).
Aerospike 7.1 adds precise LRU eviction, which efficiently persists record read activity and uses a configurable read-touch percentage to extend a record’s TTL when it is read.
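As a hedged sketch of the cache pattern with the Aerospike Python client (the namespace "cache", set name, and TTL value are assumptions for illustration):

```python
# Cache-style write with a per-record TTL; the namespace "cache" is an assumption.
import aerospike

client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()
key = ("cache", "sessions", "session:abc123")

# TTL is given in seconds; the record expires (or is evicted) once it elapses.
client.put(key, {"user_id": 42, "cart_items": 3}, meta={"ttl": 3600})

# Reads return the cached value until the TTL expires.
(_, meta, bins) = client.get(key)
client.close()
```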
Implications
Aerospike’s flexible deployment options include an in-memory cache and the ability to turn off persistence to gain performance. That flexibility enables firms to standardize on its platform for a wide range of applications, reducing the overall complexity of their data management infrastructures and avoiding the need to cross-train staff on multiple technologies.
Aerospike 7.1 enables consolidation of hundreds of legacy caching solutions into a single, cost-effective Aerospike cluster.
Many firms initially deploy Aerospike as a cache to promote real-time access to other systems of record or systems of engagement and later leverage Aerospike’s built-in persistence features to support additional applications.
Clustering
Single instances connected via Redis cluster proxy
A Redis database (RDB) is a single-instance in-memory database. Redis Enterprise provides the Redis cluster proxy to make a group of RDB instances act as a shared-nothing cluster.
When a client request comes in, it is up to the proxy to decide which server has the desired data, and it forwards the request to that server.
This adds significant latency to each transaction and is a major limiting factor when scaling workloads.
Distributed database
Aerospike was designed from the outset as a distributed database. All nodes are aware of each other.
Aerospike features a Smart Client™ that automatically distributes both data and traffic to all the nodes in a cluster.
Automatic client load balancing improves both performance and correctness. This ensures a single hop to data for the lowest possible latencies.
Implications
RDB is a single instance in-memory database. To make Redis act like a distributed database, separate child processes for cluster proxy, persistence, replication, and consistency are required. Each of these processes competes for CPU, memory, cache, IO, and network, adding processing overhead and latency. Aerospike, by contrast, is a multi-threaded, highly performant distributed database with these capabilities natively developed.
Storage model
In-memory, with optional persistence to a file system
Redis is an in-memory data store, meaning that data is not automatically stored in a non-volatile medium like SSDs. This means that should a Redis server fail, there is potential for data loss.
Redis offers optional persistence methods through database snapshotting or append-only files (AOF) in a file system on disk.
Snapshots (RDB) are problematic when data loss can’t be tolerated. If you are snapshotting your database every hour, a node failure could cost you an hour's worth of transactions. You can configure the AOF to minimize data loss, but doing so negatively impacts performance and scalability.
Custom, high-performance format with storage engine choice
Designed as an operational distributed database, Aerospike employs a specialized log-structured file system that optimizes using flash drives (SSDs) as primary data storage without heavy dependence on RAM for performance.
Aerospike uses SSDs as raw devices, employing a proprietary log-structured file system rather than relying on file system, block, and page cache layers for I/O. This provides distinct performance and reliability advantages. Aerospike uses raw-device block writes and lightweight defragmentation.
Firms can choose from hybrid memory (indexes in DRAM, data on flash), all DRAM (i.e., in-memory), all-flash, and, as of Aerospike 7.1, NVMe-compatible, low-cost cloud block storage and common enterprise network-attached storage (NAS).
Implications
Aerospike’s HMA approach leads to greater predictability and reliability without experiencing longer latencies that result from Redis’ coarse-grained approach to persistence. Delivering near-RAM levels of performance with SSDs means Aerospike clusters have fewer nodes. Clusters with fewer nodes have lower TCO, easier maintainability, and higher reliability.
Client access
Proxy model reroutes client requests
As described above, Redis was not designed as a distributed database. The Redis cluster proxy is an add-on process to the running RDB instance. Each RDB instance is separate and stand-alone.
Data sharded across a Redis “cluster” must first be routed to the correct node, and then the operation is executed on the RDB instance.
Smart Client knows where every data element is, minimizing network “hops”
Aerospike clients (Smart Clients) are aware of the data distribution across the cluster; therefore, they can send their requests directly to the node responsible for storing the data. This reduces the network hops required, improving the database's performance.
Aerospike’s Smart Client layer maintains a dynamic partition map that identifies the primary node for each partition. This enables the client layer to route read or write requests directly to the correct nodes without any additional network hops.
Since Aerospike writes synchronously to all copies of the data, there is no delay for a quorum read across the cluster to get a consistent version of the data.
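A minimal sketch of that pattern with the Aerospike Python client (the seed address, namespace, and set are illustrative): the application supplies only seed hosts, and the client library discovers the partition map and routes each request to the owning node.

```python
# Sketch: the client is configured with seed hosts only; it learns the cluster's
# partition map and sends each request directly to the node that owns the key.
import aerospike

config = {"hosts": [("10.0.0.1", 3000)]}   # one or more seed nodes (illustrative address)
client = aerospike.client(config).connect()

key = ("test", "profiles", "user42")
client.put(key, {"country": "DE"})          # routed straight to the partition's primary node
(_, _, bins) = client.get(key)              # single network hop, no proxy in the path
client.close()
```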
Implications
Aerospike’s Smart Client™ approach ensures a single hop to the data and reduces overall network traffic, making a significant positive impact on latency and performance. While each RDB node may be fast, relying on a separate cluster proxy adds latency and may require multiple hops to data, which negatively impacts scale.
Scalability options
Horizontal scaling is the only option, and it disruptively reshuffles most of the data
Redis’ largely single-threaded operation negates the benefits of multi-core processors, so scaling up is limited.
Scaling out with Redis (i.e., adding cluster nodes) is a labor-intensive process, where data must be resharded when nodes are added. Redis provides some automation of the resharding of data.
Vertical and horizontal scaling. Automatic data movement and automatic rebalancing when adding nodes
Aerospike handles massive customer growth without having to add many nodes based on its SSD-friendly Hybrid Memory Architecture and flexible configuration options.
Data distribution
Aerospike distributes data across cluster nodes automatically. When a node joins or leaves the cluster, the data is automatically redistributed.
Aerospike automatically shards data into 4,096 logical partitions evenly distributed across cluster nodes. When cluster nodes are added, partitions from other cluster nodes are automatically migrated to the new node, resulting in very little data movement.
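The sketch below illustrates only the general idea of hash-based assignment into 4,096 partitions; Aerospike's actual implementation derives partition IDs from its own fixed key digest and partition map, so the hash used here is just a stand-in.

```python
# Conceptual sketch: map a key to one of 4,096 partitions via a digest.
# This is NOT Aerospike's exact algorithm; SHA-1 here is a stand-in for illustration.
import hashlib

N_PARTITIONS = 4096

def partition_id(set_name: str, user_key: str) -> int:
    digest = hashlib.sha1(f"{set_name}:{user_key}".encode()).digest()
    # Use 12 bits of the digest -> a stable partition ID in [0, 4095].
    return int.from_bytes(digest[:2], "little") & 0x0FFF

print(partition_id("profiles", "user42"))  # the same key always lands in the same partition
```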
Vertical scaling
Aerospike exploits SSDs, multi-core CPUs, and other hardware and networking technologies to scale vertically, making efficient use of these resources. You can scale by adding SSDs. (There is no theoretical upper limit for the amount of resources that can be added to a single node.)
Horizontal scaling
The data distribution is random; therefore, scaling an Aerospike cluster in and out results in less data movement. Also, Aerospike follows a peer-to-peer architecture, meaning that no node has a special role. In this architecture, the load gets equally distributed across all the cluster nodes.
Implications
For a new deployment, the Aerospike cluster will have fewer nodes and thus lower TCO, easier maintainability, and higher reliability. Additionally, when expanding existing deployments, Aerospike’s horizontal scaling is far less disruptive, with less risk of downtime and data loss.
Consistency
(CAP Theorem approach) High Availability (AP) mode only
Redis has supported a number of ad hoc replication mechanisms, but none guaranteed anything stronger than causal consistency. RedisRaft aims to bring strict serializability to Redis through the Raft consensus algorithm.
Redis supports eventual consistency and has invented consistency terms called “near strong consistency” and “strong eventual consistency.” Eventual consistency is weak consistency.
Redis documentation states: “Redis Cluster does not guarantee strong consistency. In practical terms, this means that under certain conditions, it is possible that Redis Cluster will lose writes that were acknowledged by the system to the client.”
RedisRaft is a Redis module that implements the Raft Consensus Algorithm, making it possible to create strongly consistent clusters of Redis servers. The Raft algorithm is provided by a standalone Raft library.
Both High Availability (AP) mode and Strong Consistency (CP) mode
Aerospike provides distinct high availability (AP) and strong consistency (CP) modes to support varying customer use cases.
The independent Jepsen testing in 2018 validated Aerospike’s claim of strong consistency. Strong consistency mode prevents stale reads, dirty reads, and data loss.
With strong consistency, each write can be configured for linearizability (provides a single linear view among all clients) or session consistency (an individual process sees the sequential set of updates).
Each read can be configured for linearizability, session consistency, allow replica reads (read from master or any replica of data), and allow unavailable responses (read from the master, any replica, or an unavailable partition).
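A hedged sketch of per-read configuration with the Aerospike Python client; it assumes a namespace running in strong-consistency mode, and the policy constant names should be verified against your client version.

```python
# Sketch: choosing linearizable vs. session-consistent reads per transaction.
# Assumes a strong-consistency namespace; verify constant names in your client docs.
import aerospike

client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()
key = ("test", "ledger", "account42")

# Linearizable read: a single linear view among all clients.
linearized = {"read_mode_sc": aerospike.POLICY_READ_MODE_SC_LINEARIZE}
(_, _, bins) = client.get(key, policy=linearized)

# Session consistency: this process sees its own updates in order.
session = {"read_mode_sc": aerospike.POLICY_READ_MODE_SC_SESSION}
(_, _, bins) = client.get(key, policy=session)
client.close()
```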
Aerospike’s roster-based consistency algorithm requires only N+1 copies to handle N failures. Aerospike automatically detects and responds to many network and node failures to ensure high availability of data without requiring operator intervention.
High Availability (AP)/partition tolerant mode emphasizes data availability over consistency in failure scenarios.
Modes and consistency levels can be defined at the namespace level (database level).
Implications
While data consistency requirements vary among applications, having a data platform that can easily enforce strict consistency while maintaining strong runtime performance gives firms a distinct edge, enabling them to use one platform to satisfy a wider range of business needs.
Aerospike’s approach to data consistency enables firms to use its platform as a system of engagement or system of record without introducing application complexity or excessive runtime overhead.
Redis documentation explicitly states that it can’t guarantee strong consistency.
Fault tolerance
High availability managed with Redis Sentinel or Redis Cluster
Redis Sentinel enables you to monitor the state of your Redis cluster. It will alert you if a master node fails and helps you manage automated failover.
Sentinel is not scalable as it is not a clustering solution, so all writes go to the master. Also, it does not support sharding.
Redis Cluster is a clustering solution but does not have robust HA features or strong consistency.
Two replicas for High Availability. Automated failovers.
Aerospike users typically maintain replication factor two (RF2) (one primary, one replica copy) for high availability.
Aerospike automatically detects and responds to many network and node failures (“self-healing”) to ensure high availability of data and prevent data loss or performance degradation, without requiring operator intervention.
Implications
Achieving high availability with fewer replicas reduces operational costs, hardware costs, and energy consumption. Automated recovery from common failures promotes 24x7 operations, helps firms achieve target SLAs, and reduces operational complexity.
Multi-site support
Primary-replica architecture
Redis has a primary-replica topology, which supports multiple replicas from a primary instance. Redis replicas are read-only, so all writes must be done on the primary, negatively impacting performance and scalability. Only asynchronous replication is supported.
Redis supports Redis Active-Active Geo-Distribution, which implements conflict-free replicated data types (CRDTs) that do not provide strong consistency. CRDTs are relatively unproven in enterprise production environments.
Automated data replication across multiple clusters; a single cluster can span multiple sites
Supports multi-site deployments for varied business purposes: Continuous operations, fast localized data access, disaster recovery, global transaction processing, edge-to-core computing, and more.
Asynchronous active-active replication (via Cross Datacenter Replication, XDR): replication lag is sub-millisecond to single-digit milliseconds. All or part of the data in two or more independent data centers is replicated asynchronously, and the replication can be one-way or two-way.
The clients can read and write from the data center close to them.
The expected lag between data centers is on the order of a few milliseconds.
XDR also supports selective replication (i.e., data filtering) and performance optimizations to minimize transfer of frequently updated data.
Synchronous active-active replication (via multi-site clustering): A single cluster is formed across multiple data centers. Achieved in part via rack awareness, pegging primary and replica partitions to distinct data centers. Automatically enforces strong data consistency.
The clients can read data from the node close to them; the expected latency will be less than a millisecond. However, the write requests may need to be written in a different data center, which may increase the latency to a few hundred milliseconds.
Implications
Global enterprises require flexible strategies for operating across data centers. Aerospike supports both synchronous and asynchronous replication of data across multiple data centers in a variety of configurations. Firms can configure Aerospike clusters across sites, data centers, availability zones, regions, and even cloud providers simultaneously. This enables applications to customize deployments according to their resilience and availability needs.
Interoperability
(Ecosystem) A range of ready-made connectors available from third parties
ODBC/JDBC and SQL (via CData), Spark-Redis (open source), Tanzu (VMware), Cloud Foundry, Nagios, and Trino (via the Trino website).
Wide range of ready-made connectors available from Aerospike
Performance-optimized connectors for Aerospike are available for many popular open source and third-party offerings, including Kafka, Spark, Presto-Trino, JMS, Pulsar, Event Stream Processing (ESP), and Elasticsearch. These connectors, in turn, provide broader access to Aerospike from popular enterprise tools for business analytics, AI, event processing, and more.
Implications
Aerospike has built connectors to facilitate large-scale data streaming and processing, whereas Redis’ ecosystem integrations have largely focused on helping you deploy.
Persistence options
Persistence is optional, via snapshots or append-only files (AOF); data can also be persisted by another database (e.g., RocksDB or Speedb)
Redis offers optional persistence methods through database snapshotting or append-only files (AOF) in a file system on disk.
Snapshots are problematic when data loss can’t be tolerated. If you are snapshotting your database every hour, a node failure could cost you an hour's worth of transactions. You can configure the AOF with a setting to minimize data loss, but this negatively impacts performance and scalability.
Persisting to another database such as RocksDB or Speedb adds overhead to Redis nodes or requires extra server nodes, adding to operating costs and latency.
Persist to SSD by default, non-persistence for in-memory/caching use cases, or a combination of SSD and memory persistence
Flexible configuration options enable Aerospike to act as (1) an ultra-fast real-time data management platform with persistence, or (2) a high-speed cache to an existing relational or non-relational data store to promote real-time data access and offload work from the back end.
Aerospike can store all data and indexes in DRAM, all data and indexes on SSDs (Flash), or a combination of the two (data on SSDs and indexes in DRAM).
Implications
Aerospike’s flexible deployment options enable firms to standardize on its platform for a wide range of applications, reducing the overall complexity of their data management infrastructures and avoiding the need to cross-train staff on multiple technologies. Many firms initially deploy Aerospike as a cache to promote real-time access to other systems of record or systems of engagement and later leverage Aerospike’s built-in persistence features to support additional applications. Other systems’ persistence options cannot match this performance; with persistence enabled, Aerospike rivals their in-memory performance.
Change Data Capture
Through Redis Data Integration (RDI) product
Redis Data Integration (RDI) creates a data streaming pipeline that mirrors data from an existing database to Redis Enterprise. It is a separate product/process from Redis Enterprise.
RDI’s integrated Change Data Capture (CDC) capability tracks updates in the source (“command”) database’s transaction log, transforms row-level change data into Redis data structures, and replicates the updates to the query database.
Integrated via change notifications with granular data options and automated batch shipments.
Change Data Capture (CDC) is an Aerospike cluster feature.
Granular options for capturing (and replicating) changed data, ranging from full namespaces (databases) to subsets of select records.
Aerospike logs minimal information about each change (not the full record), batching changed data and shipping only the latest version of a record.
Thus, multiple local writes for one record generate only one remote write – an important feature for “hot” data.
Aerospike also provides Change Data notifications to external systems, like Kafka or other databases.
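The following is a purely conceptual sketch (not Aerospike's implementation) of why coalescing changes and shipping only the latest version per record reduces remote writes for hot keys:

```python
# Conceptual sketch only: coalesce multiple local updates to the same record so
# that each shipped batch carries just the latest version per key.
from collections import OrderedDict

pending = OrderedDict()  # key -> latest change (e.g., record generation)

def record_change(key, generation):
    # Later writes to the same key overwrite the earlier pending entry, so a
    # "hot" key contributes only one remote write per shipped batch.
    pending[key] = generation

def ship_batch():
    batch = list(pending.items())
    pending.clear()
    return batch  # one entry per changed record, latest version only

for gen in range(1, 6):
    record_change(("test", "profiles", "user42"), gen)  # five local writes
record_change(("test", "profiles", "user7"), 1)

print(ship_batch())  # two remote writes: user42 (gen 5) and user7 (gen 1)
```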
Implications
Having integrated, optimized Change Data Capture in your database cluster maximizes efficiency. Providing CDC as an add-on feature introduces significant overhead and latency to CDC operations.
Multi-tenancy
Either via multi-instance deployment, containerization, or software multi-tenancy
Multi-tenancy relies on sharding as a method to isolate data in Redis. You can assign multiple shards to a database to meet any data set size or throughput requirements. You may enable persistence, replication, eviction policy, and flash as a RAM extension at the database level.
A shard is an open source Redis instance. Redis, being a single-threaded process, runs on one CPU core.
Various Aerospike server features enable effective multi-tenancy implementations
Aerospike’s key features for multi-tenancy are separate namespaces (databases), role-based access control in conjunction with sets (akin to RDBMS tables), operational rate quotas, and user-specified storage limits to cap data set size.
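A hedged sketch using the Aerospike Python client's admin API; the role, user, namespace, and set names are illustrative, security must be enabled on the cluster, and the exact arguments for attaching rate quotas vary by client version (check the client docs).

```python
# Sketch: scope a role to one tenant's set and grant it to that tenant's application user.
# Names are illustrative; requires an Aerospike cluster with security enabled.
import aerospike

client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect("admin", "admin")

# Privilege limited to namespace "test", set "tenant_a" -- per-tenant data isolation.
privileges = [{"code": aerospike.PRIV_READ_WRITE, "ns": "test", "set": "tenant_a"}]
client.admin_create_role("tenant_a_rw", privileges)
# Newer client/server versions also let you attach read/write rate quotas to the role.

client.admin_create_user("tenant_a_app", "s3cret", ["tenant_a_rw"])
client.close()
```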
Implications
The Aerospike approach provides good isolation, whereas Redis’ own documentation warns, “if a specific customer has requirements for data isolation or unique resource requirements, a single-tenant approach may be more suitable.”
Hardware optimization
Designed for commodity servers
Redis makes no claims of optimizations for hardware platforms. It relies on in-memory performance to deliver low-latency response.
Redis is predominantly single-threaded by design, though multi-threaded network I/O was added in Redis 6 (turned off by default).
Designed to exploit modern hardware and networking technologies
Aerospike is designed and implemented explicitly to exploit advances in modern hardware to maximize runtime performance and cost efficiency.
Aerospike is massively multi-threaded to get the most from today’s multi-core processors.
NVMe, Flash, and SSDs are treated as raw block devices to reduce I/O for the lowest latency, avoiding overhead from standard storage drivers and file systems. Aerospike data structures are partitioned with fine-grained locks to avoid memory contention for more efficient use of multi-core CPUs. Application Device Queues (ADQs) are used with certain networking devices to reduce context switching and keep data in local processor caches.
Implications
Aerospike is designed to minimize latency with comprehensive optimizations on multiple levels. Extensive multi-threading and efficient NVMe operations mean that Aerospike gets the maximum performance from modern server hardware.
Clusters can manage more aggressive workloads and higher data volumes with fewer nodes than the equivalent Redis cluster, reducing operational complexity and TCO.