Certifying Flash Devices (SSDs)

The Aerospike database is optimized to run on Flash and SSD devices and is capable, through its Hybrid Memory index and direct device access, of providing high throughput and low latency on Flash.

Over the years, we have found that the exact characteristics of different Flash devices are of crucial interest to Aerospike customers. The most important device characteristic is not captured by any specification sheet: read latency under sustained write load.

Of course, Aerospike can be run in a variety of other configurations, including pure in-memory mode or memory backed with devices for persistence.

For this purpose, we wrote the Aerospike Certification Tool (ACT) and started gathering and publishing results. The source code is available on GitHub, allowing device manufacturers to include ACT in their internal engineering. The rationale behind the test and its core methodology remain unchanged across releases.

This page presents the results we have observed and collected. We update this page with new drives and archive information for drives that are no longer available. These results give you a very specific answer to the question of whether a drive will work in your environment, although your environment may differ from the data points we have collected.

A profile has three fundamental characteristics:

  • Read / write ratio
  • Object size
  • Latency requirements of read operations (detailed below)

The test is run by increasing throughput in steps and checking whether the latency requirement is met; as long as the device stays within the latency requirements, throughput is increased again, until the latency SLA is no longer met.
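In pseudocode terms, the ramp can be sketched as follows. Here `run_act` and `meets_sla` are hypothetical stand-ins for an actual ACT run and its log analysis, and the starting rate and step size are illustrative:

```python
def find_max_certified_rate(run_act, meets_sla, start_tps=6_000, step_tps=3_000):
    """Raise the load step by step; return the highest rate that met the SLA.

    run_act(tps)  -- hypothetical: runs ACT at `tps` and returns its latency histogram
    meets_sla(h)  -- hypothetical: True if histogram h satisfies the latency SLA
    """
    best = 0
    tps = start_tps
    while True:
        histogram = run_act(tps)
        if not meets_sla(histogram):
            return best           # the previous rate is the certified speed
        best = tps
        tps += step_tps           # within SLA: raise the load and try again

# Stand-in example: pretend the device fails the SLA above 12,000 tps.
print(find_max_certified_rate(lambda tps: tps, lambda h: h <= 12_000))  # -> 12000
```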

Be careful using these results if your intended workload varies from a published workload. We make available the tools to allow you to test your intended workload, and testing may be required in your individual case.

Finally, we have also found the results themselves are only a starting point for making a purchase decision.

Two fundamental factors must be considered: the price of the device, and the wear rating (DWPD).

A device that wears fast may not be a bargain in a high-write environment.

A slow but very inexpensive device may outperform a faster but more expensive device if you simply purchase more drives. As individuals have different available discounts, you may need to explore pricing of different drives before making a final decision.
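As a back-of-the-envelope sketch of that tradeoff (all speeds and prices below are hypothetical, not taken from the tables on this page):

```python
def drives_needed(target_tps, drive_tps):
    """Number of drives required to reach a target cluster throughput."""
    return -(-target_tps // drive_tps)   # ceiling division

def cluster_cost(target_tps, drive_tps, drive_price):
    """Total drive cost to reach the target throughput."""
    return drives_needed(target_tps, drive_tps) * drive_price

# Hypothetical comparison: slow-but-cheap vs. fast-but-expensive.
target = 1_500_000                                                  # cluster tps
cheap = cluster_cost(target, drive_tps=300_000, drive_price=400)    # 5 drives
fast = cluster_cost(target, drive_tps=750_000, drive_price=2500)    # 2 drives
print(cheap, fast)  # -> 2000 5000: the slower drive wins on price here
```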

An important calculation must be made: speed at a given capacity. It should not be assumed that a larger drive is faster. In general, more Flash capacity should result in a faster drive, but internal controller and datapath bottlenecks will present themselves. For many manufacturers, 3.2 TB drives are the performance "sweet spot", but we are starting to see 7.0 TB and larger drives offer the best performance-per-price ratio. We have performance measurements for some drives in different capacities.

While we make our raw data and findings available, Aerospike's solutions architects are available to help you size your system. With our years of experience helping deploy clusters both large and small, we'll help you through the process of choosing your next set of hardware.

Guide to these numbers

Aerospike accepts manufacturer-supplied numbers. In those cases, we receive log files from the manufacturer and do our best to validate them. Where we have received values from a manufacturer, we detail the configuration information they supplied; we are often also supplied drives, which are in the process of being tested. In the appendix, we detail what the manufacturer has told us about the test environment.

Aerospike updates ACT periodically to track changes in Aerospike Database storage I/O algorithms. The core case, properly configured, produces very similar results regardless of version. However, the version used is included in test results for completeness. The current version of ACT is 6.4, released in November 2023.

Values for endurance are taken from manufacturers' spec sheets and have not been independently verified.

Operational Database Devices

Aerospike is often used as an operational database for transactions and user data.

This ACT workload has the following characteristics:

  • 67% reads / 33% writes ratio
  • 1.5KB object size
  • Max latency
    • < 5.0% exceed 1 ms
    • < 1.0% exceed 8 ms
    • < 0.1% exceed 64 ms
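The SLA above can be checked mechanically against a run's latency histogram. A minimal sketch, with the exceedance percentages supplied as a plain dict:

```python
# The three-tier latency SLA above, as (threshold_ms, max_pct_exceeding) pairs.
SLA = [(1, 5.0), (8, 1.0), (64, 0.1)]

def meets_sla(exceed_pct, sla=SLA):
    """exceed_pct maps a threshold in ms to the percentage of transactions
    slower than that threshold, as reported in an ACT latency histogram."""
    return all(exceed_pct.get(ms, 0.0) <= limit for ms, limit in sla)

# A run with 4.3% over 1 ms and nothing over 8 ms or 64 ms passes:
print(meets_sla({1: 4.3, 8: 0.0, 64: 0.0}))   # -> True
print(meets_sla({1: 5.2, 8: 0.0, 64: 0.0}))   # -> False: more than 5% over 1 ms
```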

The tables below show the results from the Aerospike Certification Tool (ACT) on some popular flash devices that we have tested and that our customers are using in production with the Aerospike database. The test results are split into six categories, presented in the sections below.

To produce these results, the devices are run at a constant rate over a period of 24 hours. This removes the effect of any caching and represents the long-term (rather than burst) performance of the disk. The steady-state results, i.e. the latency histogram after a number of hours, once the results have stabilized, appear below.

Each column represents a time threshold in milliseconds. The numbers beneath are the percentage of transactions that exceed that threshold. For instance, the Intel DC s3700 (in the historical tables below) had 1.6% of transactions in excess of 1 millisecond.

These criteria may be relaxed by users depending on the need. In some cases, a manufacturer may have sacrificed a little performance to achieve greater consistency.

Your results may vary from these published numbers due to differences in the server, differences in the RAID controller, or even variations between devices of the same model.

PCIe/NVMe-based 3D XPoint

3D XPoint is a next-generation memory technology positioned between DRAM and flash. Practical devices have been brought to market by Intel under the Optane brand. These devices should be considered when low latency under sustained write load is required, as their average latency is substantially lower under write load.

Not only are transactions per second high, but the extraordinarily low latency compares favorably with the NAND Flash drives below.

These devices were tested at the specified speed with a 67% read/33% write ratio of 1.5 KB objects over 24 hours.

| 3D XPoint NVMe Device | PCIe | Trans/sec | >256µs | >512µs | >1ms | >8ms | >64ms | Endurance | ACT | Source |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Intel P5800X 1.6 TB | 4.0 | 2,010,000 | 11.11% | 0.29% | 0.00% | 0.00% | 0.00% | 100 DWPD | 6.1 | Intel |
| Micron X100 750 GB | 3.0 | 1,400,000 | | | 0.00% | 0.00% | 0.00% | 100 DWPD | 6.1 | Micron |
| Intel SSD DC P4800X 375G | 3.0 | 435,000 | | | 0.10% | 0.01% | 0.00% | 30 DWPD | 3.1 | Aerospike |
| Intel SSD DC P4800X 750G | 3.0 | 435,000 | | | 0.06% | 0.00% | 0.00% | 30 DWPD | 4 | Aerospike |

PCIe/NVMe-Based Flash

These devices were tested at the specified speed with a 67% read/33% write ratio of 1.5 KB objects over 24 hours.

| Flash Device | Speed (tps) | >1ms | >8ms | >64ms | Endurance | Notes* | ACT | Source |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| WD Ultrastar DC SN861 7.68 TB | 1,506,000 | 0.73% | 0.00% | 0.00% | 1 DWPD | CP=100 | 6.4 | Western Digital |
| Smart IOPS Data Engine 12.8 TB | 1,080,000 | 3.94% | 0.00% | 0.00% | 3 DWPD | | 6.1 | Aerospike |
| Smart IOPS DataEngine 6.4 TB | 825,000 | 0.30% | 0.00% | 0.00% | 3 DWPD | | 3 | Smart IOPS |
| ScaleFlux CSD 3000 7.68 TB | 801,000 | 3.29% | 0.03% | 0.00% | 2 DWPD | CP=40 | 6.2 | ScaleFlux |
| Micron 9400 Max 12.8 TB | 750,000 | 4.30% | 0.00% | 0.000% | 3 DWPD | | 6.2 | Micron |
| Smart IOPS DataEngine 3.2 TB | 630,000 | 1.50% | 0.01% | 0.00% | 3 DWPD | | 6.2 | Smart IOPS |
| WD Ultrastar DC SN840 3.2 TB | 570,000 | 2.29% | 0.00% | 0.00% | 3 DWPD | | 6.1 | Western Digital |
| ScaleFlux CSD 2000 3.2 TB | 531,000 | 4.82% | 0.07% | 0.00% | 5 DWPD | CP=50 | 5.3 | Aerospike |
| Micron 9300 Max 6.4 TB | 525,000 | 4.45% | 0.06% | 0.005% | 3 DWPD | | 5.2 | Micron |
| WD Ultrastar DC SN655 15.36 TB | 516,000 | 4.96% | 0.00% | 0.00% | 1 DWPD | | 6.4 | Western Digital |
| Intel P5510 7.68 TB | 480,000 | 4.65% | 0.02% | 0.00% | 1 DWPD | OP=10 | 6.2 | Aerospike |
| Kioxia CM6-v 3.2 TB | 480,000 | 4.94% | 0.00% | 0.00% | 3 DWPD | | 6.1 | Aerospike |
| WD SN650 15.36 TB | 465,000 | 4.48% | 0.00% | 0.00% | 1 DWPD | CP=100 | 6.2 | Western Digital |
| Kioxia CM6-v 1.6 TB | 420,000 | 5.00% | 0.00% | 0.00% | 3 DWPD | | 6.1 | Aerospike |
| Micron 9300 Max 3.2 TB | 354,000 | 4.86% | 0.19% | 0.00% | 3 DWPD | | 5.2 | Aerospike |
| Huawei ES3600P V5 3.2 TB | 384,000 | 2.72% | 0.02% | 0.00% | 3 DWPD | | 4 | Aerospike |
| WD Ultrastar DC SN200 3.2 TB | 336,000 | 3.14% | 0.00% | 0.00% | 3 DWPD | | 5.2 | Aerospike |
| ScaleFlux CSS1000 6.4 TB | 324,000 | 4.12% | 0.00% | 0.00% | 5 DWPD | | 3 | Aerospike |
| Micron 9200 Max 1.6 TB | 316,500 | 4.64% | 0.13% | 0.00% | 3 DWPD | | 3 | Aerospike |
| Intel P4610 1.6 TB | 300,000 | 4.56% | 0.85% | 0.00% | 3 DWPD | | 4 | Aerospike |
| Intel P4610 6.4 TB | 300,000 | 4.98% | 0.14% | 0.00% | 3 DWPD | | 4 | Aerospike |
| ScaleFlux CSS1000 3.2 TB | 300,000 | 4.59% | 0.00% | 0.00% | 5 DWPD | | 4 | ScaleFlux |
| Micron 9200 Pro 3.84 TB | 283,500 | 4.54% | 0.13% | 0.00% | 1 DWPD | | 3 | Micron |
| WD Ultrastar DC SN640 3.2 TB | 240,000 | 4.77% | 0.00% | 0.00% | 2 DWPD | | 6.1 | WD |
| Huawei ES3600P V5 1.6 TB | 180,000 | 4.47% | 0.25% | 0.00% | 3 DWPD | | 4 | Aerospike |
| Toshiba CM5 3.2 TB | 150,000 | 3.70% | 0.00% | 0.00% | 3 DWPD | | 3 | Toshiba |
| Toshiba PX04PMB320 3.2 TB | 135,500 | 4.73% | 0.00% | 0.00% | 10 DWPD | | 3 | Toshiba |
| Intel P4510 1 TB | 120,000 | 4.31% | 0.00% | 0.00% | 1 DWPD | | 4 | Aerospike |
| Samsung PM983 1.92 TB | 108,000 | 0.09% | 0.00% | 0.00% | 1.3 DWPD | | 3.1 | Aerospike |
info

* OP=n means the SSD was over-provisioned by n%.

** CP=n means the ACT compress-pct parameter was set to n%, which causes record data to be compressible (by default record data is incompressible). Some SSDs can compress/decompress data on-the-fly as it is being written/read.
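For reference, a compressible-data run differs from a default run mainly by this one parameter. The sketch below shows how it might appear in an ACT storage config file; apart from `compress-pct`, which the note above confirms, the parameter names and values here are from memory of the ACT README and should be checked against your ACT version:

```
device-names: /dev/nvme0n1      # device under test (name is an assumption)
test-duration-sec: 86400        # 24-hour run, as in the tables above
record-bytes: 1536              # 1.5 KB objects
compress-pct: 50                # record data compressible to 50% of original size
```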

SATA/SAS-Based Flash

The flash device sizes listed below are after any over-provisioning that was done to the drive (by setting Host Protected Area using hdparm).
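The arithmetic behind that Host Protected Area setting is simple. A sketch, assuming 512-byte logical sectors and decimal gigabytes (check your drive's sector size with `hdparm -I` first):

```python
SECTOR_BYTES = 512  # assumed logical sector size

def hpa_sector_count(usable_gb):
    """Visible-sector count to pass to `hdparm -Np<count>` so that the drive
    exposes only usable_gb (decimal GB); the remainder becomes
    over-provisioned spare area."""
    return usable_gb * 1_000_000_000 // SECTOR_BYTES

# e.g. over-provision a 480 GB drive down to 400 GB of visible capacity:
count = hpa_sector_count(400)
print(f"hdparm -Np{count} --yes-i-know-what-i-am-doing /dev/sdX")
```

Note that `hdparm -Np` permanently changes the drive's reported capacity, hence the safety flag.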

These devices were tested at the specified speed with a 67% read/33% write ratio of 1.5 KB objects over 24 hours.

| Flash Device | Speed (tps) | >1ms | >8ms | >64ms | Endurance | ACT | Source |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Micron 5300 MAX 3.8 TB | 30,000 | 4.93% | 0.02% | 0.00% | 3.5 DWPD | 5.2 | Micron |
| Micron 5300 PRO 7.6 TB | 33,000 | 4.55% | 0.06% | 0.00% | 1.5 DWPD | 5.2 | Micron |

M.2-Based Flash

M.2 is a standard for SSDs that was introduced in 2013. M.2's small form factor allows for a much higher density of storage at a more moderate price.

These devices were tested at the specified speed with a 67% read/33% write ratio of 1.5 KB objects over 24 hours.

| Flash Device | Speed (tps) | >1ms | >8ms | >64ms |
| --- | --- | --- | --- | --- |
| LiteOn EP1-KB480 (OP to 400 GB) | 9,000 | 4.05% | 0.01% | 0.00% |

Networked Storage

Storage technology is continually evolving to provide innovative solutions for the demands of databases like Aerospike. Networked (or disaggregated) storage uses network protocols instead of dedicated I/O channels to transport data between the host and the storage device. Modern protocols such as NVMe-oF and fast networks like 100 Gbit Ethernet can provide comparable throughput and latency.

This section is dedicated to providing ACT results for networked storage solutions. Depending on the use case and SLA requirements these solutions may provide the right performance, efficiency, and cost for your deployment.

| Storage Solution | Speed (tps) | >1ms | >8ms | >64ms | Device | Protocol | ACT | Source |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Western Digital | 1,158,000 | 4.49% | 0.00% | 0.00% | OpenFlex F-Series | NVMe-oF | 5.3 | Western Digital |
| Drivescale Composer | 300,000 | 3.85% | 0.01% | 0.00% | HGST Ultrastar SN200 3.2 TB | iSCSI | 4 | Drivescale |
| Drivescale Composer | 300,000 | 2.60% | 0.00% | 0.00% | HGST Ultrastar SN200 3.2 TB | RoCEv2 | 4 | Drivescale |
| Toshiba Kumoscale | 150,000 | 3.97% | 0.01% | 0.00% | Toshiba CM5 3.2 TB | RoCEv2 | 3.1 | Toshiba |

Cloud-Based Flash

Cloud instances with attached NVMe flash devices have become common. This section includes a representative sample of ACT results performed by Aerospike on the infrastructure of major cloud providers. Although the methodology is the same, the results need to be interpreted differently. Cloud instances are subject to additional sources of variability not found in a lab setting with fixed hardware. These include "noisy neighbors" (unrelated instances running on the same physical server) and network congestion (again involving unrelated traffic).

Another source of variability across ACT runs is that the hardware may be slightly different across instances in the same class, and NVMe devices in particular may be newer or older, have different firmware revisions, or possibly be different models.

At a minimum, ACT results should be interpreted more conservatively: results from a given run may be better or worse than the long-term average, and that average may not be stationary. To the extent that time and budget allow, averaging multiple runs (possibly at different times of day or across regions) will produce a better estimate.
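A minimal sketch of that averaging, with each run's latency histogram represented as a dict of exceedance percentages (the numbers here are made up for illustration):

```python
def average_runs(runs):
    """Average per-threshold exceedance percentages across multiple ACT runs.

    Each run maps a threshold in ms to the percentage of transactions
    exceeding it, e.g. {1: 4.25, 8: 0.00}.
    """
    thresholds = runs[0].keys()
    return {ms: sum(r[ms] for r in runs) / len(runs) for ms in thresholds}

# Three hypothetical runs of the same instance type at the same rate:
runs = [{1: 4.25, 8: 0.00}, {1: 5.10, 8: 0.02}, {1: 3.65, 8: 0.01}]
print(average_runs(runs))  # averaged exceedance per threshold
```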

The results below include tests performed on both single and multiple flash devices. The Storage column shows the number of devices used by ACT, which may be fewer than the number available on that instance. Multiple flash devices attached to an instance are useful for measuring the linearity of throughput. The recorded results are for the best throughput achieved.

Except as noted, these devices were tested at the specified speed with a 67% read/33% write ratio of 1.5 KB objects over 24 hours.

Amazon Web Services (AWS) Instances

| Instance Name | Sole Tenant | Speed (tps) | >1ms | >8ms | >64ms | ACT | Storage (GB) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| c5d.24xlarge | N | 564,000 | 4.25% | 0.00% | 0.00% | V6.1 | 4 x 900 |
| r5d.24xlarge | N | 552,000 | 3.23% | 0.00% | 0.00% | V6.1 | 4 x 900 |
| m5d.24xlarge | N | 432,000 | 1.67% | 0.00% | 0.00% | V6.1 | 4 x 900 |
| r5d.4xlarge | N | 354,000 | 0.38% | 0.21% | 0.00% | V6.1 | 2 x 300 |
| i3en.2xlarge | N | 300,000 | 4.93% | 0.06% | 0.00% | V6.1 | 2 x 2500 |
| c5d.24xlarge | N | 282,000 | 1.59% | 0.00% | 0.00% | V6.1 | 2 x 900 |
| r5d.24xlarge | N | 276,000 | 1.38% | 0.00% | 0.00% | V6.1 | 2 x 900 |
| m5d.24xlarge | N | 216,000 | 5.33% | 0.00% | 0.00% | V6.1 | 2 x 900 |
| c5d.4xlarge | N | 207,000 | 8.36% | 0.07% | 0.00% | V6.1 | 1 x 400 |
| r5d.4xlarge | N | 177,000 | 7.25% | 0.00% | 0.00% | V6.1 | 1 x 300 |
| m5ad.4xlarge | N | 174,000 | 4.63% | 0.00% | 0.00% | V6.0 | 1 x 300 |
| r5ad.4xlarge | N | 171,000 | 0.81% | 0.00% | 0.00% | V6.0 | 1 x 300 |
| i3en.12xlarge | N | 162,000 | 3.46% | 0.00% | 0.00% | V6.1 | 1 x 7500 |
| i3en.2xlarge | N | 150,000 | 2.21% | 0.03% | 0.00% | V6.1 | 1 x 2500 |
| c5d.24xlarge | N | 141,000 | 4.78% | 0.00% | 0.00% | V6.1 | 1 x 900 |
| r5d.24xlarge | N | 138,000 | 4.95% | 0.00% | 0.00% | V6.1 | 1 x 900 |
| r5ad.24xlarge* | N | 132,000 | 4.80% | 0.00% | 0.00% | V6.0 | 1 x 900 |
| m5ad.24xlarge | N | 129,000 | 4.62% | 0.00% | 0.00% | V6.0 | 1 x 900 |
| c5d.18xlarge | N | 114,000 | 2.26% | 0.00% | 0.00% | V6.0 | 1 x 900 |
| m5d.24xlarge | N | 108,000 | 12.47% | 0.00% | 0.00% | V6.1 | 1 x 900 |
| c5d.9xlarge | N | 90,000 | 1.30% | 0.00% | 0.00% | V6.0 | 1 x 900 |
| r5d.12xlarge | N | 90,000 | 0.98% | 0.00% | 0.00% | | 1 x 900 |
| m5d.4xlarge | N | 57,000 | 0.22% | 0.00% | 0.00% | V6.1 | 1 x 300 |
| i3.8xlarge | N | 33,000 | 4.55% | 0.00% | 0.00% | V6.1 | 1 x 1900 |
| i3.2xlarge | N | 12,000 | 1.24% | 0.00% | 0.00% | V6.0 | 1 x 1900 |
| i3.metal | N | 12,000 | 1.06% | 0.00% | 0.00% | V6.0 | 1 x 1900 |
info

* Test duration was 12 hours.

caution

Cloud-based flash devices show great variability. We recommend that you test each new instance. The best numbers tested for a single flash device are provided above.

Google Cloud Platform (GCP) Instances**

| Instance Name | Sole Tenant | Speed (tps) | >1ms | >8ms | >64ms | ACT | Storage (GB) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| n2-standard-80 | Y | 6,408,000 | 0.62% | 0.03% | 0.00% | V6.1 | 24 x 375 |
| n2-standard-80 | Y | 4,272,000 | 1.02% | 0.03% | 0.00% | V6.1 | 16 x 375 |
| n2-standard-80 | Y | 2,136,000 | 3.27% | 0.03% | 0.00% | V6.1 | 8 x 375 |
| n2-standard-80 | Y | 1,308,000 | 4.96% | 0.45% | 0.05% | V6.1 | 4 x 375 |
| n2-standard-80 | N | 1,068,000 | 0.17% | 0.00% | 0.00% | V6.1 | 4 x 375 |
| n2-standard-80 | N | 801,000 | 0.10% | 0.00% | 0.00% | V6.1 | 3 x 375 |
| n2-standard-16 | N | 768,000 | 1.87% | 0.02% | 0.00% | V6.1 | 4 x 375 |
| n2-standard-80 | N | 534,000 | 2.57% | 0.00% | 0.00% | V6.1 | 2 x 375 |
| n2-standard-16 | N | 384,000 | 3.11% | 0.01% | 0.00% | V6.1 | 2 x 375 |
| n2-standard-80 | N | 267,000 | 4.62% | 0.01% | 0.00% | V6.1 | 1 x 375 |
| n2-standard-64 | N | 267,000 | 4.78% | 0.01% | 0.00% | V6.1 | 1 x 375 |
| n2-standard-32 | N | 192,000 | 1.48% | 0.03% | 0.00% | V6.1 | 1 x 375 |
| n2-standard-16 | N | 192,000 | 7.26% | 0.00% | 0.00% | V6.1 | 1 x 375 |
info

** Google local SSDs are independent of instance type. These results are for 'gen2' drives, which are available primarily on Intel Cascade Lake or later, with linear performance gain through 4 devices per instance.

Microsoft Azure Instances

| Instance Name | Sole Tenant | Speed (tps) | >1ms | >8ms | >64ms | ACT | Storage (GB) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| L16s v2 | N | 15,000 | 3.37% | 0.10% | 0.00% | V5.0 | 1 x 160 |

Historical Flash Devices

The devices listed below are older and deemed "historical"; we therefore don't recommend using them.

SATA/SAS-Based Flash: The flash device sizes listed in the table below are after any overprovisioning that was done to the drive (by setting Host Protected Area using hdparm). These devices were tested at the specified speed with a 67% read/33% write ratio of 1.5 KB objects over 24 hours.

| Flash Device | Speed (tps) | >1ms | >8ms | >64ms |
| --- | --- | --- | --- | --- |
| Intel DC s3700 + OP (318 GB) | 18,000 | 1.60% | 0.00% | 0.00% |
| Samsung 843T + OP (370 GB) | 9,000 | 2.31% | 0.00% | 0.00% |
| Micron M500DC 480 GB + OP (300 GB) | 9,000 | 2.95% | 0.20% | 0.02% |
| Seagate 600 Pro + OP (240 GB) | 9,000 | 5.52% | 0.00% | 0.00% |
| Intel DC s3500 + OP (240 GB) | 9,000 | 8.12% | 0.00% | 0.00% |
| Micron P400e (200 GB) | 9,000 | 12.30% | 4.86% | 3.89% |

PCIe/NVMe-Based Flash: The performance numbers below reflect the highest load level passed. None of these drives were over-provisioned. These devices were tested at the specified speed with a 67% read/33% write ratio of 1.5 KB objects over 24 hours.

| Flash Device | Speed (tps) | >1ms | >8ms | >64ms |
| --- | --- | --- | --- | --- |
| Micron P320h 700 GB | 450,000 | 3.32% | 0.04% | 0.00% |
| Intel P3700 400 GB * | 210,000 | 2.20% | 0.18% | 0.00% |
| Samsung SM1715 3.2 TB | 192,000 | 3.68% | 0.00% | 0.00% |
| Micron P420m 1400 GB | 96,000 | 3.21% | 0.00% | 0.00% |
| Intel P3608 | 84,000 | 4.37% | 0.00% | 0.00% |
| Huawei es3000 2400 GB | 72,000 | 1.18% | 0.00% | 0.00% |
info

* These tests were run on a Dell R720 with dual E5-2690v2 @ 3GHz using RedHat 6.5 kernel 2.6.0.

Cloud-Based Flash: Performance of devices on cloud instances that may no longer be available for provisioning, or that now have better alternatives.

| Cloud Provider | Instance Name | Speed (tps) | >1ms | >8ms | >64ms |
| --- | --- | --- | --- | --- |
| Azure | L8s* | 30,000 | 1.26% | 0.05% | 0.00% |
| Azure | GS2 | 30,000 | 1.30% | 0.21% | 0.13% |
| Amazon | r3.4xlarge | 18,000 | 3.74% | 0.01% | 0.00% |
| Amazon | m3.xlarge (HVM) | 18,000 | 2.49% | 0.00% | 0.00% |
| Amazon | r3.2xlarge | 15,000 | 4.71% | 0.10% | 0.02% |
| Amazon | r3.xlarge | 15,000 | 4.90% | 0.45% | 0.35% |
| Amazon | r3.large | 12,000 | 4.12% | 0.24% | 0.00% |
| Amazon | c3.2xlarge | 9,000 | 2.10% | 0.10% | 0.01% |
| Amazon | m3.xlarge | 9,000 | 4.14% | 0.23% | 0.01% |
| Rackspace | Performance 1 | 9,000 | 1.14% | 0.00% | 0.00% |
info

* Azure Ls performance is for the single drive on the instance. The Azure Lsv2 instances have multiple drives equivalent to 5x rating each, so in aggregate they perform much better than the first generation Ls instances.