
Leveraging a real-time document database for large-scale applications

Apoorva Anupindi
Sr. Campaign Marketing Manager
October 27, 2022|7 min read

Aerospike joined a recent webinar with DBTA titled Building a real-time document data store at scale. Noel Yuhanna, VP & Principal Analyst at Forrester, George Demarest, Director of Product Marketing at Aerospike, and Mario Ornelas, CEO of Zonetap, discussed the factors affecting the performance and deployment of large-scale document stores. Ornelas also shared a real-world use case from Zonetap that combines IoT, geospatial, and a real-time document database to keep construction workers safe in hazardous environments.

Kicking off the webinar, Yuhanna states that digital transformation continues to be a top priority for global enterprises, and that organizations need to leverage document data platforms more strategically. Document data stores have become a critical business asset because they enable innovative, personalized customer experiences, a better understanding of user behavior, and new revenue-generating products and services.

Yuhanna shares that one of the major data and analytics trends Forrester has observed is the need to support real-time applications and insights. “We believe 20% of data today in an organization is called fast data or real-time data, and that number is going to double in the next three to five years.” Other trends include:

  • Focus on self-service platforms to accelerate deployments

  • Need to support all kinds of data, not just structured data

  • Demand for a more elastically scalable platform

  • Increasing demand for more connected data

  • Leveraging the cloud for new apps and insights

One of the most prominent trends around data has been the emergence of the document database, and Yuhanna explains that this is filling the gap to support a new generation of applications and insights.

Capabilities & benefits of document databases

Document databases provide a flexible data model that handles documents efficiently even when their structure varies across business operations. This flexibility helps when addressing new business needs.

Document databases can deliver high performance on reads and writes, which is especially important when you’re doing thousands of reads per second. And if you have millions or billions of documents in a database, you need to be able to scale up, maintain high performance, and support dynamic workloads. A document database can also lower data management costs because you don’t need large, complex environments, and tuning, backup, and recovery are all automated, enabling higher productivity.

Yuhanna suggests a few different criteria when considering a document database:

  • Performance to deliver low latency access

  • Scalability to support large deployments

  • Self-service and automation for simplified administration

  • Support for real-time data initiatives

  • Supporting APIs for a new generation of microservices

  • Data security and encryption built in from the beginning

Aerospike scale

George Demarest, Director of Product Marketing at Aerospike, explains that the increased focus on document data is resulting in an unprecedented amount of unstructured data. “80% of today’s data is unstructured and it’s growing at a crazy 60% rate,” says Demarest, “and the ability to analyze that presents a big opportunity…” JSON has become the de facto data model for the web, making it more prevalent in document databases, which is why Aerospike offers support for JSON document models with Database 6.


Demarest then dives into Aerospike’s focus on scale and low latency, sharing specific customer examples to show what scale really means for Aerospike. AppNexus, an internet technology company that does real-time sales and purchasing of digital advertising, does about 10 billion (plus or minus 1B) transactions per day. Compare that to a total of about 5 billion transactions a month for the NYSE, Nasdaq, and Visa combined. They also have a 100% uptime requirement, all powered by Aerospike.

The Trade Desk, the world’s largest independent programmatic advertising DSP, is another key example. With 800 billion queries per day at a less than 8 millisecond response time, they average 2-4 million transactions per second, demonstrating an immensely high throughput at a large scale.

A particular AdTech customer ran strict high-throughput, high-transaction environments and wanted to create a unified data platform to support use cases across existing applications at their parent organization. They wanted to serve up user profiles in a JSON format for their ad targeting programs and move to a real-time interactive format. With a single document data store, they could reduce the complexity of their environment and monetize their data better. They ran into scalability issues at data volumes of 9 terabytes. Aerospike was able to deliver 5x the throughput of MongoDB with half the hardware.

Zonetap hitting the limits of MongoDB

Zonetap also faced a scalability problem, but of a different kind. Mario Ornelas is Chief Executive Officer at Zonetap, a company that offers a geofencing application designed to protect construction workers in dangerous environments. Ornelas cites OSHA, stating that about 17% of work-site deaths occur because the worker was not seen. Zonetap has developed a device that is worn by workers and also placed on heavy machinery to alert the wearer in real time that heavy machinery is nearby or that there is danger around the work site.

An IoT device like this requires near-perfect accuracy, and the system cannot afford to create false positives. Zonetap’s solution, called the 2ND-SKN, is an IoT system designed with 2 inches (1.2 centimeters) of accuracy. The wearer receives an audible and visible alert, along with haptic feedback, so that they are notified by every available means that they are in the path of danger, such as a moving vehicle, or in a restricted access zone.
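To make the alerting idea concrete, here is a minimal sketch of a proximity check between a worker and nearby machinery. It is not Zonetap's implementation, which the webinar does not describe at this level. The names `Position`, `distance_m`, and `machines_in_range`, the flat local coordinate system in meters, and the alert radius are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Hypothetical position in a flat, site-local coordinate system (meters)."""
    x_m: float
    y_m: float


def distance_m(a: Position, b: Position) -> float:
    """Straight-line distance between two positions, in meters."""
    return math.hypot(a.x_m - b.x_m, a.y_m - b.y_m)


def machines_in_range(worker: Position,
                      machines: dict[str, Position],
                      alert_radius_m: float) -> list[str]:
    """Return the names of machines within the alert radius of the worker."""
    return [
        name
        for name, position in machines.items()
        if distance_m(worker, position) <= alert_radius_m
    ]


# Example: one machine is 5 m away, the other 50 m away.
worker = Position(0.0, 0.0)
machines = {
    "excavator": Position(3.0, 4.0),
    "crane": Position(30.0, 40.0),
}
nearby = machines_in_range(worker, machines, alert_radius_m=10.0)
```

With the assumed 10 m radius, only the excavator would trigger an alert. A real system would also need to handle positioning error, movement direction, and restricted zones, none of which this sketch attempts.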

The core of the application is software designed to handle fast transactions for low-latency alerting from the device to the database. Low latency is the operative term here, as what the system really requires is sub-second alerting. The 2ND-SKN IoT system uploads data four times a second with 20-100 bytes per cycle, which adds up very quickly, and transactions need to be managed instantly.

“We have to be able to provide an alert that’s effective and efficient for operation of safety at these work sites,” says Ornelas. Their system is an integrated protective ecosystem on a cloud-based platform that collects and manages all the real-time data from the wearable device and then uses that data to assist decision-makers through analytical reporting tools. They can also do device playback, which can help organizations minimize costs and prove insurance claims as needed. In the webinar, Ornelas shares a very cool video demonstration to show how the system works.

Zonetap on Aerospike’s document store

Zonetap started their journey using MongoDB. However, once they added 24 IoT devices to their system, they experienced unpredictable performance, and MongoDB couldn’t handle their transactions. They experienced interference, delays in transactions, and higher costs from the additional resources required. The company aimed to scale to thousands of devices and needed a new solution, which brought them to Aerospike’s document database capabilities, JSON Document API, and high throughput. With Aerospike, they expect to ingest location data for 1000 devices today and deliver real-time reads for a seamless user experience in their real-time map application.
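For a rough sense of what these device counts mean for the database, the upload rate quoted earlier (four uploads per second, 20-100 bytes per cycle) can be turned into a back-of-the-envelope ingest estimate. This is a sketch under stated assumptions: every device uploads at a constant rate, each upload is one write, and only payload bytes are counted, with no protocol or storage overhead. The function name `ingest_estimate` is made up for this example.

```python
UPLOADS_PER_SECOND = 4      # per device, from the figures quoted in the webinar
MIN_PAYLOAD_BYTES = 20      # smallest payload per upload cycle
MAX_PAYLOAD_BYTES = 100     # largest payload per upload cycle
SECONDS_PER_DAY = 24 * 60 * 60


def ingest_estimate(devices: int) -> dict[str, int]:
    """Estimate write rate and payload volume for a given number of devices.

    Assumes one write per upload and counts payload bytes only.
    """
    writes_per_second = devices * UPLOADS_PER_SECOND
    return {
        "writes_per_second": writes_per_second,
        "writes_per_day": writes_per_second * SECONDS_PER_DAY,
        "min_bytes_per_second": writes_per_second * MIN_PAYLOAD_BYTES,
        "max_bytes_per_second": writes_per_second * MAX_PAYLOAD_BYTES,
    }


# The two device counts mentioned in the article.
at_24_devices = ingest_estimate(24)
at_1000_devices = ingest_estimate(1000)
```

Under these assumptions, 24 devices produce 96 writes per second, while 1000 devices produce 4,000 writes per second (about 345.6 million writes per day) carrying between 80,000 and 400,000 payload bytes per second. The payload volume is small; the demanding part is the sustained rate of small writes alongside real-time reads.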

Zonetap’s solution is currently in production and they will be releasing the first commercial version of their product at the end of this year.

Watch the full webinar here, learn more about Aerospike Document Database capabilities, and visit us at AWS re:Invent booth #3835 for a demonstration.
