Build full-text search applications on Aerospike using Elasticsearch
Many of our customers have expressed a strong desire for increased access to real-time data within Aerospike for a wider range of stakeholders within their organizations, including data architects, analysts, and scientists. This enables them to harness greater value from their data and develop new applications and use cases that benefit their customers and their business. In response to this, we previously announced the release of Aerospike SQL in partnership with Starburst, providing SQL access to Aerospike data. Furthering that commitment to our customers, today we’re excited to announce the latest addition to the Aerospike Connect product suite: Aerospike Connect for Elasticsearch, to allow developers to build search applications for their real-time data.
This new connector enables full-text search on data stored within Aerospike, complementing the existing query capabilities of Aerospike Expressions and the secondary index query enhancements introduced in Aerospike Database 6. Together, these give our customers a broad range of query and search capabilities to choose from for their needs. We’re excited to enable the innovations and use cases that result from this enhanced functionality.
Aerospike Connect for Elasticsearch enables customers to seamlessly leverage the high performance and scalability of Aerospike with the powerful search and analytics capabilities of Elasticsearch. It obviates the need to build complicated tech stacks for heavy ETL processes or integrate with Pub-Sub systems to move data from Aerospike to Elasticsearch in order to leverage these technologies.
The connector also makes more efficient use of resources. The application developer can easily configure which bins in a record should be shipped to Elasticsearch (recall that “bins” in Aerospike are akin to columns in an RDBMS), so the Elasticsearch cluster does not have to spend large amounts of memory indexing and searching data that is never queried.
What is Aerospike Connect for Elasticsearch?
Aerospike Connect for Elasticsearch provides a seamless and low-latency integration between Aerospike and Elasticsearch. It allows customers building search applications to take advantage of Elasticsearch for its full-text search and other advanced querying capabilities coupled with the high performance, low latency, high availability, extreme scalability and lowest total cost of ownership of Aerospike.
To do this, the connector relies on Aerospike’s Cross Datacenter Replication (XDR) and its Change Notification service, which is Aerospike’s Change Data Capture mechanism.
The connector subscribes to Change Notifications for mutations to data and makes the changed records available in an Elasticsearch index in near real-time. Additionally, the fine replication granularity of XDR allows developers to choose the namespace, set, or bin to replicate in Elasticsearch depending on requirements around data availability, governance, or resource usage. For instance, by configuring only the searchable bins in a record to be shipped to an Elasticsearch index, you can use the memory of your Elasticsearch cluster efficiently.
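To make the idea of shipping only the searchable bins concrete, here is a minimal conceptual sketch in Python. It is not the connector's implementation or configuration format. The shape of the `change` dictionary, the `to_bulk_lines` helper, and the `namespace:set:key` document ID scheme are all assumptions made for illustration. The only part taken from Elasticsearch itself is the newline-delimited action format of its Bulk API.

```python
import json


def to_bulk_lines(change, searchable_bins, index):
    """Convert one (hypothetical) change event into Elasticsearch Bulk API lines.

    `change` is an assumed event shape, not the connector's wire format:
    {"op": "write" | "delete", "namespace": ..., "set": ..., "key": ..., "bins": {...}}
    """
    # Assumed ID scheme so that repeated writes to a record update one document.
    doc_id = f"{change['namespace']}:{change['set']}:{change['key']}"

    if change["op"] == "delete":
        # A delete action is a single line with no document body.
        return [json.dumps({"delete": {"_index": index, "_id": doc_id}})]

    # Keep only the bins that were chosen as searchable; everything else stays
    # in Aerospike and never consumes Elasticsearch resources.
    doc = {
        name: value
        for name, value in change.get("bins", {}).items()
        if name in searchable_bins
    }
    # An index action is an action line followed by the document source line.
    return [
        json.dumps({"index": {"_index": index, "_id": doc_id}}),
        json.dumps(doc),
    ]


if __name__ == "__main__":
    event = {
        "op": "write",
        "namespace": "shop",
        "set": "products",
        "key": "sku-1",
        "bins": {"title": "Trail shoe", "description": "Lightweight", "cost_basis": 41.5},
    }
    for line in to_bulk_lines(event, {"title", "description"}, "products"):
        print(line)
```

In this sketch the `cost_basis` bin never leaves Aerospike, which is the resource and governance benefit described above. In a real deployment this selection is made through connector and XDR configuration rather than application code.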
Our detailed technical documentation covers all this and more, including the usage and configurability of the connector.
The image below illustrates the deployment architecture.
Figure 1. Aerospike Connect for Elasticsearch deployment architecture
Common use cases
Here are a few common use cases where using Elasticsearch and Aerospike together can provide significant benefits:
E-commerce: In e-commerce, high-performance and low-latency search is critical. Elasticsearch can be used to provide powerful search and faceted navigation capabilities, while Aerospike can be used to store and retrieve the underlying product data.
Real-time analytics: Aerospike’s high performance and low-latency capabilities make it well-suited for real-time data processing, while Elasticsearch can be used for long-term data analytics and reporting. The two systems can be integrated to provide real-time insights and historical analysis of the data.
IoT and sensor data: As IoT devices produce huge amounts of data, storing it and querying it efficiently is crucial. Aerospike can be used to store time series sensor data and perform real-time aggregation and querying. Elasticsearch can be used to index and search the data, enabling more advanced querying capabilities. Together they can provide a powerful solution while maintaining low TCO, resiliency, and scalability.
Log analysis: Another use case is log analysis and visualization. Elasticsearch is often used to analyze log data, and Kibana can be used to visualize the results. However, storing large volumes of log data on Elasticsearch can be expensive and slow. Aerospike can be used as a storage backend for Logstash, and the logs can be indexed and searched using Elasticsearch.
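As an illustration of the e-commerce case, the following sketch builds an Elasticsearch search request body that combines full-text matching with faceted navigation. The helper name `faceted_search_body` and the field names (`description`, `brand`, `category`) are hypothetical. The sketch also assumes that the facet and filter fields are mapped as `keyword` in the index, since `terms` aggregations and `term` filters work on exact values.

```python
def faceted_search_body(text, text_field, filters, facet_fields, size=10):
    """Build a search body using Elasticsearch Query DSL (bool, match, term, terms)."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Full-text relevance scoring on the analyzed text field.
                "must": [{"match": {text_field: text}}],
                # Exact-value filters narrow results without affecting scores.
                "filter": [{"term": {field: value}} for field, value in filters.items()],
            }
        },
        # One terms aggregation per facet returns value counts for navigation.
        "aggs": {f"by_{field}": {"terms": {"field": field}} for field in facet_fields},
    }


if __name__ == "__main__":
    import json

    body = faceted_search_body(
        text="trail running shoes",
        text_field="description",
        filters={"brand": "acme"},
        facet_fields=["category"],
    )
    print(json.dumps(body, indent=2))
```

The resulting body can be sent to the index's `_search` endpoint. The search hits identify the matching products, and the application can then read the full, authoritative records from Aerospike by key.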
In closing
The release of Aerospike Connect for Elasticsearch is part of the transformation of Aerospike from a database to a data platform. It builds on our existing portfolio of connectors to essential data pipeline components such as Kafka, Pulsar, JMS, Spark, and Trino. It also complements our already powerful query capabilities underpinned by our highly parallelized secondary indexes and precise Expressions.
An ever-increasing proportion of data is going to be real-time data – projected by IDC to be 30% of all data by 2025. There will correspondingly be an increasing need to analyze, query, and search that data. Aerospike Connect for Elasticsearch enables scalable, high-performance search on your high-value data that resides in the Aerospike Real-time Data Platform.