Analyze Data with Aerospike and Starburst Anywhere

Yevgeny Rizhkov
Senior Software Engineer
June 2, 2021|3 min read

Starburst Enterprise, based on open source Trino (formerly PrestoSQL), is a SQL-based MPP (massively parallel processing) query engine. It lets you run Trino on a single machine or a cluster of machines, on-prem or in the cloud. In addition, Starburst Galaxy offers a SaaS experience.

Aerospike is a fast NoSQL database that is used extensively for low-latency, high-throughput use cases in ad tech, financial services, telco, and several other verticals. With its hybrid memory architecture, it serves as an ideal database for data analysis. In addition to supporting on-prem deployments, Aerospike is also available in the cloud, either as a managed service or self-managed via the Kubernetes Operator, to make database-related operations frictionless and flexible.

We recognize that SQL is the lingua franca for data analysis. Hence, we launched Aerospike Connect for Presto earlier this year to empower data and business analysts to pursue their data analysis projects. We recently released the Dockerized Presto Connector to streamline the deployment of Trino and the Presto Connector, and we are now pleased to announce that we have extended it to support Starburst Enterprise. The Presto Connector bridges Aerospike Enterprise Edition (EE) and Starburst Enterprise. It lets you combine the scalability, speed, reliability, and TCO benefits of Aerospike with the speed, massive parallelism, and Trino support that Starburst Enterprise offers.

Why does it matter to you?

We have heard time and again from our users that Presto is hard to set up and tune for optimal performance and scale. A sad but true fact is that time spent dealing with infrastructure issues is time not spent on the issues that matter most to your business. With the product offerings from Aerospike and Starburst, along with their best-in-class support, users can now focus on generating valuable insights at scale from the data stored in Aerospike, using ANSI SQL to drive their critical business decisions.

How to get started?

Here are a few steps that you can follow to get it going:

Step 1 - Build the Starburst Trino image along with the Aerospike Presto Connector

[Image: Step 1 screenshot]

Step 2 - Run the Aerospike Database and load the data

  1. Download and run the Aerospike Database Enterprise Edition

  2. Load data into Aerospike (you would need to download the Aerospike Python client)
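The data-loading part of Step 2 could look roughly like the sketch below. It assumes a single-node Aerospike server on 127.0.0.1:3000 and the `aerospike` Python client installed with pip. The namespace `test`, the set name `people`, and the sample rows are hypothetical placeholders, not values from the original walkthrough.

```python
"""Minimal sketch: load a few sample records into Aerospike.

Assumptions (not from the original post): a server on 127.0.0.1:3000,
a namespace called "test", and a hypothetical set called "people".
"""

NAMESPACE = "test"    # assumed namespace
SET_NAME = "people"   # hypothetical set name


def sample_records():
    """Return (key, bins) pairs ready to be written with client.put()."""
    rows = [
        (1, "Ada", 36),
        (2, "Grace", 45),
        (3, "Linus", 29),
    ]
    return [
        ((NAMESPACE, SET_NAME, user_id), {"id": user_id, "name": name, "age": age})
        for user_id, name, age in rows
    ]


def load(records, hosts=(("127.0.0.1", 3000),)):
    """Write each record to Aerospike and return how many were written."""
    import aerospike  # imported lazily so sample_records() works without the client

    client = aerospike.client({"hosts": list(hosts)}).connect()
    try:
        for key, bins in records:
            # key is a (namespace, set, primary key) tuple; bins is a plain dict
            client.put(key, bins)
    finally:
        client.close()
    return len(records)
```

Calling `load(sample_records())` against a running server writes the three rows. Adjust the host list and namespace to match your own deployment.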

Step 3 - Run the docker image that you created in Step 1

[Image: Step 3 screenshot]

Step 4 - Query using Trino CLI

Download the Trino CLI and run the following command:

[Images: Step 4 screenshots]

The steps above launch the image in standalone mode, i.e., with the coordinator and worker running on the same host, but you can easily extend it to distributed mode for production purposes as described here. Alternatively, you can query from a Trino-supported BI tool such as Tableau, from a Jupyter notebook, or from Python using the PyHive library.
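For the PyHive route, a minimal sketch might look like the following. It assumes Trino is reachable on localhost:8080, that the catalog is named `aerospike`, and that the connector exposes Aerospike namespaces as schemas and sets as tables. The schema `test`, the table `people`, and the username `analyst` are hypothetical placeholders.

```python
"""Minimal sketch: query Aerospike through Trino from Python with PyHive.

Assumptions (not from the original post): Trino listens on localhost:8080,
the catalog is named "aerospike", and schema/table names are placeholders.
"""
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_query(schema, table, limit=10):
    """Build a simple SELECT, rejecting anything that is not a plain identifier."""
    for name in (schema, table):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    return f"SELECT * FROM {schema}.{table} LIMIT {int(limit)}"


def run_query(sql, host="localhost", port=8080, catalog="aerospike"):
    """Execute sql against Trino and return all rows as a list of tuples."""
    from pyhive import trino  # imported lazily; requires PyHive to be installed

    conn = trino.connect(host=host, port=port, catalog=catalog, username="analyst")
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()
```

With the container from Step 3 running, `run_query(build_query("test", "people", 5))` would return up to five rows. The identifier check is only a guard for this sketch and is not a substitute for proper query parameterization.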

In conclusion, with the deployment options and the best-in-class support offered by Aerospike and Starburst, you can go from zero to insights to meet your business needs. The time that you would otherwise spend dealing with infrastructure issues can now be spent on the tasks that matter most to your business.