Analyze Data with Aerospike and Starburst Anywhere

Yevgeny Rizhkov, Senior Software Engineer

Starburst Enterprise, based on open-source Trino (formerly PrestoSQL), is a SQL-based massively parallel processing (MPP) query engine. It lets you run Trino on a single machine or on a cluster of machines, on-prem or in the cloud. In addition, Starburst Galaxy offers a SaaS experience.

Aerospike is a fast NoSQL database that is used extensively for low-latency, high-throughput use cases in ad tech, financial services, telco, and several other verticals. With its hybrid memory architecture, it serves as an ideal database for data analysis. In addition to supporting on-prem deployments, Aerospike is offered in the cloud, either as a managed service or self-managed via the Kubernetes Operator, to make database operations frictionless and flexible.

We recognize that SQL is the lingua franca for data analysis. That is why we launched Aerospike Connect for Presto earlier this year: to empower data and business analysts to pursue their data analysis projects. We recently released the Dockerized Presto Connector to streamline the deployment of Trino and the Presto Connector, and we are now pleased to announce that we have extended it to support Starburst Enterprise. The Presto Connector bridges Aerospike Enterprise Edition (EE) and Starburst Enterprise. It lets you combine the scalability, speed, reliability, and TCO benefits of Aerospike with the speed, massive parallelism, and Trino support that Starburst Enterprise offers.

Why does it matter to you?

We have heard time and again from our users that Presto is hard to set up and tune for optimal performance and scale. Sadly, time spent dealing with infrastructure issues is time not spent on the issues that matter most to your business. With the product offerings from Aerospike and Starburst, along with their best-in-class support, users can now focus on using ANSI SQL to generate valuable insights at scale from the data stored in Aerospike and drive their critical business decisions.

How do you get started?

Here are a few steps that you can follow to get it going:

Step 1 – Build the Starburst Trino image along with the Aerospike Presto Connector

git clone https://github.com/aerospike-examples/starburst-aerospike.docker.git
cd starburst-aerospike.docker
docker build . -t starburst-aerospike

Step 2 – Run the Aerospike Database and load the data

  1. Download and run Aerospike Database Enterprise Edition
  2. Load data into Aerospike (this requires the Aerospike Python client)
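
As a rough illustration of the data-loading step, the sketch below writes a handful of sample records with the Aerospike Python client. It is only one way to do it, not the procedure from the original post: the set name "demo", the bin names, and the helper functions are made up for this example, and the host and port assume a local server on the default port 3000 with the default "test" namespace.

```python
"""Sketch: write a few sample records to Aerospike so Trino has data to query."""


def build_records(count, namespace="test", set_name="demo"):
    """Return (key, bins) pairs. The set name and bin names are hypothetical."""
    records = []
    for i in range(count):
        key = (namespace, set_name, f"user{i}")  # (namespace, set, primary key)
        bins = {"id": i, "name": f"user{i}", "score": i * 10}
        records.append((key, bins))
    return records


def load_records(client, records):
    """Write each record with client.put() and return how many were written."""
    written = 0
    for key, bins in records:
        client.put(key, bins)
        written += 1
    return written


def connect_and_load(host="127.0.0.1", port=3000, count=100):
    """Connect with the Aerospike Python client and load the sample records."""
    import aerospike  # imported lazily so the helpers above work without it

    client = aerospike.client({"hosts": [(host, port)]}).connect()
    try:
        return load_records(client, build_records(count))
    finally:
        client.close()
```

With the server from item 1 running and the client installed, calling connect_and_load() should write 100 records into the "demo" set of the "test" namespace.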

Step 3 – Run the docker image that you created in Step 1

docker run --rm -p 8080:8080 -e AS_HOSTLIST=docker.for.mac.host.internal:3000 --name starburst-aerospike starburst-aerospike
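
The host name docker.for.mac.host.internal in the command above is specific to Docker Desktop for Mac; elsewhere you would presumably substitute an address at which the container can reach your Aerospike server. The small sketch below assembles the same command for an arbitrary host. The helper name and its parameters are invented for this example, and it assumes that AS_HOSTLIST takes a host:port value as shown in the command above.

```python
import shlex


def docker_run_command(aerospike_host, aerospike_port=3000, http_port=8080,
                       image="starburst-aerospike"):
    """Build the Step 3 docker command for a given Aerospike host (hypothetical helper)."""
    return [
        "docker", "run", "--rm",
        "-p", f"{http_port}:8080",  # publish container port 8080 on the host
        "-e", f"AS_HOSTLIST={aerospike_host}:{aerospike_port}",
        "--name", image,            # container name, same as the image name
        image,
    ]


# Reproduces the command from Step 3 for Docker Desktop for Mac.
print(shlex.join(docker_run_command("docker.for.mac.host.internal")))
```

The returned list can be passed directly to subprocess.run(), which avoids shell-quoting problems if the host value ever contains unusual characters.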

Step 4 – Query using Trino CLI

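Download the Trino CLI and connect it to the server started in Step 3 (port 8080), using the aerospike catalog and the test schema:

Figure: Trino CLI, catalog aerospike and schema test

Figure: Trino CLI, show catalogs

Figure: Trino CLI, show tables

Figure: Trino CLI, schema test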

The above launches the image in standalone mode, i.e., with the coordinator and worker running on the same host, but you can easily extend it to distributed mode for production purposes as described here. Instead of the Trino CLI, you can also use a Trino-supported BI tool such as Tableau, a Jupyter notebook, or the PyHive library.
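
For the PyHive route, a minimal sketch might look like the following. It assumes PyHive's pyhive.trino module is installed and that the container from Step 3 is listening on localhost:8080. The catalog aerospike and schema test match the CLI session above, while the table name "demo" is a hypothetical Aerospike set and the two helper functions are invented for this example.

```python
def run_statements(cursor, statements):
    """Execute each SQL statement on a DB-API cursor and collect the rows."""
    results = {}
    for sql in statements:
        cursor.execute(sql)
        results[sql] = cursor.fetchall()
    return results


def open_trino_cursor(host="localhost", port=8080,
                      catalog="aerospike", schema="test"):
    """Open a cursor through PyHive's Trino dialect (requires PyHive)."""
    from pyhive import trino  # imported lazily; only needed for a live server

    connection = trino.connect(host=host, port=port,
                               catalog=catalog, schema=schema)
    return connection.cursor()


# Statements are sent without a trailing semicolon.
STATEMENTS = [
    "SHOW CATALOGS",
    "SHOW TABLES",
    "SELECT * FROM demo LIMIT 5",  # "demo" is a hypothetical table name
]
```

Against a running server, run_statements(open_trino_cursor(), STATEMENTS) returns a dictionary that maps each statement to its result rows, which is convenient to inspect in a Jupyter notebook.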

In conclusion, with the deployment options and best-in-class support offered by Aerospike and Starburst, you can go from zero to insights and meet your business needs. The time that you would otherwise spend dealing with infrastructure issues can now be spent on the tasks that matter most to your business.

About Author

Yevgeny is a Senior Software Engineer at Aerospike, where he’s responsible for a number of projects in the Aerospike Ecosystems group. He is an open-source contributor and polyglot technologist with more than 10 years of experience in various engineering positions.