Deploying Trino Clusters and the Trino Connector in Docker
You can deploy a Trino cluster and the Trino connector in Docker containers. You build a single image that contains the connector and a specific Trino version. Then, you launch the image by running the docker run command.
There are two different deployment modes. You can choose the mode that better supports your use case.
Standalone
Runs a single Trino node and the Trino connector in a single container. The Trino node runs both the coordinator and a worker. This mode can help you with rapid application prototyping when you use it together with an Aerospike database launched within a separate container.
Distributed
Runs multiple containers, each running a node of a Trino cluster and an instance of the Trino connector. This mode offers the scale required for production workloads.
No matter which deployment mode you choose, you can configure the connector either by setting environment variables in the docker run command or by using Docker bind mounts to mount your configuration files.
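The two configuration routes can be sketched as follows. This is a minimal illustration, not the connector's documented invocation: the setting name EXAMPLE_SETTING and the container path /etc/trino are assumptions made for the example, and the names the image actually reads should be taken from the repository itself. Port 8080 is Trino's default HTTP port. The functions only print the docker run command so that you can review it before running it.

```shell
#!/bin/sh
# Sketch only: prints docker run commands instead of executing them.
IMAGE="trino-aerospike"   # image name used in the build step of this guide

# Option 1: pass a setting as an environment variable with -e.
# $1 = NAME=value pair. The names the image reads are not listed in this
# guide, so any name passed here is an assumption.
run_with_env() {
  printf 'docker run --rm -p 8080:8080 -e %s %s\n' "$1" "$IMAGE"
}

# Option 2: bind-mount a host directory of configuration files with -v.
# $1 = absolute host directory, $2 = directory inside the container (assumed).
run_with_mount() {
  printf 'docker run --rm -p 8080:8080 -v %s:%s %s\n' "$1" "$2" "$IMAGE"
}

run_with_env "EXAMPLE_SETTING=value"
run_with_mount "$PWD/docker/etc" "/etc/trino"
```

Environment variables suit a handful of settings; a bind mount is usually easier once you maintain whole configuration files.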
Prerequisites
- Ensure that your Aerospike cluster is at Aerospike Database Enterprise Edition, version 5.2 or later.
- Ensure that the feature key for Aerospike Connect for Trino is installed on each node of your Aerospike cluster.
- To use secondary index (SI) query support, ensure that you are using Aerospike Database 6.0 or later.
- If you are using Trino (formerly known as PrestoSQL), ensure that you are using a compatible version of the Trino connector.
- Download and install Docker.
- If you plan to deploy in distributed mode, ensure that you have access to a custom Docker registry or to Docker Hub for publishing Docker images for Trino worker nodes to access. For information about Docker registries, see Docker Registry in the Docker documentation. Docker Hub is located at https://hub.docker.com/.
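For distributed mode, publishing the image typically means tagging it with the registry name and pushing it once it has been built. The sketch below prints the standard docker tag and docker push commands; the registry host registry.example.com and the tag latest are hypothetical placeholders, and the image name trino-aerospike matches the one used in the build step of this guide.

```shell
#!/bin/sh
# Sketch only: prints the commands rather than running them.
# $1 = registry host or Docker Hub account (hypothetical)
# $2 = local image name
# $3 = tag
publish_commands() {
  remote="$1/$2:$3"
  printf 'docker tag %s %s\n' "$2" "$remote"
  printf 'docker push %s\n' "$remote"
}

publish_commands "registry.example.com" "trino-aerospike" "latest"
```

Each worker host then pulls the same registry-qualified image name.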
Procedure
Clone Aerospike's GitHub repository trino-aerospike.docker
Run this git command on the system where you installed Docker:
git clone https://github.com/aerospike/trino-aerospike.docker
Build a Docker image of Trino and the Trino connector
Change to the trino-aerospike.docker directory.
Run the docker build command.
This example uses default versions of Trino and the Trino connector:
docker build . -t trino-aerospike
The option -t specifies the name to give to the image. If you want to specify which version of Trino or the Trino connector to use, include the option --build-arg and the TRINO_VERSION or CONNECTOR_VERSION build arguments, as in this example:
docker build --build-arg TRINO_VERSION=351 --build-arg CONNECTOR_VERSION=1.0.0 . -t trino-aerospike
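If you switch between default and pinned versions often, a small wrapper can assemble the command for you. The following is a sketch under the assumption that you keep the desired versions in shell variables named after the build arguments; it prints the docker build command instead of running it, and it adds each --build-arg only when the corresponding variable is set.

```shell
#!/bin/sh
# Sketch only: prints the docker build command instead of running it.
# Set TRINO_VERSION and/or CONNECTOR_VERSION to pin versions; leave them
# unset or empty to fall back to the defaults defined by the image build.
build_command() {
  cmd="docker build"
  if [ -n "${TRINO_VERSION:-}" ]; then
    cmd="$cmd --build-arg TRINO_VERSION=$TRINO_VERSION"
  fi
  if [ -n "${CONNECTOR_VERSION:-}" ]; then
    cmd="$cmd --build-arg CONNECTOR_VERSION=$CONNECTOR_VERSION"
  fi
  printf '%s . -t trino-aerospike\n' "$cmd"
}

build_command
```

Review the printed command, then run it from the trino-aerospike.docker directory.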
Configure Trino properties
If you need to configure Trino properties to suit your test or production environment, edit the node.properties, jvm.config, config.properties, and log.properties files in /trino-aerospike.docker/docker/etc. For more information on how to edit them, see the Trino documentation for the version that you are using.
Later, when you deploy a Trino cluster, you mount the directory that contains these files into each container that you create with the docker run command.
(Optional) Specify your Trino schemas
Specify the Trino schemas that correspond to the Aerospike sets your client applications will query.
By default, the Trino connector uses heuristics to rapidly infer schemas without requiring you to specify them. However, you can choose to provide the schemas.
Later, when you deploy a Trino cluster, you mount the directory that contains these files to each container that you create with the docker run command.
Deploy Trino and the Trino connector in Docker
What to do next
Run test queries against your Aerospike database by using the Trino CLI. For examples of how to do this, see Examples of Querying Aerospike Databases with the Trino CLI.
After testing, you can connect to Trino by using the business-intelligence or visualization tool of your choice, such as Tableau Desktop (version 2020.3 or later) or Jupyter Notebook, and start your analysis of data in Aerospike.
No matter which tool you use for your data analysis, remember that:
- You must use the name you catalogued for your Aerospike database as the name of the Trino catalog. This name is also used for the .properties file in which configuration properties are set for the interactions between your Aerospike database and the Trino connector.
- You must use the name of the Aerospike namespace where your data resides as the name of the Trino schema.
- Sets in Aerospike correspond to tables in Trino.
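The naming rules above mean that a fully qualified Trino table name has the form catalog.namespace.set. The sketch below composes such a name and prints a Trino CLI command that queries it. The catalog name aerospike, namespace test, and set users are hypothetical, the server address assumes a coordinator on localhost at Trino's default port 8080, and the executable name trino assumes that the CLI is installed under that name.

```shell
#!/bin/sh
# Sketch only: prints a Trino CLI command rather than running it.
# $1 = Trino catalog name
# $2 = Aerospike namespace (used as the Trino schema)
# $3 = Aerospike set (used as the Trino table)
qualified_table() {
  printf '%s.%s.%s\n' "$1" "$2" "$3"
}

# "aerospike", "test", and "users" are hypothetical names.
table="$(qualified_table aerospike test users)"
printf "trino --server localhost:8080 --execute 'SELECT * FROM %s LIMIT 5'\n" "$table"
```

Replace the three names with your own catalog, namespace, and set before running the printed command.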