Deploying in Distributed Mode in Docker

In the distributed mode, you use the Docker image that you built in "Deploying Trino Clusters and the Trino Connector in Docker" to create:

  • A container that runs the Trino connector and the coordinator node of a Trino cluster.
  • One or more containers that each run the Trino connector and a worker node of the Trino cluster.

Distributed mode is more suitable for production environments than standalone mode.

Prerequisites

  • Build an image of Trino and the Trino connector, as described in "Deploying Trino Clusters and the Trino Connector in Docker".
  • Ensure that you have access to Docker Hub or a custom Docker registry, and place the image either in Docker Hub or in the registry, so that each Trino worker node that you start has access to it.

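Options for running a distributed environment

There are two options, each covered by its own procedure below: configure the Trino connector with environment variables, or configure it through the aerospike.properties configuration file.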

Procedure for using environment variables to configure the Trino connector

Consider using this option when you do not plan to use an aerospike.properties file. Running a Trino cluster and the Trino connector this way can help you rapidly prototype running SQL queries against your Aerospike cluster.

Start a coordinator node

Run a docker run command to start a container that hosts and starts the coordinator node.

The environment variables let you specify values for a subset of the connector settings that the configuration properties cover.

Use the -e option for each environment variable that you want to include.

  • Include the AS_HOSTLIST environment variable.

  • Include the TRINO_NODE_TYPE environment variable and set it to coordinator.

Here is an example, where:

  • --rm removes the container automatically when it exits.
  • -p 8080:8080 publishes port 8080 of the container on port 8080 of the host system.
  • <container name> is the name used to reference the container within a Docker network.
  • <image name> is the name given to the Docker image when it was built.

docker run --rm -p 8080:8080 -e AS_HOSTLIST=docker.for.mac.host.internal:3000 -e TRINO_NODE_TYPE=coordinator --name <container name> <image name>

If you configured Trino properties, include this option to mount the files into the /etc/trino directory in the container:

-v /docker/etc:/etc/trino

where /docker/etc is the directory in trino-aerospike.docker, the repository that you cloned in a previous step, that contains the configuration files that you edited. If the Trino configuration files that you edited are in a different directory, use the path of that directory instead. You can also mount individual files, as in this example:

-v /docker/etc/log.properties:/etc/trino/log.properties

If you are providing the Trino schemas, include this option to mount the files into the /etc/trino/aerospike directory in the container:

-v <path-to-JSON-files>:/etc/trino/aerospike

where <path-to-JSON-files> is the path of the folder on your system in which you placed the JSON files.
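
The options described above can be combined into a single command. The following is a minimal sketch, not part of the product: coordinator_cmd is a hypothetical helper that only prints the docker run command, so that you can review it before running it. It assumes that the paths you pass contain no spaces, because it does not quote them.

```shell
# Hypothetical helper: print (do not run) the docker run command for a coordinator.
# Arguments: container name, image name, Aerospike host list,
#            optional Trino config directory, optional schema (JSON) directory.
coordinator_cmd() (
  name=$1 image=$2 hostlist=$3 etc_dir=${4:-} schema_dir=${5:-}
  cmd="docker run --rm -p 8080:8080 -e AS_HOSTLIST=${hostlist} -e TRINO_NODE_TYPE=coordinator"
  # Mount the edited Trino configuration files only when a directory is given.
  if [ -n "$etc_dir" ]; then cmd="${cmd} -v ${etc_dir}:/etc/trino"; fi
  # Mount the schema JSON files only when a directory is given.
  if [ -n "$schema_dir" ]; then cmd="${cmd} -v ${schema_dir}:/etc/trino/aerospike"; fi
  printf '%s\n' "${cmd} --name ${name} ${image}"
)
```

For example, coordinator_cmd trino-coordinator trino-aerospike docker.for.mac.host.internal:3000 /docker/etc prints a command that mounts /docker/etc and omits the schema mount. The container name trino-coordinator is only an illustration.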

Start a worker node

Run a docker run command to start a container that hosts and starts a worker node, or run a bash script if you want to start multiple worker nodes, each in a separate container.

Use the same command that you used to start the coordinator node. However, change the value of the environment variable TRINO_NODE_TYPE to worker.

You must also set a value for the TRINO_DISCOVERY_URI environment variable if you set a non-default URI for the coordinator node. You can set this URI in the Trino configuration file config.properties in step 3 of "Deploying Trino Clusters and the Trino Connector in Docker". The default is http://localhost:8080. When you open the URI in a web browser, Trino shows a dashboard with which you can monitor the health of your distributed cluster.
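
Besides opening the dashboard in a browser, you can check from a script whether the coordinator has finished starting. The following is a sketch under assumptions: it relies on the /v1/info REST endpoint that Trino servers normally expose and on the curl tool being installed. The function names are hypothetical.

```shell
# Build the info URL from a discovery URI, dropping one trailing slash if present.
info_url() { printf '%s/v1/info\n' "${1%/}"; }

# Succeed when the JSON read from standard input reports "starting": false.
is_ready_json() { grep -Eq '"starting"[[:space:]]*:[[:space:]]*false'; }

# Succeed when the coordinator at the given URI answers and is no longer starting.
coordinator_ready() { curl -fsS "$(info_url "$1")" | is_ready_json; }
```

For example, coordinator_ready http://localhost:8080 returns a zero exit status once the coordinator at the default URI is up. A worker that uses a non-default coordinator URI would be started with an extra option such as -e TRINO_DISCOVERY_URI=<coordinator URI>.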

If you want to create a bash script to start multiple worker nodes, ensure that each instance of the docker run command has a different value for the --name option. When the workers run on the same host, each instance also needs the -d option, so that the command returns and the script can continue to the next worker, and a different host port in the -p option, because only one container can publish a given host port. For example, your bash script might look like this, where trino-aerospike is the name of the Docker image:

docker run -d --rm -p 8081:8080 -e AS_HOSTLIST=docker.for.mac.host.internal:3000 -e TRINO_NODE_TYPE=worker --name trino-worker-1 trino-aerospike

docker run -d --rm -p 8082:8080 -e AS_HOSTLIST=docker.for.mac.host.internal:3000 -e TRINO_NODE_TYPE=worker --name trino-worker-2 trino-aerospike

docker run -d --rm -p 8083:8080 -e AS_HOSTLIST=docker.for.mac.host.internal:3000 -e TRINO_NODE_TYPE=worker --name trino-worker-3 trino-aerospike

docker run -d --rm -p 8084:8080 -e AS_HOSTLIST=docker.for.mac.host.internal:3000 -e TRINO_NODE_TYPE=worker --name trino-worker-4 trino-aerospike
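
A loop can generate the same kind of commands for any number of workers. The sketch below is only an illustration: worker_cmds is a hypothetical helper that prints the commands rather than running them. It assumes that all workers share one host, so it adds -d and gives each worker its own host port (8080 plus the worker number). Those are choices of this sketch, not requirements of the image.

```shell
# Hypothetical helper: print one docker run command per worker.
# Arguments: number of workers, image name, Aerospike host list.
worker_cmds() (
  count=$1 image=$2 hostlist=$3
  i=1
  while [ "$i" -le "$count" ]; do
    # Unique host port and unique container name for each worker.
    printf 'docker run -d --rm -p %s:8080 -e AS_HOSTLIST=%s -e TRINO_NODE_TYPE=worker --name trino-worker-%s %s\n' \
      "$((8080 + i))" "$hostlist" "$i" "$image"
    i=$((i + 1))
  done
)
```

To start the workers after reviewing the output, pipe it to a shell, for example: worker_cmds 4 trino-aerospike docker.for.mac.host.internal:3000 | sh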

Procedure for configuring the Trino connector through a configuration file

With this option, you configure the connector by setting values for its configuration properties in the aerospike.properties file. The number of properties that you can configure with this file is greater than the number that you can configure with the environment variables.

When you run the docker run command, you mount that file's directory in the container that the command creates.

Set values in the aerospike.properties file

Set values for configuration properties in aerospike.properties. In the repository that you cloned in a previous step, the path to this file is /trino-aerospike.docker/docker/etc/catalog/aerospike.properties. By default, the file in the repository includes these two entries. If you are running Docker on Linux, replace docker.for.mac.host.internal with localhost.

connector.name=aerospike
aerospike.hostlist=docker.for.mac.host.internal:3000

If you have an existing aerospike.properties file that you want to use, be sure to change the value of aerospike.hostlist.
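
If you prefer to generate the file rather than edit it by hand, a small script can write the two default entries. This is a sketch with hypothetical function names. It follows the note above by choosing localhost on Linux and docker.for.mac.host.internal otherwise, and it assumes that the Aerospike port is 3000, as in the default entries.

```shell
# Pick the default host list for the current operating system.
default_hostlist() (
  case "$(uname -s)" in
    Linux) printf 'localhost:3000\n' ;;
    *)     printf 'docker.for.mac.host.internal:3000\n' ;;
  esac
)

# Write aerospike.properties into the given catalog directory.
# Arguments: catalog directory, optional host list (overrides the default).
write_catalog() (
  dir=$1 hostlist=${2:-$(default_hostlist)}
  mkdir -p "$dir" &&
    printf 'connector.name=aerospike\naerospike.hostlist=%s\n' "$hostlist" \
      > "$dir/aerospike.properties"
)
```

For example, write_catalog ./docker/etc/catalog my-aerospike-host:3000 creates the directory if needed and overwrites any existing aerospike.properties in it. The host name my-aerospike-host is only a placeholder.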

Create a container and start the coordinator

Run the docker run command to create a container and to start the coordinator node and the Trino connector within the container.

To mount the aerospike.properties file into the /etc/trino/catalog directory in the container, use the following command:

docker run --rm -p 8080:8080 -e TRINO_NODE_TYPE=coordinator -v <path-to-the-aerospike.properties-file>:/etc/trino/catalog \
--name <container name> <image name>

where:

  • --rm removes the container automatically when it exits.
  • -p 8080:8080 publishes port 8080 of the container on port 8080 of the host system.
  • <path-to-the-aerospike.properties-file> is the path of the directory on your system that contains the aerospike.properties file. The directory, not the file itself, is mounted at /etc/trino/catalog.
  • <container name> is the name used to reference the container within a Docker network.
  • <image name> is the name given to the Docker image when it was built.

If you configured Trino properties, include this option to mount the files into the /etc/trino directory in the container:

-v /docker/etc:/etc/trino

where /docker/etc is the directory in trino-aerospike.docker, the repository that you cloned in a previous step, that contains the configuration files that you edited. If the Trino configuration files that you edited are in a different directory, use the path of that directory instead. You can also mount individual files, as in this example:

-v /docker/etc/log.properties:/etc/trino/log.properties

If you are providing the Trino schemas, include this option to mount the files into the /etc/trino/aerospike directory in the container:

-v <path-to-JSON-files>:/etc/trino/aerospike

where <path-to-JSON-files> is the path of the folder on your system in which you placed the JSON files.

Start one or more worker nodes

Run the same docker run command that you used to start the coordinator node. However, change the value of the environment variable TRINO_NODE_TYPE to worker.

You must also set a value for the TRINO_DISCOVERY_URI environment variable if you set a non-default URI for the coordinator node. You can set this URI in the Trino configuration file config.properties in step 3 of "Deploying Trino Clusters and the Trino Connector in Docker". The default is http://localhost:8080. When you open the URI in a web browser, Trino shows a dashboard with which you can monitor the health of your distributed cluster.

If you want to create a bash script to start multiple worker nodes, ensure that each instance of the docker run command has a different value for the --name option. When the workers run on the same host, each instance also needs the -d option, so that the command returns and the script can continue to the next worker, and a different host port in the -p option, because only one container can publish a given host port. For example, your bash script might look like this, where trino-aerospike is the name of the Docker image:

docker run -d --rm -p 8081:8080 -e TRINO_NODE_TYPE=worker -v "$HOME/Documents/Trino/docker/etc/catalog":/etc/trino/catalog \
-v "$(pwd)"/docker/etc/aerospike:/etc/trino/aerospike --name trino-worker-1 trino-aerospike

docker run -d --rm -p 8082:8080 -e TRINO_NODE_TYPE=worker -v "$HOME/Documents/Trino/docker/etc/catalog":/etc/trino/catalog \
-v "$(pwd)"/docker/etc/aerospike:/etc/trino/aerospike --name trino-worker-2 trino-aerospike

docker run -d --rm -p 8083:8080 -e TRINO_NODE_TYPE=worker -v "$HOME/Documents/Trino/docker/etc/catalog":/etc/trino/catalog \
-v "$(pwd)"/docker/etc/aerospike:/etc/trino/aerospike --name trino-worker-3 trino-aerospike

docker run -d --rm -p 8084:8080 -e TRINO_NODE_TYPE=worker -v "$HOME/Documents/Trino/docker/etc/catalog":/etc/trino/catalog \
-v "$(pwd)"/docker/etc/aerospike:/etc/trino/aerospike --name trino-worker-4 trino-aerospike
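
Because every worker mounts the same catalog directory, a script can check once that the directory contains aerospike.properties before it produces any command. The following sketch uses a hypothetical helper, worker_cmds_with_catalog, that prints the commands instead of running them. It assumes one shared host (hence -d and a separate host port for each worker) and paths without spaces.

```shell
# Hypothetical helper: print one docker run command per worker, after checking
# that the catalog directory really contains aerospike.properties.
# Arguments: number of workers, image name, catalog directory,
#            optional schema (JSON) directory.
worker_cmds_with_catalog() (
  count=$1 image=$2 catalog_dir=$3 schema_dir=${4:-}
  if [ ! -f "$catalog_dir/aerospike.properties" ]; then
    printf 'aerospike.properties not found in %s\n' "$catalog_dir" >&2
    exit 1
  fi
  mounts="-v ${catalog_dir}:/etc/trino/catalog"
  if [ -n "$schema_dir" ]; then mounts="${mounts} -v ${schema_dir}:/etc/trino/aerospike"; fi
  i=1
  while [ "$i" -le "$count" ]; do
    printf 'docker run -d --rm -p %s:8080 -e TRINO_NODE_TYPE=worker %s --name trino-worker-%s %s\n' \
      "$((8080 + i))" "$mounts" "$i" "$image"
    i=$((i + 1))
  done
)
```

The function returns a nonzero status and prints nothing to standard output when the file is missing, so piping its output to a shell starts no containers in that case.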