Scaling Kafka Sink Connectors
When Kafka Connect is running in distributed mode, you can increase the throughput of changes from Kafka to your Aerospike cluster. You can do so by adding more workers to your Kafka Connect cluster, by adding more tasks for running the connector, or by doing both.
Before deciding to use any of these options, ensure that you understand the implications of each on the available capacity in your Kafka Connect cluster.
Check existing distribution
Run this curl command to view how tasks for the connector are currently distributed in your Kafka Connect cluster:
curl -X GET --header "Content-Type:application/json" ${kafkaEndpoint}/connectors/aerospike-sink/status
where kafkaEndpoint is the REST endpoint for the Kafka Connect service. You can make requests to any cluster member; the REST API automatically forwards requests, if required.
This sample output shows that the Kafka inbound connector is running on 192.168.0.1:8083, and that it is divided into two tasks, each running on a separate worker.
HTTP/1.1 200 OK
{
  "name": "aerospike-sink",
  "connector": {
    "state": "RUNNING",
    "worker_id": "192.168.0.1:8083"
  },
  "tasks": [
    {
      "id": 0,
      "state": "RUNNING",
      "worker_id": "192.168.0.1:8083"
    },
    {
      "id": 1,
      "state": "RUNNING",
      "worker_id": "192.168.0.2:8083"
    }
  ]
}
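When a connector has many tasks, the raw status output can be hard to scan. The following is a minimal sketch, assuming the jq command-line tool is installed, that condenses the status response into one line per worker with the number of tasks it hosts. The function name tasks_per_worker is illustrative and is not part of Kafka Connect or the Aerospike connector.

```shell
# Sketch: summarize how a connector's tasks are spread across workers.
# Assumes jq is installed; reads the /status JSON on standard input.
# The function name is hypothetical.
tasks_per_worker() {
  # Group the task entries by worker_id and print "<worker> <task count>".
  jq -r '.tasks | group_by(.worker_id)[] | "\(.[0].worker_id) \(length)"'
}

# Example usage against a live cluster:
#   curl -s "${kafkaEndpoint}/connectors/aerospike-sink/status" | tasks_per_worker
```

For the sample output above, this prints one line for each of the two workers, each with a count of 1.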
Add tasks
If you want to add one or more tasks, follow these steps:
Set tasks.max
Set this shell variable, quoting the JSON so that it is stored as a single string:
aerosink='{"tasks.max": "<value>"}'
tasks.max: The new maximum number of tasks that can be created for the connector.
Set kafkaEndpoint
On the same system, set this variable:
kafkaEndpoint="<URI>"
kafkaEndpoint: This is the REST endpoint for the Kafka Connect service. You can make requests to any cluster member; the REST API automatically forwards requests, if required.
Update the tasks
Issue a request to Kafka Connect's REST interface. The request updates all of the connector tasks together.
curl -X PUT --header "Content-Type:application/json" --data "${aerosink}" "${kafkaEndpoint}/connectors/aerospike-sink/config"
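Kafka Connect's PUT /connectors/<name>/config endpoint treats the request body as the connector's complete configuration rather than as a partial update, so a payload that contains only tasks.max may be rejected or may drop other settings. One cautious approach is to read the current configuration, change the one key, and send the whole configuration back. The sketch below assumes jq is installed; the function names are hypothetical and are not part of any Kafka or Aerospike tooling.

```shell
# Sketch: change only tasks.max while preserving the rest of the connector config.
# Assumes jq is installed. Function names are hypothetical.

# Read a connector config (a JSON object) on stdin and set tasks.max to $1.
with_tasks_max() {
  jq -c --arg n "$1" '.["tasks.max"] = $n'
}

# Fetch the current config, change tasks.max, and PUT the full config back.
# $1 = Kafka Connect REST endpoint, $2 = connector name, $3 = new tasks.max
update_tasks_max() {
  curl -s "$1/connectors/$2/config" \
    | with_tasks_max "$3" \
    | curl -s -X PUT --header "Content-Type:application/json" \
        --data @- "$1/connectors/$2/config"
}

# Example usage:
#   update_tasks_max "${kafkaEndpoint}" aerospike-sink 4
```

Keeping the merge step in its own function makes it possible to inspect the resulting payload before anything is sent to the cluster.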
Verify changes
Use this GET request to verify the connector is using the changed configuration:
curl -X GET --header "Content-Type:application/json" ${kafkaEndpoint}/connectors/aerospike-sink/status
Repeat this process if you want to add more tasks.
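The verification step for added tasks can also be scripted. The following is a rough sketch, assuming jq is installed, of two helpers that read the status response: one prints the number of tasks, and the other succeeds only when the connector and every task report RUNNING. Both function names are hypothetical.

```shell
# Sketch: check a connector's /status JSON, read from standard input.
# Assumes jq is installed. Function names are hypothetical.

# Print how many tasks the connector currently has.
task_count() {
  jq '.tasks | length'
}

# Exit 0 only if there is at least one task and the connector
# and all of its tasks are in the RUNNING state.
all_running() {
  jq -e '(.tasks | length) > 0
         and .connector.state == "RUNNING"
         and all(.tasks[]; .state == "RUNNING")' >/dev/null
}

# Example usage: poll until everything is running.
#   until curl -s "${kafkaEndpoint}/connectors/aerospike-sink/status" | all_running; do
#     sleep 5
#   done
```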
Add workers
If you want to add more workers to your Kafka Connect cluster, follow these steps.
Launch each worker
Run this command to launch each additional worker in distributed mode.
bin/connect-distributed <path-to-your-Kafka-Connect-config-file>
<path-to-your-Kafka-Connect-config-file>: The path to the file (including the filename and extension) that you are using to configure the workers in Kafka Connect.
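A new worker joins an existing Kafka Connect cluster only when its configuration names the same group.id (and the same internal config, offset, and status storage topics) as the workers that are already running. A small pre-launch check along the following lines can catch a mismatched file. This is a sketch only: the helper name and the example file names are hypothetical, and it handles only simple key=value lines.

```shell
# Sketch: print the group.id set in a Kafka Connect worker properties file.
# The function name is hypothetical. $1 = path to the properties file.
worker_group_id() {
  # Match "group.id=<value>" (commented-out lines do not match), trim
  # surrounding whitespace, and print the value of the last occurrence.
  sed -n 's/^[[:space:]]*group\.id[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$1" \
    | tail -n 1
}

# Example usage (file names are placeholders):
#   [ "$(worker_group_id existing-worker.properties)" = "$(worker_group_id new-worker.properties)" ] \
#     || echo "group.id differs: the new worker would not join the same cluster"
```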
Verify changes
After a few minutes, use this GET request to view how tasks for the connector are now distributed in your Kafka Connect cluster:
curl -X GET --header "Content-Type:application/json" ${kafkaEndpoint}/connectors/aerospike-sink/status
Repeat this process if you want to add more workers.