Amazon S3-Bucket sink connector

This document makes the following assumptions:

  • You already have access to an AWS account

  • You have an S3 bucket (named my-s3-kafka-connect-target in this document's examples) within this AWS account

  • You have an IAM (service) account with read/write permissions on the bucket

  • You have the access key ID and secret access key of that (service) account

  • You have access to a Kafka producer, which you can configure and run whenever you want
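
If you want to verify these assumptions up front, the following minimal boto3 sketch round-trips a test object through the bucket. The credentials, region, and bucket name are the example values used throughout this document; replace them with your own.

    import boto3

    # Example values from this document; replace with your own credentials,
    # region, and bucket name.
    s3 = boto3.client(
        "s3",
        region_name="eu-central-1",
        aws_access_key_id="AYEM7RPD4TAXLHLPM333",
        aws_secret_access_key="Hc89Cnwp3MnNvmYdJRzlOPOe2WZFWtXt7FndjRCi",
    )

    # Write, read back, and delete a test object to prove read/write access.
    s3.put_object(Bucket="my-s3-kafka-connect-target", Key="access-check", Body=b"ok")
    assert s3.get_object(Bucket="my-s3-kafka-connect-target", Key="access-check")["Body"].read() == b"ok"
    s3.delete_object(Bucket="my-s3-kafka-connect-target", Key="access-check")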

Configuring a new sink connector

  1. Follow the Creating Streams section of the stream management documentation in order to create one stream and deploy it onto an environment.
    The name of the stream will be my_s3_connector_source.
    The key/value types will be String.

  2. Follow the Configuring Connector-Applications documentation to set up a new connector application.
    Let’s call it my_s3_sink. The plugin name is "io.aiven.kafka.connect.s3.AivenKafkaConnectS3SinkConnector".
    The configuration values you will need to supply are listed in step 3 below.
    Configure the security certificate as instructed.

  3. Configuring the Connector-application deployment:

    • Provide the following minimal configuration in order to connect to the previously configured Amazon S3 bucket; a scripted equivalent is sketched after this list.
      For advanced configuration, see the official connector documentation.

      Configuration key        Example value
      aws.access.key.id        AYEM7RPD4TAXLHLPM333
      aws.secret.access.key    Hc89Cnwp3MnNvmYdJRzlOPOe2WZFWtXt7FndjRCi
      aws.s3.region            eu-central-1 (use the bucket's real region)
      aws.s3.bucket.name       my-s3-kafka-connect-target
      format.output.type       json
      topics                   my_s3_connector_source
      key.converter            org.apache.kafka.connect.storage.StringConverter
      value.converter          org.apache.kafka.connect.storage.StringConverter

  4. Authorize the my_s3_sink Connector-Application to consume the my_s3_connector_source topic.

  5. You can now start the Connector-Application.

  6. Produce some String/String events to this stream. Follow the producer documentation and examples if needed; a minimal producer sketch also follows this list.

  7. You can now check the S3 bucket to see the events published by the connector. The bucket will contain one or more compressed files (gzip by default) which you can download and decompress; each decompresses to a JSON file containing the values you produced. A scripted check is sketched after this list.
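
As a reference for step 3, the same minimal configuration can be expressed as a standard Kafka Connect connector definition. The sketch below submits it to a plain Kafka Connect worker over its REST API; the localhost:8083 endpoint is a hypothetical local worker, not part of the Axual setup, where Self-Service takes these keys through the deployment form instead.

    import requests

    # Hypothetical local Kafka Connect worker; in Axual Self-Service these
    # keys are entered in the Connector-Application deployment form instead.
    connector = {
        "name": "my_s3_sink",
        "config": {
            "connector.class": "io.aiven.kafka.connect.s3.AivenKafkaConnectS3SinkConnector",
            "aws.access.key.id": "AYEM7RPD4TAXLHLPM333",
            "aws.secret.access.key": "Hc89Cnwp3MnNvmYdJRzlOPOe2WZFWtXt7FndjRCi",
            "aws.s3.region": "eu-central-1",
            "aws.s3.bucket.name": "my-s3-kafka-connect-target",
            "format.output.type": "json",
            "topics": "my_s3_connector_source",
            "key.converter": "org.apache.kafka.connect.storage.StringConverter",
            "value.converter": "org.apache.kafka.connect.storage.StringConverter",
        },
    }
    requests.post("http://localhost:8083/connectors", json=connector).raise_for_status()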
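
For step 6, here is a minimal String/String producer sketch using the kafka-python client. The broker address and certificate file names are placeholders; take the real connection details from your environment and use the security certificate configured in step 2.

    from kafka import KafkaProducer

    # Placeholder broker address and TLS files; use the values provided for
    # your environment and the certificate configured in step 2.
    producer = KafkaProducer(
        bootstrap_servers="broker.example.com:9093",
        security_protocol="SSL",
        ssl_cafile="ca.crt",
        ssl_certfile="client.crt",
        ssl_keyfile="client.key",
        key_serializer=str.encode,
        value_serializer=str.encode,
    )

    # Produce a few String/String events to the stream created in step 1.
    for i in range(10):
        producer.send("my_s3_connector_source", key=f"key-{i}", value=f"value-{i}")
    producer.flush()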
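
And for step 7, rather than downloading the files by hand, a short boto3 loop can list and print them. This assumes the connector's default gzip compression for the output files.

    import gzip
    import boto3

    s3 = boto3.client(
        "s3",
        region_name="eu-central-1",
        aws_access_key_id="AYEM7RPD4TAXLHLPM333",
        aws_secret_access_key="Hc89Cnwp3MnNvmYdJRzlOPOe2WZFWtXt7FndjRCi",
    )

    # List the objects the connector wrote and print their decompressed
    # JSON contents.
    bucket = "my-s3-kafka-connect-target"
    for obj in s3.list_objects_v2(Bucket=bucket).get("Contents", []):
        body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
        print(obj["Key"])
        print(gzip.decompress(body).decode())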

Cleanup

Once you are done, stop the Connector-Application and clean up the unused Axual resources.
Don’t forget to return to AWS and delete your service account and S3 bucket; a boto3 sketch for removing the bucket follows.
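
A minimal boto3 sketch for emptying and deleting the example bucket is shown below; the (service) account itself still has to be removed through the AWS console or the IAM API.

    import boto3

    # Same example credentials as above; replace with your own.
    s3 = boto3.resource(
        "s3",
        region_name="eu-central-1",
        aws_access_key_id="AYEM7RPD4TAXLHLPM333",
        aws_secret_access_key="Hc89Cnwp3MnNvmYdJRzlOPOe2WZFWtXt7FndjRCi",
    )
    bucket = s3.Bucket("my-s3-kafka-connect-target")
    bucket.objects.all().delete()  # a bucket must be empty before it can be deleted
    bucket.delete()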

License

Amazon S3-Bucket sink connector is licensed under the Apache License, Version 2.0.

Source code

The source code for the connector can be found at https://github.com/aiven/s3-connector-for-apache-kafka