Amazon S3 Sink Connector

Type

Sink

Class

io.aiven.kafka.connect.s3.AivenKafkaConnectS3SinkConnector

Target System

Amazon S3

Maintainer

Aiven

License

Apache License 2.0

Project

github.com/Aiven-Open/cloud-storage-connectors-for-apache-kafka

Download

GitHub Releases

This page documents version 3.4.2. Newer versions are typically compatible, but field names or default values may differ. If you notice discrepancies, please contact Axual Support.

Description

The Amazon S3 Sink Connector consumes records from Kafka topics and writes them as files into an Amazon S3 bucket.

It is maintained by Aiven as part of the open-source github.com/Aiven-Open/cloud-storage-connectors-for-apache-kafka project.

Features

  • Write Kafka records to Amazon S3 as files in JSON or other formats

  • Configurable output format: json, avro, and more

  • Configurable AWS region and bucket

When to Use

  • You need to archive or export Kafka topic data to Amazon S3 for storage or batch processing.

  • You want to build a data lake from Kafka topics.

When NOT to Use

  • You need records to be available in S3 immediately after they are produced; this connector writes files in batches (see Known limitations).

Installation

The connector is available from the project's GitHub Releases page.

  1. Navigate to the releases page and select the version matching your Kafka Connect installation.

  2. Download the JAR file.

For installation steps, see Installing Connector Plugins.

Configuration

For the complete configuration reference, see the official connector documentation.

To configure a connector in Axual Self-Service, see Starting Connectors. TIP: To speed up your deployment, use the Terraform Boilerplate or the Management API Boilerplate.

Getting Started

Prerequisites

AWS account and S3 bucket

  • You already have access to an AWS subscription.

  • You have an S3 bucket (e.g. my-s3-kafka-connect-target).

  • You have an AWS (service) account with read/write permissions on the bucket.

  • You have the secret access key and key ID of the service account.

Axual stream

The Kafka stream this connector will consume must already exist in Axual Self-Service and contain records.

Steps

Step 1 — Create a connector application

  1. Follow the Creating streams documentation to create a stream and deploy it onto an environment.
    Name the stream my_s3_connector_sink.
    Use String/String as the key/value types.

  2. Follow the Configure and install a connector documentation to set up a new Connector-Application.
    Let’s call it my_s3_sink.
    The plugin name is io.aiven.kafka.connect.s3.AivenKafkaConnectS3SinkConnector.
    If a plugin isn’t available, ask a platform operator to install plugins.

Step 2 — Configure the connector

  1. Provide the following minimal configuration:

    aws.access.key.id

    Example value:
    AYEM7RPD4TAXLHLPM333

    aws.secret.access.key

    Example value:
    Hc89Cnwp3MnNvmYdJRzlOPOe2WZFWtXt7FndjRCi

    aws.s3.region

    Example value (use your bucket's actual region):
    eu-central-1

    aws.s3.bucket.name

    Example value:
    my-s3-kafka-connect-target

    format.output.type

    Example value:
    json

    topics

    Example value:
    my_s3_connector_sink

    key.converter

    Example value:
    org.apache.kafka.connect.storage.StringConverter

    value.converter

    Example value:
    org.apache.kafka.connect.storage.StringConverter

    For advanced options, see the official connector documentation.

  2. Authorize the my_s3_sink sink Connector-Application to consume the my_s3_connector_sink stream.
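The minimal settings above can be assembled into a single configuration map before deploying. The sketch below (Python, with placeholder credential values) also sanity-checks that no field from the walkthrough was left out; the set of required fields shown is an assumption based on this guide, not an exhaustive list:

```python
import json

# Minimal connector configuration using the fields listed above.
# The credential values are placeholders, not real keys.
config = {
    "connector.class": "io.aiven.kafka.connect.s3.AivenKafkaConnectS3SinkConnector",
    "aws.access.key.id": "<your-access-key-id>",
    "aws.secret.access.key": "<your-secret-access-key>",
    "aws.s3.region": "eu-central-1",
    "aws.s3.bucket.name": "my-s3-kafka-connect-target",
    "format.output.type": "json",
    "topics": "my_s3_connector_sink",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
}

# Sanity-check before deploying (assumed required set for this walkthrough).
required = {"aws.access.key.id", "aws.secret.access.key",
            "aws.s3.region", "aws.s3.bucket.name", "topics"}
missing = required - config.keys()
assert not missing, f"missing required settings: {missing}"

print(json.dumps(config, indent=2))
```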

Step 3 — Start the connector

Start the connector application from Axual Self-Service.

Step 4 — Verify

Check the S3 bucket to see the files created by the connector. The bucket will contain one or more compressed archives which you can download and decompress. Their contents will be JSON files containing the values you produced.

You can produce test String/String events to the stream to confirm data flow.
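To inspect a downloaded archive locally, a sketch like the following decompresses it and parses the records. It assumes gzip compression and that the json output format writes each file as a single JSON array; adjust for your connector settings if they differ:

```python
import gzip
import json

def read_connector_output(gz_bytes: bytes):
    """Decompress a downloaded archive and parse its records.

    Assumes gzip compression and a file body that is one JSON array.
    """
    text = gzip.decompress(gz_bytes).decode("utf-8")
    return json.loads(text)

# Simulate a downloaded object for illustration:
sample = gzip.compress(
    json.dumps([{"value": "hello"}, {"value": "world"}]).encode("utf-8"))
records = read_connector_output(sample)
print(records)  # → [{'value': 'hello'}, {'value': 'world'}]
```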

Cleanup

When you are done:

  1. Stop the connector application in Axual Self-Service.

  2. Remove stream access for the application if no longer needed.

  3. Return to AWS and delete your service account and S3 bucket.

Known limitations

  • Files are written in batches — records do not appear in S3 immediately after being consumed from Kafka.
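Flush frequency can often be tuned. This connector family provides a file.max.records setting that caps the number of records per output file (with the default of 0, files are flushed on offset commit); verify the exact name and default against the official connector documentation for your version. For example:

```json
{
  "file.max.records": "500"
}
```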

Examples

Minimal configuration

{
  "name": "my-s3-sink",
  "config": {
    "connector.class": "io.aiven.kafka.connect.s3.AivenKafkaConnectS3SinkConnector",
    "aws.access.key.id": "<your-access-key-id>",
    "aws.secret.access.key": "<your-secret-access-key>",
    "aws.s3.region": "eu-central-1",
    "aws.s3.bucket.name": "my-s3-kafka-connect-target",
    "format.output.type": "json",
    "topics": "my_s3_connector_sink",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter"
  }
}

License

The Amazon S3 Sink Connector is licensed under the Apache License, Version 2.0.