Offset Distribution

Kafka consumer applications can store their position (offset) for one or more topic partitions inside Kafka. This allows a consumer application to continue consuming where it left off after it is stopped or fails. Offset Distribution makes sure the committed offsets are distributed to all Kafka Clusters of the tenant instance. This enables consumer applications to migrate from one Kafka Cluster to another. An example is an application moving from an on-premises installation to the cloud and switching to the cloud Kafka Cluster.

The Distribution Model controls whether committed offsets are distributed, and to which clusters they are distributed.

How it works

The Offset Distributor syncs consumer group offsets from the source cluster to a target cluster, so that consumer applications can start on the target cluster without having to process all data on the topic.
Suppose messages are distributed to a new Kafka cluster using the Message Distributor. No consumer group offsets are stored on that cluster, so a new consumer application using the same group ID could process weeks of messages again.
The Offset Distributor syncs the consumer group offsets frequently by determining timestamps for all consumer groups on all Kafka topics, using data gathered from the __consumer_offsets topic of the source cluster.
On the target cluster, an Offset Committer receives the offset timestamps from the Offset Distributor and sets the consumer group offsets based on those timestamps.
This process has some delay (below 1 minute), so some limited double processing of data may still occur.
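
To make the idea of setting an offset "based on the timestamps" more concrete, here is a minimal sketch. It is not the Offset Committer's actual implementation. It assumes the lookup rule is "the first offset whose timestamp is at or after the synced timestamp", which is the rule Kafka's own timestamp lookup (the consumer API's offsetsForTimes) uses. The function name offset_for_timestamp and the "offset timestamp" input format are hypothetical and exist only for this illustration.

```shell
# Hypothetical helper: reads "offset timestamp" pairs from stdin, sorted by
# ascending offset, and prints the first offset whose timestamp is at or
# after the timestamp given as $1. Prints "none" if no such record exists.
offset_for_timestamp() {
  awk -v ts="$1" '
    # First record at or after the requested timestamp wins.
    $2 >= ts + 0 { print $1; found = 1; exit }
    # No record matched: nothing to resume from yet.
    END { if (!found) print "none" }
  '
}
```

Under this assumed rule, a consumer group that last committed at a given time on the source cluster would resume on the target cluster at the first message produced at or after that time.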

Check status

From an operational standpoint, it is important to verify:

  • Is the offset distributor running on the source cluster?

  • Is the offset committer running on the target cluster?

In combination with the Message Distributor, a bilateral or multilateral Distribution setup may result in 4 connector tasks running on each Kafka cluster:

  • message distributor

  • offset distributor

  • offset committer

  • (optional) schema distributor

Enabling Offset Distributor with Distributor Helm Charts

See the Deploying Distributor page for instructions on deploying the Offset Distributor.

Enabling Offset Committer with Distributor Helm Charts

See the Deploying Distributor page for instructions on deploying the Offset Committer.

Check functionality

Offset Distributor

To check whether the Offset Distributor on the source cluster is running (apart from checking its status, logs, etc.), you can get consumer group details from the broker, for example by opening a shell to the Pod (or by using RedPanda).

  • First, find the offset-distributor consumer group:

    • /opt/kafka/bin/kafka-consumer-groups.sh --command-config /tmp/client --bootstrap-server $kafkahost --list
      The offset-distributor consumer group is named _<cluster_name>-offset-distributor-level-X-to-X,
      for example: _axual-demo-cluster01-offset-distributor-level-31-to-target

  • Then describe the consumer group:

    • /opt/kafka/bin/kafka-consumer-groups.sh --command-config /tmp/client --bootstrap-server $kafkahost --describe --group _<cluster_name>-offset-distributor-level-X-to-X

    • You should see the CURRENT-OFFSET and LOG-END-OFFSET of the consumer group for the many partitions of the __consumer_offsets topic.
      Note the LAG and verify that it decreases periodically; this shows that the Offset Distributor is active.
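
Watching the LAG column across the many partitions of __consumer_offsets by eye can be tedious, so the check above can be condensed into a single number. The following is a minimal sketch, assuming the --describe output has a header row containing a LAG column (as current Kafka versions print). The function name total_lag is hypothetical.

```shell
# Hypothetical helper: sums the LAG column of `kafka-consumer-groups.sh --describe`
# output read from stdin. The column is located by its header name, so its
# position is not hard-coded.
total_lag() {
  awk '
    # Until the header row is seen, look for the column named LAG.
    lag_col == 0 { for (i = 1; i <= NF; i++) if ($i == "LAG") lag_col = i; next }
    # Data rows: add numeric lag values; "-" (unknown lag) is skipped.
    $lag_col ~ /^[0-9]+$/ { sum += $lag_col }
    END { print sum + 0 }
  '
}

# Example usage (run where the Kafka CLI is available):
#   /opt/kafka/bin/kafka-consumer-groups.sh --command-config /tmp/client \
#     --bootstrap-server $kafkahost --describe \
#     --group _<cluster_name>-offset-distributor-level-X-to-X | total_lag
```

Running the pipeline a few times, some seconds apart, should show the total dropping back towards zero when the Offset Distributor is keeping up.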

Offset Distributor & Committer

To verify offset distribution on the target cluster, you can compare the offsets of all consumer groups on the source cluster with those on the target cluster.
For example, the command /opt/kafka/bin/kafka-consumer-groups.sh --command-config /tmp/client --bootstrap-server $kafkahost --describe --offsets --all-groups shows the offsets and lag of all consumer groups.
A properly working offset Distribution rapidly updates the offsets on the target cluster for consumer groups that have no active consumer application.
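
The comparison can be made easier by capturing the output of that command on both clusters and listing the committed offsets side by side. The sketch below assumes the column order GROUP, TOPIC, PARTITION, CURRENT-OFFSET printed by recent Kafka versions with --all-groups. The function name compare_offsets and the file names are hypothetical. Because the Offset Committer sets offsets based on timestamps, the two offsets need not be numerically identical; the point is to see that an entry exists on the target and keeps moving.

```shell
# Hypothetical helper: prints "group topic partition source-offset target-offset".
# $1 and $2 are non-empty files holding `--describe --offsets --all-groups`
# output captured on the source and the target cluster respectively.
compare_offsets() {
  awk '
    # Skip headers, blank lines and messages: only rows with a numeric partition count.
    $3 !~ /^[0-9]+$/ { next }
    # First file (source): remember the committed offset per group/topic/partition.
    FNR == NR { src[$1 " " $2 " " $3] = $4; next }
    # Second file (target): print both offsets, or "missing" if the source has no entry.
    {
      key = $1 " " $2 " " $3
      print key, (key in src ? src[key] : "missing"), $4
    }
  ' "$1" "$2"
}

# Example usage, with source.txt and target.txt captured beforehand:
#   compare_offsets source.txt target.txt
```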