Schema Distribution

About Schema Distribution

The Confluent Schema Registry stores the schema definitions for topics on the Kafka cluster. It allows records to be produced without embedding the full schema in each record; instead, each record refers to the schema by its identifier in the registry. Schema Distribution ensures that the schema registries on all clusters use the same identifier for the same schema, so a consumer can read a record and find the correct schema even if the record was produced on a different Kafka cluster.
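As an illustration of why the identifiers must match across clusters, the sketch below splits a serialized record into its schema identifier and payload. It assumes the standard Confluent wire format (one magic byte with value 0, followed by a 4-byte big-endian schema ID). The function names are illustrative and not part of any Distributor API.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # first byte of a record in the Confluent wire format


def frame_record(schema_id: int, body: bytes) -> bytes:
    """Prefix a serialized body with the magic byte and the schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + body


def split_record(payload: bytes) -> Tuple[int, bytes]:
    """Return (schema_id, body) for a record in the Confluent wire format."""
    if len(payload) < 5:
        raise ValueError("payload too short to contain a schema ID header")
    magic, schema_id = struct.unpack(">bI", payload[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, payload[5:]


if __name__ == "__main__":
    record = frame_record(42, b"serialized-body")
    schema_id, body = split_record(record)
    # The consumer looks up schema_id in the registry of its own cluster,
    # so that registry must map the same ID to the same schema.
    print(schema_id, body)
```

Because the record carries only the numeric ID, a consumer on another cluster would resolve the wrong schema, or none at all, if the registries assigned identifiers independently.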

The Schema Distributor does this by reading the schemas topic used by the Confluent Schema Registry and distributing its contents to the corresponding topics on all other Kafka clusters. This introduces the requirement that schemas are registered on a single cluster only, and that the schema registries on the other clusters are configured as read-only registries. The Kafka cluster with registration enabled is called the primary cluster.

Schema Distribution is started on the Distributor connected to the primary cluster and distributes the contents of the schemas topic to the remote clusters.

Figure 1. Schema Distribution with Cluster 1 as primary cluster

The subjects in the schema registry are bound to the topic name, often in the form of <topic-name>-key and <topic-name>-value for the key and value schemas. When clusters use different topic naming patterns, the Schema Distributor transforms the subject names to match the naming pattern of the target cluster.
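The subject transformation can be pictured with a minimal sketch. It assumes, purely for illustration, that each cluster's naming pattern is a fixed prefix in front of the topic name; the actual pattern syntax used by the Schema Distributor is not described here, and the function and prefix values below are hypothetical.

```python
SUBJECT_SUFFIXES = ("-key", "-value")


def transform_subject(subject: str, source_prefix: str, target_prefix: str) -> str:
    """Rewrite a subject from the source cluster's topic naming to the target's.

    Hypothetical model: a topic name is '<prefix><topic>' and a subject is
    '<topic-name>-key' or '<topic-name>-value'.
    """
    # Separate the topic name from the -key / -value suffix, if present.
    for suffix in SUBJECT_SUFFIXES:
        if subject.endswith(suffix):
            topic = subject[: -len(suffix)]
            break
    else:
        topic, suffix = subject, ""

    if not topic.startswith(source_prefix):
        raise ValueError(f"subject {subject!r} does not match the source pattern")

    # Swap the source prefix for the target prefix and restore the suffix.
    return target_prefix + topic[len(source_prefix):] + suffix


if __name__ == "__main__":
    print(transform_subject("cluster1-payments-value", "cluster1-", "c2."))
```

Only the topic part of the subject is rewritten; the -key or -value suffix is kept so the target registry still distinguishes key and value schemas.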

Enabling Schema Distributor with Distributor Helm Charts

See the Deploying Distributor page for instructions on deploying the Schema Distributor.