Schema Distribution
Schema Distribution is a legacy feature. It only works with the Confluent-based Legacy Schema Registry, because it copies that registry’s Kafka schemas topic from one cluster to the others. It does not work with Apicurio, which is now the preferred Schema Registry. If you use Apicurio, do not enable the Schema Distributor. See Schema Distribution with Apicurio.
About Schema Distribution
The Confluent-based Legacy Schema Registry contains the schema definitions for topics on the Kafka cluster. It allows records to be produced without including the full schema in each record; instead, each record refers to the schema by its identifier in the registry. Schema Distribution is needed to make sure that the schema registries on all clusters use the same id for the same schema, so that a consumer can read a record and find the correct schema even if the producer produced it on a different Kafka cluster.
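To illustrate why the ids must match, the following is a minimal sketch of how a consumer resolves a schema from a record. It assumes the standard Confluent wire format (a zero magic byte followed by a 4-byte big-endian schema id). The registries are modeled as plain dictionaries, and the names `split_confluent_record`, `primary_registry` and `remote_registry` are hypothetical and only serve the illustration.

```python
import struct
from typing import Dict, Tuple

MAGIC_BYTE = 0  # first byte of a record in the Confluent wire format


def split_confluent_record(payload: bytes) -> Tuple[int, bytes]:
    """Return (schema_id, encoded_body) for a Confluent-framed record."""
    if len(payload) < 5:
        raise ValueError("payload too short to contain a schema id")
    # ">bI" = big-endian, 1 signed byte (magic) + 4-byte unsigned int (id)
    magic, schema_id = struct.unpack(">bI", payload[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, payload[5:]


# Hypothetical registries, modeled as {schema_id: schema_definition}.
primary_registry: Dict[int, str] = {42: '{"type": "string"}'}
# After distribution the remote registry holds the same id for the same schema.
remote_registry: Dict[int, str] = dict(primary_registry)

# A record produced on the primary cluster: the header refers to schema id 42.
record = struct.pack(">bI", MAGIC_BYTE, 42) + b"\x06abc"

schema_id, body = split_confluent_record(record)
# A consumer on the remote cluster looks up the id in its local registry.
print(schema_id, remote_registry[schema_id])
```

If the remote registry had assigned id 42 to a different schema, the lookup would still succeed but return the wrong definition, which is the situation Schema Distribution is meant to prevent.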
The Schema Distributor does this by reading the schemas topic used by the Confluent Schema Registry and distributing its contents to the schemas topics on all other Kafka clusters. This introduces the requirement that schemas are registered on a single cluster only, and that the schema registries on the other clusters are configured as read-only registries. The Kafka cluster with registration enabled is called the primary cluster.
Schema Distribution is started on the Distributor connected to the primary cluster, and distributes the contents of the schemas topic to the remote clusters.
The subjects in the schema registry are bound to the topic name, often in the form of <topic-name>-key and <topic-name>-value for the key and value schemas. When clusters have different topic naming patterns, the Schema Distributor transforms the subject names to match the naming pattern of the target cluster.
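The subject transformation can be sketched roughly as follows. This is not the Schema Distributor's actual implementation: the pattern syntax (`{tenant}-{environment}-{topic}`), the context values and all function names are assumptions made up for this example. The idea is to strip the `-key` or `-value` suffix, recover the logical topic name using the source cluster's pattern, and render it again with the target cluster's pattern.

```python
from typing import Dict, Tuple

# Hypothetical cluster descriptions: a naming pattern plus the fixed values
# that fill every placeholder except {topic}.
SOURCE = {
    "pattern": "{tenant}-{environment}-{topic}",
    "context": {"tenant": "acme", "environment": "prod"},
}
TARGET = {
    "pattern": "{environment}.{topic}",
    "context": {"environment": "prod"},
}


def split_subject(subject: str) -> Tuple[str, str]:
    """Split a subject into (technical_topic_name, '-key' | '-value' | '')."""
    for suffix in ("-key", "-value"):
        if subject.endswith(suffix):
            return subject[: -len(suffix)], suffix
    return subject, ""


def to_logical_topic(technical: str, pattern: str, context: Dict[str, str]) -> str:
    """Recover the logical topic name from a technical topic name."""
    # Render the pattern with a sentinel to find the text around {topic}.
    prefix, suffix = pattern.format(topic="\0", **context).split("\0")
    if not (technical.startswith(prefix) and technical.endswith(suffix)):
        raise ValueError(f"{technical!r} does not match pattern {pattern!r}")
    return technical[len(prefix) : len(technical) - len(suffix)]


def transform_subject(subject: str, source: dict, target: dict) -> str:
    """Rewrite a subject from the source naming pattern to the target one."""
    technical, kind = split_subject(subject)
    topic = to_logical_topic(technical, source["pattern"], source["context"])
    return target["pattern"].format(topic=topic, **target["context"]) + kind


print(transform_subject("acme-prod-payments-value", SOURCE, TARGET))
# prints "prod.payments-value"
```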
Schema Distribution with Apicurio
Apicurio does not keep its schemas in a simple Kafka topic that can be copied between clusters. So Apicurio schemas cannot be replicated through Kafka, and the Schema Distributor cannot be used with Apicurio.
For a multi-cluster setup with Apicurio, use one shared Apicurio instance that every Kafka cluster can reach. All clusters then use the same schemas with the same schema ids, so no schema distribution is needed. This is the only supported way to share schemas across clusters with Apicurio today.
What you must do: give every cluster that uses this shared Apicurio instance the same topic naming pattern.
Why: Apicurio saves each schema under a name called the subject. Axual builds the subject from the real (technical) Kafka topic name, so the same topic must get the same technical name on every cluster. There is no Schema Distributor here to change names between clusters, so if the patterns are different, one topic gets a different subject on each cluster, and reading or registering schemas fails. This rule is only about sharing schemas. Message distribution still works when clusters use different patterns, because each record carries its schema id and the consumer finds the schema by that id.
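A small sketch can make the subject mismatch concrete. The pattern syntax, the cluster names and the helper functions below are hypothetical. The sketch only assumes what is stated above: the subject is derived from the technical topic name, with a `-key` or `-value` suffix.

```python
from typing import Dict


def subject_for(topic: str, pattern: str, part: str = "value") -> str:
    """Build the subject from the technical topic name (assumed pattern syntax)."""
    return f"{pattern.format(topic=topic)}-{part}"


def subjects_per_cluster(topic: str, patterns: Dict[str, str]) -> Dict[str, str]:
    """Map each cluster name to the subject it would use for the topic."""
    return {cluster: subject_for(topic, p) for cluster, p in patterns.items()}


def shares_one_subject(topic: str, patterns: Dict[str, str]) -> bool:
    """True when every cluster resolves the topic to the same subject."""
    return len(set(subjects_per_cluster(topic, patterns).values())) == 1


# Hypothetical clusters that all point at one shared Apicurio instance.
same_patterns = {"cluster-a": "acme-prod-{topic}", "cluster-b": "acme-prod-{topic}"}
mixed_patterns = {"cluster-a": "acme-prod-{topic}", "cluster-b": "prod.{topic}"}

print(shares_one_subject("payments", same_patterns))   # True
print(shares_one_subject("payments", mixed_patterns))  # False
print(subjects_per_cluster("payments", mixed_patterns))
```

With mixed patterns, the two clusters look for the schema of the same topic under two different subjects in the shared registry, so a schema registered through one cluster is not found under the subject the other cluster expects.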
Do not enable schemaDistributor in the Distributor Helm chart when you use Apicurio. It would only create an unused schemas topic and access control entries, and would not distribute any schemas.
Enabling Schema Distributor with Distributor Helm Charts
See the Deploying Distributor page for instructions on deploying the Schema Distributor.