Kafka 4 upgrade

Introduction

Once the prerequisites for the upgrade are met (no unsupported Kafka configs, no old Kafka clients), the upgrade can start. It is done in two steps: Step 1 moves Kafka to 4.0.0 on the current Strimzi operator, and Step 2 upgrades the Strimzi operator and moves Kafka to 4.1.0.

Initial situation

This runbook assumes the following component versions and state:

  1. Strimzi Cluster Operator version 0.47.0

  2. Kafka version 3.9.1

  3. The Kafka cluster is in KRaft mode
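The starting point can be checked before any change is made. The following is a minimal sketch, not part of the original runbook: the `kafka` namespace is a placeholder, the cluster name `my-cluster` is taken from the command used later in this runbook, and it assumes the Kafka resource exposes the `.status.operatorLastSuccessfulVersion` field. KRaft mode is not checked here.

```shell
# Placeholders: adjust to the actual cluster name and namespace.
CLUSTER="${CLUSTER:-my-cluster}"
NAMESPACE="${NAMESPACE:-kafka}"

# Compare an observed value with the expected one and report the result.
expect_version() {
  local what="$1"
  local expected="$2"
  local actual="$3"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $what is $actual"
  else
    echo "MISMATCH: $what is '$actual', expected '$expected'"
    return 1
  fi
}

# Read a single field from the Kafka custom resource.
kafka_field() {
  kubectl get kafka "$CLUSTER" -n "$NAMESPACE" -o jsonpath="$1"
}

# Check the Kafka version and the operator version this runbook starts from.
check_initial_state() {
  expect_version "Kafka version" "3.9.1" "$(kafka_field '{.spec.kafka.version}')" &&
    expect_version "operator version" "0.47.0" "$(kafka_field '{.status.operatorLastSuccessfulVersion}')"
}
```

Calling `check_initial_state` against the cluster prints one line per check and returns non-zero on the first mismatch.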

Upgrade steps

Step 1

  1. Configuration changes

    1. Bump the Kafka version from 3.9.1 to 4.0.0 in the Kafka configuration.

    2. If an external logging configuration is used, convert it from log4j to log4j2 format. See the partial values.yaml snapshot below (or the axual-kafka charts) for an example.

      Example log4j2 external configuration (from values.yaml):
        logging:
          type: external
          externalConfig: |-
            # Root logger configuration
            rootLogger.level = INFO
            rootLogger.appenderRefs = console
            rootLogger.appenderRef.console.ref = CONSOLE
      
            # Console appender
            appender.console.type = Console
            appender.console.name = CONSOLE
            appender.console.target = SYSTEM_OUT
      
            # Console layout configuration
            # appender.console.layout.type = PatternLayout
            # appender.console.layout.pattern = [%d] %p %m (%c)%n
            appender.console.layout.type = JsonTemplateLayout
            appender.console.layout.eventTemplateUri = classpath:LogstashJsonEventLayoutV1.json
      
            # Logger configurations
            logger.kafka_controller.name = kafka.controller
            logger.kafka_controller.level = TRACE
      
            logger.kafka_network_processor.name = kafka.network.Processor
            logger.kafka_network_processor.level = FATAL
      
            logger.kafka_request_channel.name = kafka.network.RequestChannel$
            logger.kafka_request_channel.level = WARN
      
            logger.kafka_common_selector.name = org.apache.kafka.common.network.Selector
            logger.kafka_common_selector.level = WARN
      
            logger.kafka_request_logger.name = kafka.request.logger
            logger.kafka_request_logger.level = WARN
      
            logger.kafka_apis.name = kafka.server.KafkaApis
            logger.kafka_apis.level = FATAL
      
            logger.state_change.name = state.change.logger
            logger.state_change.level = TRACE
      
            logger.kafka_authorizer.name = kafka.authorizer.logger
            logger.kafka_authorizer.level = WARN
  2. A rolling restart of the Kafka cluster takes place at this point.
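Progress of the rolling restart can be followed by polling the version the operator reports. This is a sketch under assumptions: the `kafka` namespace is a placeholder, and it assumes the Kafka resource status exposes a `kafkaVersion` field that changes once the roll has finished.

```shell
# Placeholders: adjust to the actual cluster name and namespace.
CLUSTER="${CLUSTER:-my-cluster}"
NAMESPACE="${NAMESPACE:-kafka}"

# Print the Kafka version currently reported in the resource status.
reported_kafka_version() {
  kubectl get kafka "$CLUSTER" -n "$NAMESPACE" -o jsonpath='{.status.kafkaVersion}'
}

# Poll until the reported version equals the target, or give up.
# Arguments: target version, number of attempts, delay in seconds.
wait_for_kafka_version() {
  local target="$1"
  local attempts="${2:-60}"
  local delay="${3:-10}"
  local i=1
  local current
  while [ "$i" -le "$attempts" ]; do
    current="$(reported_kafka_version || true)"
    if [ "$current" = "$target" ]; then
      echo "Kafka $target is running"
      return 0
    fi
    echo "attempt $i/$attempts: reported version is '${current:-unknown}'"
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out waiting for Kafka $target"
  return 1
}
```

For example, `wait_for_kafka_version 4.0.0` checks every 10 seconds for up to 10 minutes. The defaults are arbitrary and should be sized to the cluster.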

The upgrade can stop here, with Strimzi Operator version 0.47.0 and Kafka version 4.0.0 as the final state. This applies in particular when the current operator also manages other Kafka clusters that are not on Kafka 4.0.0 yet.

Step 2

This step also upgrades the Strimzi Operator, so it can only be done once all the Kafka clusters managed by the operator are on Kafka 4.0.0.
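One way to confirm this precondition is to list the configured version of every Kafka resource and print the ones that are not on the target version. The sketch below assumes the operator manages Kafka resources in all namespaces. If it watches only specific namespaces, narrow the query accordingly.

```shell
# List every Kafka resource as "namespace/name version", one per line.
list_kafka_versions() {
  kubectl get kafka --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name} {.spec.kafka.version}{"\n"}{end}'
}

# Read "cluster version" lines on stdin and print the clusters
# whose version differs from the target given as the first argument.
clusters_not_on() {
  awk -v target="$1" '$2 != target { print $1 }'
}
```

Running `list_kafka_versions | clusters_not_on 4.0.0` prints nothing when every cluster is ready for Step 2.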

  1. Pause reconciliation of the Kafka clusters

    1. kubectl annotate kafka my-cluster strimzi.io/pause-reconciliation="true"

    2. The Kafka resource becomes “not ready”, which ArgoCD shows as a “progressing” app state.

  2. Update the Strimzi CRDs from version 0.47.0 to 0.48.0

  3. Update the Strimzi operator from version 0.47.0 to 0.48.0

  4. Update the Kafka version from 4.0.0 to 4.1.0

  5. Remove the pause reconciliation annotation from the Kafka resource.

After the annotation is removed, the Kafka cluster performs a rolling update to adopt version 0.48.0 of the Operator and 4.1.0 of Kafka. This completes the upgrade to Kafka 4.
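The pause and resume steps of Step 2 can be wrapped in small helpers, sketched here under assumptions: the `kafka` namespace and the `DRY_RUN` switch are illustrative additions, and `--overwrite` is added so the pause command can be repeated safely. The CRD, operator and Kafka version updates are left out because they depend on how the charts are deployed.

```shell
# Placeholders: adjust to the actual cluster name and namespace.
CLUSTER="${CLUSTER:-my-cluster}"
NAMESPACE="${NAMESPACE:-kafka}"

# With DRY_RUN=1, print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 2.1: stop the operator from reconciling the Kafka resource.
pause_reconciliation() {
  run kubectl annotate kafka "$CLUSTER" -n "$NAMESPACE" --overwrite \
    strimzi.io/pause-reconciliation=true
}

# Step 2.5: a trailing "-" on the annotation key removes the annotation.
resume_reconciliation() {
  run kubectl annotate kafka "$CLUSTER" -n "$NAMESPACE" \
    strimzi.io/pause-reconciliation-
}
```

Setting `DRY_RUN=1` before calling `pause_reconciliation` or `resume_reconciliation` shows the exact kubectl command without touching the cluster.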