Cluster Stack Upgrade

Before you can start upgrading the Cluster Stack, all Self-Service instances must have been upgraded.

This Cluster Stack Upgrade needs to be executed for each Kafka cluster defined in your Axual installation.

Start a standalone Cluster-API

Objective

Create a new values.yaml file that starts a standalone Cluster-API from the axual-helm-charts alongside the existing Kafka cluster.

Execution

To avoid downtime or customer impact, perform this execution outside office hours.

Standalone Cluster-API

  1. Define a new Chart.yaml that declares the platform chart, version 0.17.9, as a dependency.

    Chart.yaml
    apiVersion: v2
    appVersion: "2024.1"
    description: Cluster API with Axual Helm Charts
    name: just-cluster-api
    type: application
    version: 0.1.0
    dependencies:
      - name: "platform"
        version: "0.17.9"
        repository: "oci://registry.axual.io/axual-charts"
  2. Create a new values.yaml, using the existing Cluster’s values.yaml from the Axual Helm Charts as a reference.

    First, disable the components that are not the Cluster-API:

    values.yaml
    global:
      cluster:
        enabled: true
        name: [existing-cluster-name]
    
        strimzi:
          enabled: false
        clusterbrowse:
          enabled: false
    
      instance:
        enabled: false
    
      mgmt:
        enabled: false
  3. In the core.clusterapi section, provide the same configuration used in the existing Cluster’s values.yaml.

    values.yaml
    platform:
      core:
        clusterapi:
          [existing-configuration]

You will not be able to deploy it yet, because its resources still exist in the existing Cluster stack deployment.
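Although you cannot deploy it yet, you can render the new chart locally to confirm that it only produces Cluster-API resources. A possible check with Helm 3, assuming the chart and file names from the example above:

```shell
# Fetch the platform chart dependency declared in Chart.yaml
helm dependency build .

# Render the manifests locally without contacting the cluster;
# only Cluster-API related resources should appear in the output
helm template just-cluster-api . -f values.yaml | grep '^kind:' | sort | uniq -c
```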

Existing Cluster-API

  1. In the existing Cluster stack deployment, disable the Cluster-API

    values.yaml
    global:
      cluster:
        clusterapi:
          enabled: false
  2. Upgrade the existing Cluster stack deployment to disable the Cluster-API

  3. Start the new standalone Cluster-API deployment to replace the disabled Cluster-API

Perform the two steps above in quick succession to minimize the time during which a restart of Discovery-API or Schema-Registry might fail due to the missing Cluster-API.

The Self-Service functionalities are not affected, since all the clusters have been configured to not require the Cluster-API.

Verification

In this step, we are going to verify that the new Cluster-API can access the existing Kafka.

You can verify this by either:

  • Restarting a Discovery-API deployment

  • Restarting a Schema-Registry deployment
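The restarts can be triggered with kubectl; a sketch, where the deployment names and namespace are placeholders for the ones in your installation:

```shell
# Rolling-restart Discovery-API and wait until it is healthy again
kubectl rollout restart deployment/[discovery-api-deployment] -n [existing-namespace]
kubectl rollout status deployment/[discovery-api-deployment] -n [existing-namespace]

# Rolling-restart Schema-Registry and wait until it is healthy again
kubectl rollout restart deployment/[schema-registry-deployment] -n [existing-namespace]
kubectl rollout status deployment/[schema-registry-deployment] -n [existing-namespace]
```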

If all checks are successful, you can proceed to the next steps.

Upgrade Axual Operator to the Strimzi Operator

Objective

Change the Operator in the Kubernetes cluster that deploys the Kafka and Zookeeper pods.

Be sure that the Strimzi Operator version matches the version of your existing Axual Operator.

Execution

Depending on how you deploy the Axual Operator, there are two different ways to perform this upgrade.

Helm Upgrade Command

  1. Download the Strimzi public charts

    helm repo add strimzi https://strimzi.io/charts/
  2. Upgrade the existing Axual Operator with the new values

    helm upgrade --install [existing-release-name] strimzi/strimzi-kafka-operator \
            --version=0.34.0 \
            --namespace [existing-namespace] \
            --set watchAnyNamespace=true \
            --set kafka.image.registry=registry.axual.io \
            --set kafka.image.repository=axual/streaming/strimzi \
            --set image.imagePullSecrets=[existing-docker-secret-name]

Chart.yaml and Values.yaml

  1. Update the existing Chart.yaml to use strimzi-kafka-operator as dependency chart.

    Chart.yaml
    apiVersion: v2
    name: "strimzi-operator"
    type: "application"
    version: "0.34.0"
    appVersion: "2024.1"
    description: Strimzi Operator
    dependencies:
      - name: "strimzi-kafka-operator"
        version: "0.34.0"
        repository: "https://strimzi.io/charts/"
  2. Update the existing values.yaml to pull a different Kafka image from Axual Registry.

    values.yaml
    strimzi-kafka-operator:
      watchAnyNamespace: true
      createGlobalResources: true
      # Adjust to your needs
      resources:
        limits:
          memory: 512Mi
        requests:
          memory: 512Mi
          cpu: 200m
      image:
        imagePullSecrets: [existing-docker-secret-name]
      kafka:
        image:
          registry: "registry.axual.io"
          repository: "axual/streaming/strimzi"

This will cause rolling restarts of Zookeeper and Kafka pods.
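You can follow the rolling restarts with kubectl; a sketch, where the namespace is a placeholder and the name=strimzi-cluster-operator label assumes the default Strimzi chart labels:

```shell
# Watch the Zookeeper and Kafka pods restart one by one;
# the operator rolls them in a controlled order
kubectl get pods -n [existing-namespace] -w

# Confirm the operator pod is running the expected Strimzi image
kubectl get pods -n [existing-namespace] -l name=strimzi-cluster-operator \
  -o jsonpath='{.items[*].spec.containers[*].image}'
```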

Verification

In this step, we are going to verify that the new Zookeeper and Kafka pods deployed with the Strimzi Operator work the same as the old Zookeeper and Kafka pods deployed with the Axual Operator.

You can verify this by either:

  • Confirming that your producer/consumer applications are running fine

  • Logging into the Self-Service and performing a topic deployment

  • Logging into the Self-Service and browsing a topic deployment
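As an additional broker-level check, you can list topics from inside a Kafka pod; a sketch, assuming the Kafka tools are available under /opt/kafka in the Strimzi image and that a plaintext internal listener is reachable on port 9092 (with TLS-only listeners you would need a client configuration instead):

```shell
# List topics directly against a broker to confirm it serves requests
kubectl exec -n [existing-namespace] [kafka-pod-name] -- \
  /opt/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --list
```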

If all checks are successful, you can proceed to the next steps.

Upgrade Kafka

Objective

Redeploy the existing Kafka installation using the axual-streaming-charts.

Execution

Before upgrading the Kafka deployment, check the diffs with ArgoCD, with helm diff upgrade --install, or in whatever way the tool you use to deploy the charts supports.
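For example, with the helm-diff plugin installed, a dry check could look like this (release name, namespace, and paths are placeholders):

```shell
# Resolve the axual-streaming dependency declared in Chart.yaml
helm dependency build .

# Show what would change without applying anything;
# an empty diff means the migration will not touch the running pods
helm diff upgrade --install [existing-release-name] . \
  -f values.yaml \
  --namespace [existing-namespace]
```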

  1. Replace the dependency in the Cluster Chart.yaml file.

    Chart.yaml
    apiVersion: v2
    appVersion: "2024.1"
    description: Kafka Stack with Streaming Charts
    name: [existing-cluster-name]
    type: application
    version: 0.1.0
    dependencies:
      - name: "axual-streaming"
        version: "0.3.3"
        repository: "oci://registry.axual.io/axual-charts"
  2. Copy the existing platform.core.kafka section from the existing values.yaml file so that you can replace some keys.

  3. Disable all components except Kafka

    values.yaml
    global:
    
      rest-proxy:
        enabled: false
    
      apicurio:
        enabled: false
    
      axual-schema-registry:
        enabled: false
  4. Add the Cluster name to the axual-streaming.kafka.fullnameOverride key

    values.yaml
    axual-streaming:
      kafka:
        fullnameOverride: [existing-cluster-name]

    Be sure to use the correct cluster name to avoid any Kafka restart

  5. Add the Internal Listeners Configuration to the axual-streaming.kafka.kafka key

    values.yaml
    axual-streaming:
      kafka:
        kafka:
          # Kafka internal listener configuration
          internalListenerTlsEnabled: "true"
          internalListenerAuthenticationType: tls
  6. Replace platform.core.kafka key with axual-streaming.kafka key

    existing_values.yaml
    platform:
      core:
        kafka:
          [existing-content]
    new_values.yaml
    axual-streaming:
      kafka:
        [existing-content]
  7. Replace global.cluster.kafka.nodes key with axual-streaming.kafka.kafka.replicas key

    existing_values.yaml
    global:
      cluster:
        kafka:
          nodes: [existing-value]
    new_values.yaml
    axual-streaming:
      kafka:
        kafka:
          replicas: [existing-value]
  8. Replace global.cluster.zookeeper.nodes key with axual-streaming.kafka.zookeeper.replicas key

    existing_values.yaml
    global:
      cluster:
        zookeeper:
          nodes: [existing-value]
    new_values.yaml
    axual-streaming:
      kafka:
        zookeeper:
          replicas: [existing-value]
  9. Replace platform.core.kafka.kafka.rackEnabled key with axual-streaming.kafka.kafka.rack.enabled key

    existing_values.yaml
    platform:
      core:
        kafka:
          kafka:
            rackEnabled: [existing-value]
    new_values.yaml
    axual-streaming:
      kafka:
        kafka:
          rack:
            enabled: [existing-value]
  10. Replace platform.core.kafka.kafka.rackTopologyKey key with axual-streaming.kafka.kafka.rack.topologyKey key

    existing_values.yaml
    platform:
      core:
        kafka:
          kafka:
            rackTopologyKey: [existing-value]
    new_values.yaml
    axual-streaming:
      kafka:
        kafka:
          rack:
            topologyKey: [existing-value]
  11. Replace platform.core.kafka.kafka.rackEnabled key with axual-streaming.kafka.zookeeper.rack.enabled key

    existing_values.yaml
    platform:
      core:
        kafka:
          kafka:
            rackEnabled: [existing-value]
    new_values.yaml
    axual-streaming:
      kafka:
        zookeeper:
          rack:
            enabled: [existing-value]
  12. Replace platform.core.kafka.kafka.rackTopologyKey key with axual-streaming.kafka.zookeeper.rack.topologyKey key

    existing_values.yaml
    platform:
      core:
        kafka:
          kafka:
            rackTopologyKey: [existing-value]
    new_values.yaml
    axual-streaming:
      kafka:
        zookeeper:
          rack:
            topologyKey: [existing-value]
  13. Replace platform.core.kafka.kafka.superUsers key with axual-streaming.kafka.kafka.authorization.superUsers key

    existing_values.yaml
    platform:
      core:
        kafka:
          kafka:
            superUsers:
              [existing-value]
    new_values.yaml
    axual-streaming:
      kafka:
        kafka:
          authorization:
            superUsers:
              [existing-value]
  14. In the Kafka security section, provide the correct values for clientsCaCertGeneration, clientsCaGeneration, clusterCaCertGeneration, clusterCaGeneration

    values.yaml
    axual-streaming:
      kafka:
        kafka:
          security:
            clientsCaCertGeneration: "0"
            clientsCaGeneration: "0"
            clusterCaCertGeneration: "0"
            clusterCaGeneration: "0"
  15. If the configuration fully matches, there will be no restart of the cluster. If there are differences, keep adapting the values.yaml until all the diffs are gone.
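The generation values from step 14 can typically be read from the annotations on the existing Strimzi CA secrets; a sketch, assuming the standard strimzi.io annotations and the default <cluster-name>-cluster-ca / <cluster-name>-clients-ca secret naming:

```shell
# Inspect the CA secret annotations; the strimzi.io/ca-cert-generation
# values should match the ones you set in values.yaml
kubectl get secret [existing-cluster-name]-cluster-ca-cert -n [existing-namespace] \
  -o jsonpath='{.metadata.annotations}'
kubectl get secret [existing-cluster-name]-clients-ca-cert -n [existing-namespace] \
  -o jsonpath='{.metadata.annotations}'
```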

Verification

In this step, we are going to verify that the existing Zookeeper and Kafka pods deployed with Strimzi Operator and Streaming Charts work as expected.

You can verify this by either:

  • Confirming that your producer/consumer applications are running fine

  • Logging into the Self-Service and performing a topic deployment

  • Logging into the Self-Service and browsing a topic deployment

If all checks are successful, you can proceed to the next steps.