Performing the upgrade
Typical upgrade steps
Many of the upgrade steps can be performed without impact on your users. The deployment or upgrade of a component is split into two actions:
- Configuration changes, such as added or changed configuration parameters, including the new component’s version
- Deployment of the upgraded component, by (re)starting it
Most of the time the configuration changes can be made in advance, which limits downtime for your end users.
In the following upgrade steps, platform-config refers to the location where your platform configuration is stored for that particular environment.
Verifying every step of the way
When performing the upgrade, we strongly advise you to verify that things are working at every step. It is pointless to continue the upgrade if one of the services fails to start halfway through. The following tips apply to every service when performing (re)starts after an upgrade:
- Check whether the new Docker image version has been pulled successfully
- Check whether the container actually starts, stays up for more than 30 seconds, and is not in "Restarting" mode
There are also verification steps that depend on the service being upgraded. Those steps can be found in the upgrade docs themselves.
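The two generic checks above can be scripted. The following is a minimal sketch, assuming the Docker CLI is available on the node and that GNU date is installed for parsing timestamps. The function names are made up for illustration and are not part of axual.sh.

```shell
# Succeeds when a container state counts as healthy for the tips above:
# state "running" (so not "restarting") and up for more than 30 seconds.
container_is_stable() {
  local state="$1" uptime_seconds="$2"
  [ "$state" = "running" ] && [ "$uptime_seconds" -gt 30 ]
}

# Succeeds when the given image (for example axual/broker:6.0.0) is present locally.
image_is_pulled() {
  docker image inspect "$1" >/dev/null 2>&1
}

# Looks up the state and start time of a container and applies the check.
# Assumption: GNU date can parse the timestamp printed by docker inspect.
check_container() {
  local name="$1" state started_at started_epoch now_epoch
  state="$(docker inspect -f '{{.State.Status}}' "$name")" || return 1
  started_at="$(docker inspect -f '{{.State.StartedAt}}' "$name")" || return 1
  started_epoch="$(date -d "$started_at" +%s)" || return 1
  now_epoch="$(date +%s)"
  if container_is_stable "$state" "$((now_epoch - started_epoch))"; then
    echo "$name: running for more than 30 seconds"
  else
    echo "$name: not stable (state: $state)"
    return 1
  fi
}
```

After a restart, calling check_container with the container name, a little over 30 seconds after the start, gives a quick pass or fail signal before moving on to the next step.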
Performing the upgrade
Step 1 - Upgrade broker to 6.0.0
Make sure the default version in the configuration is not overridden. If it is, remove the setting or comment it out as shown below.
platform-config/clusters/{cluster-name}/broker.sh
# Docker image version, only override if you explicitly want to use a different version
#BROKER_VERSION='[OLDER VERSION]'
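Whether a version override is still active can be checked with a small helper before restarting. This is a sketch only: the function name is hypothetical, and it assumes the configuration files use plain shell assignments like the one shown above.

```shell
# Succeeds when FILE contains an active (uncommented) assignment of VAR,
# which means the default version is being overridden.
has_version_override() {
  local file="$1" var="$2"
  grep -Eq "^[[:space:]]*(export[[:space:]]+)?${var}=" "$file"
}
```

For example, `has_version_override platform-config/clusters/my-cluster/broker.sh BROKER_VERSION && echo "override found"` reports an active override, where my-cluster is a placeholder cluster name. The same helper works for the other version variables used in the later steps.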
This upgrade must be done broker by broker, in a rolling fashion.
Restart the broker; enter the following command:
axual.sh -v restart cluster broker
Wait for the broker to restart and for the under-replicated partition count on the Broker stats per node dashboard to go back to zero.
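If you prefer to script the wait between brokers instead of watching the dashboard, a generic polling helper is enough. The sketch below deliberately leaves open how the under-replicated partition count is obtained: the command that prints the count is supplied by the caller and is an assumption, since this guide only reads the value from the dashboard.

```shell
# Polls COMMAND until it prints exactly "0", at most ATTEMPTS times,
# sleeping PAUSE seconds between attempts. Fails if zero is never reached.
# Usage: wait_until_zero ATTEMPTS PAUSE COMMAND [ARGS...]
wait_until_zero() {
  local attempts="$1" pause="$2" i=1 value
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # A failing command is treated the same as a non-zero count.
    value="$("$@")" || value=""
    if [ "$value" = "0" ]; then
      return 0
    fi
    sleep "$pause"
    i=$((i + 1))
  done
  return 1
}
```

A rolling restart could then run `wait_until_zero 60 10 my_urp_count` after each broker restart, where my_urp_count is a hypothetical command of your own that prints the current under-replicated partition count.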
The following output is expected (for a 3 cluster broker):
---
Axual Platform 2021.1
Loading cluster definition: [Cluster A]
PREFIX for loading cluster definition is: CLUSTER_[Tenant]_[Cluster A]_
Loading cluster definition: [Cluster B]
PREFIX for loading cluster definition is: CLUSTER_[Tenant]_[Cluster B]_
Loading cluster definition: [Cluster C]
PREFIX for loading cluster definition is: CLUSTER_[Tenant]_[Cluster C]_
Analyzing cluster [Cluster B]-inter-broker-listener
Analyzing cluster [Cluster B]-mgmt-api-db
Analyzing cluster [Cluster B]
Analyzing cluster [Cluster A]-inter-broker-listener
Analyzing cluster [Cluster A]
Analyzing cluster [Cluster C]-inter-broker-listener
Analyzing cluster [Cluster C]
Hostname: '[Worker Hostname]'. Using configuration for cluster '[Cluster C]', node ID: '1'
Loading tenant definition: axual
Loading tenant definition: [Tenant]
Loading instance definition: [Tenant]-[Instance A]
Loading instance definition: [Tenant]-[Instance B]
Stopping cluster services for node [Worker Hostname] in cluster [Cluster C]
...
Stopped
Loading cluster definition: [Cluster A]
PREFIX for loading cluster definition is: CLUSTER_[Tenant]_[Cluster A]_
Loading cluster definition: [Cluster B]
PREFIX for loading cluster definition is: CLUSTER_[Tenant]_[Cluster B]_
Loading cluster definition: [Cluster C]
PREFIX for loading cluster definition is: CLUSTER_[Tenant]_[Cluster C]_
Analyzing cluster [Cluster B]-inter-broker-listener
Analyzing cluster [Cluster B]-mgmt-api-db
Analyzing cluster [Cluster B]
Analyzing cluster [Cluster A]-inter-broker-listener
Analyzing cluster [Cluster A]
Analyzing cluster [Cluster C]-inter-broker-listener
Analyzing cluster [Cluster C]
Hostname: '[Worker Hostname]'. Using configuration for cluster '[Cluster C]', node ID: '1'
Loading tenant definition: axual
Loading tenant definition: [Tenant]
Loading instance definition: [Tenant]-[Instance A]
Loading instance definition: [Tenant]-[Instance B]
Configuring cluster services for node [Worker Hostname] in cluster [Cluster C]
...
Preparing broker: Done
Param 1: run
Param 2: broker
Param 3: axual/broker:6.0.0
Param 4: Starting broker
Param 5: Done
Param 6: -d --tty=false --restart=always -v /appl/kafka/config/broker:/config/broker ...
Param 7:
...
Done
---
Step 2 - Upgrade Instance API to 3.0.1
Make sure the default version in the configuration is not overridden. If it is, remove the setting or comment it out as shown below.
platform-config/tenants/{tenant-name}/instances/{instance-name}/instance-api.sh
# Docker image version, only override if you explicitly want to use a different version
#INSTANCEAPI_VERSION='[OLDER_VERSION]'
Restart the service
axual.sh restart instance <instance-name> instance-api
The following output is expected:
Axual Platform 2021.1
Stopping instance services for [INSTANCE NAME] in cluster [CLUSTER NAME]
Stopping [INSTANCE NAME]-instance-api: Stopped
Done, cluster-api is available
Deploying topic _[INSTANCE NAME]-schemas: Done
Deploying topic _[INSTANCE NAME]-consumer-timestamps: Done
Done, cluster-api is available
Done, cluster-api is available
Applying ACLs : {...}
Done
Done, cluster-api is available
Applying ACLs : {...}
Done
Done, cluster-api is available
Applying ACLs : {...}
Done
Configuring instance services for [INSTANCE NAME]-[CLUSTER NAME] in cluster [CLUSTER NAME]
Preparing [INSTANCE NAME]-instance-api: Done
Cluster servers are https://[CLUSTER ENDPOINT]:9080
Starting [INSTANCE NAME]-instance-api: Done
Step 3 - Upgrade Cluster Browse to 1.1.1
Make sure the default version in the configuration is not overridden. If it is, remove the setting or comment it out as shown below.
platform-config/clusters/{cluster-name}/cluster-browse.sh
# Docker image version, only override if you explicitly want to use a different version
#CLUSTER_BROWSE_VERSION='[OLDER_VERSION]'
Restart the service
axual.sh restart cluster cluster-browse
The following output is expected:
Axual Platform 2021.1
Stopping cluster services for node [NODE NAME] in cluster [CLUSTER NAME]
Stopping cluster-browse: Stopped
Configuring cluster services for node [NODE NAME] in cluster [CLUSTER NAME]
Preparing cluster-browse: Done
Preparing acls:
Done, cluster-api is available
Applying ACLs : {...}
Done
Starting cluster-browse: Done
Step 4 - Upgrading to Schema-Registry 5.0.4
In the below steps, we are going to set up Schema-Registry to use the 5.0.4 version.
Step 4a - Configuring Schema-Registry
Make sure the default version in the configuration is not overridden. If it is, remove the following line or comment it out as shown below.
platform-config/tenants/{tenant-name}/instances/{instance-name}/schemaregistry.sh
# Docker image version, only override if you explicitly want to use a different version
#SCHEMAREGISTRY_VERSION="[OLDER VERSION]"
Step 4b - Restarting Schema-Registry slave
- Run the following command for each instance where the Schema-Registry slave is running:
axual.sh restart instance <instance-name> sr-slave
The following output is expected:
Axual Platform 2021.1
Stopping instance services for [Instance Name] in cluster [Cluster Name]
Stopping [Instance Name]-sr-slave: Stopped
Done, cluster-api is available
Deploying topic _[Instance Name]-schemas: Done
Deploying topic _[Instance Name]-consumer-timestamps: Done
Done, cluster-api is available
Done, cluster-api is available
Applying ACLs : {...}
Done
...
Applying ACLs : {...}
Done
Configuring instance services for [Instance Name] in cluster [Cluster Name]
Preparing [Instance Name]-sr-slave: Done
Starting [Instance Name]-sr-slave: Done
- Verification after the Schema-Registry slave restart. Do not continue until all of the following criteria are met:
  - Check the docker logs and make sure there are no errors and the service is up.
docker logs -f <instance-name>-sr-slave
Step 4c - Restarting Schema-Registry master
- Run the following command for each instance where the Schema-Registry master is running:
axual.sh restart instance <instance-name> sr-master
The following output is expected:
Axual Platform 2021.1
Stopping instance services for [Instance Name] in cluster [Cluster Name]
Stopping [Instance Name]-sr-master: Stopped
Done, cluster-api is available
Deploying topic _[Instance Name]-schemas: Done
Deploying topic _[Instance Name]-consumer-timestamps: Done
Done, cluster-api is available
Done, cluster-api is available
Applying ACLs :...
Done
...
Applying ACLs :...
Done
Configuring instance services for [Instance Name] in cluster [Cluster Name]
Preparing [Instance Name]-sr-master: Done
Starting [Instance Name]-sr-master: Done
- Verification after the Schema-Registry master restart. Do not continue until all of the following criteria are met:
  - Check the docker logs and make sure there are no errors and the service is up.
docker logs -f <instance-name>-sr-master
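The log checks for the Schema-Registry slave and master can be made less manual with a simple scan of recent log lines. This is a rough sketch: matching on "ERROR" or "Exception" is an assumption about the log format rather than something this guide specifies, and the function names are hypothetical.

```shell
# Counts lines on stdin that look like errors. The pattern is an assumption
# about the log format, so adjust it to what your services actually log.
count_error_lines() {
  grep -Ec 'ERROR|Exception' || true
}

# Scans the last 10 minutes of a container's logs; succeeds only when
# no line matches the error pattern.
logs_are_clean() {
  local name="$1" errors
  errors="$(docker logs --since 10m "$name" 2>&1 | count_error_lines)"
  [ "$errors" -eq 0 ]
}
```

For example, `logs_are_clean <instance-name>-sr-slave || echo "check the logs"` flags a restart that needs a closer look. A clean scan does not replace reading the logs; it only catches the obvious failures.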
Step 5 - Upgrade Discovery API to 2.3.2
While Discovery API is not available, client reconfiguration will not happen.
Make sure the default version in the configuration is not overridden. If it is, remove the setting or comment it out as shown below.
platform-config/tenants/{tenant-name}/instances/{instance-name}/discovery-api.sh
# Docker image version, only override if you explicitly want to use a different version
#DISCOVERYAPI_VERSION='[OLDER_VERSION]'
Restart the service.
axual.sh restart instance <instance-name> discovery-api
The following output is expected:
Axual Platform 2021.1
Stopping instance services for [Instance Name] in cluster [Cluster Name]
Stopping [Instance Name]-discovery-api: Stopped
Done, cluster-api is available
Deploying topic _[Instance Name]-schemas: Done
Deploying topic _[Instance Name]-consumer-timestamps: Done
Done, cluster-api is available
Done, cluster-api is available
Applying ACLs : {...}
Done
...
Done, cluster-api is available
Applying ACLs : {...}
Done
Configuring instance services for [Instance Name] in cluster [Cluster Name]
Running copy-config-[Instance Name]-discovery-api: Done
Preparing [Instance Name]-discovery-api: Done
Starting [Instance Name]-discovery-api: Done
Step 6 - Upgrading to Distributor 4.0.1
Due to incompatibilities in the internal Connect version, this upgrade cannot be performed in a rolling fashion. The following steps need to be performed on each cluster where you do the upgrade.
Make sure the default distributor version in the configuration is not overridden. If it is, remove the following line or comment it out as shown below.
platform-config/clusters/{cluster-name}/distributor.sh
# Docker image version, only override if you explicitly want to use a different version
#DISTRIBUTOR_VERSION='[OLDER_VERSION]'
Step 6a - Take the cluster out of distribution
- Move applications to other clusters
For each of your instances, issue the command:
axual.sh instance <instance-name> set status app off
The following output is expected:
Instance <tenant>-<instance> on cluster <tenant>-<cluster> has received state change event DISABLE_APPLICATIONS
Wait until the message rates on your clusters have stabilized and are roughly equal.
- Disable offset distribution
For each of your instances, issue the command:
axual.sh instance <instance-name> set status offset off
The following output is expected:
Instance <tenant>-<instance> on cluster <tenant>-<cluster> has received state change event DISABLE_OFFSETS
Wait until the incoming load on topic _<tenant>-<instance>-consumer-timestamps has settled. See the Distributor Overview dashboard.
- Disable message distribution
For each of your instances, issue the command:
axual.sh instance <instance-name> set status data off
The following output is expected:
Instance <tenant>-<instance> on cluster <tenant>-<cluster> has received state change event DISABLE_DATA
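With several instances it is easy to lose track of the order in Step 6a. The sketch below only prints the commands in the order described above (applications first, then offsets, then data, each for every instance) so they can be reviewed and run one phase at a time, with the waits in between. The function name and the instance names in the usage example are hypothetical.

```shell
# Prints the Step 6a commands in order: "app" for every instance,
# then "offset", then "data". Nothing is executed.
plan_disable_distribution() {
  local kind instance
  for kind in app offset data; do
    for instance in "$@"; do
      echo "axual.sh instance $instance set status $kind off"
    done
  done
}
```

Calling `plan_disable_distribution instance-a instance-b` prints six commands. Printing instead of executing is a deliberate choice here, because each phase has to wait for the dashboards to settle before the next one starts.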
Step 6b - Restart connectors
For each of your instances, issue the command:
axual.sh restart instance <instance-name> distribution
The following output is expected:
Stopping instance services for <instance-name> in cluster <cluster-name>
...
The above command needs to be run from the first distribution node.
Step 6c - Stop and start distributors
On each node in your distributor cluster, issue the command:
axual.sh stop cluster distributor
The following output is expected:
Axual Platform 2021.1
Stopping cluster services for node [NODE-NAME] in cluster [CLUSTER-NAME]
Stopping distributor: Stopped
After the distributors have stopped, start them again. On each node in your distributor cluster, issue the command:
axual.sh start cluster distributor
The following output is expected:
Axual Platform 2021.1
Configuring cluster services for node [NODE-NAME] in cluster [CLUSTER-NAME]
Done, cluster-api is available
Preparing distributor security: Done
Preparing distributor topics: Deploying topic _distributor-config: Done
Deploying topic _distributor-offset: Done
Deploying topic _distributor-status: Done
Done
Starting distributor: Done
Step 6d - Check the connector status
- Check Grafana and verify that all connectors have RUNNING status (on the Distributor Overview dashboard)
- Check if a task has been assigned to the distributor connector: visit
The response should be a JSON payload looking like the following:
{
  "name": "[Connector NAME]",
  "connector": {
    "state": "RUNNING",
    "worker_id": "[NODE]:8083"
  },
  "tasks": [
    { "id": 0, "state": "RUNNING", "worker_id": "[NODE]:8083" },
    { "id": 1, "state": "RUNNING", "worker_id": "[NODE]:8083" }
  ],
  "type": "sink"
}
If the tasks are not distributed over all the workers, restart the missing worker.
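Once the status payload has been saved to a file or piped from your HTTP client, the RUNNING check can be automated. The following is a crude, text-based sketch that assumes the payload has the shape shown above; a JSON-aware tool would be more robust. The function name is hypothetical.

```shell
# Reads a connector status payload on stdin and succeeds only when every
# "state" field (connector and tasks) is RUNNING. Fails when no state
# field is found at all.
all_states_running() {
  local states
  states="$(grep -o '"state"[[:space:]]*:[[:space:]]*"[A-Z_]*"')" || return 1
  if printf '%s\n' "$states" | grep -qv 'RUNNING'; then
    return 1
  fi
  return 0
}
```

For example, `all_states_running < status.json || echo "connector or task not running"`, where status.json is a placeholder for wherever you stored the response.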
Step 6e - Allow connections to the upgraded cluster
The following steps will trigger the Discovery API to allow the upgraded cluster to be the active cluster for your instances.
- Enable message distribution
Issue the following command:
axual.sh cluster set status data on
The following output is expected:
Instance <tenant>-<instance> on cluster <tenant>-<cluster> has received state change event ENABLE_DATA ...
Wait for the message rates to stabilize; they should be roughly equal across clusters.
- Enable offset distribution
Issue the following command:
axual.sh cluster set status offset on
The following output is expected:
Instance <tenant>-<instance> on cluster <tenant>-<cluster> has received state change event ENABLE_OFFSETS ...
Wait until the incoming load on topic _<tenant>-<instance>-consumer-timestamps has settled; see the Distributor Overview dashboard.
- Switch on applications
Enter the following command:
axual.sh cluster set status app on
The following output is expected:
Instance <tenant>-<instance> on cluster <tenant>-<cluster> has received state change event ENABLE_APPLICATIONS ...
Step 7 - Upgrade Cluster API to 1.7.2
While the Cluster API is not available, topic apply is not possible.
Cluster API 1.7.2 is incompatible with Instance API 2.1.0. Make sure all your instances are running at least Instance API 3.0.1.
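The minimum-version requirement can be checked with a small comparison helper before restarting Cluster API. This sketch relies on `sort -V` from GNU coreutils; the function names are hypothetical, and how you obtain the running Instance API version (for example from the tag of the running container's image) depends on your deployment.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_at_least() {
  local actual="$1" required="$2" lowest
  lowest="$(printf '%s\n%s\n' "$actual" "$required" | sort -V | head -n 1)"
  [ "$lowest" = "$required" ]
}

# Extracts the tag from an image reference such as "axual/broker:6.0.0".
# Returns the input unchanged when it contains no colon.
image_tag() {
  echo "${1##*:}"
}
```

For example, `version_at_least "$(image_tag "$image")" 3.0.1 || echo "upgrade Instance API first"`, where `$image` could come from `docker inspect -f '{{.Config.Image}}'` on the Instance API container. The container name to inspect is an assumption to verify on your nodes.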
Make sure the default version in the configuration is not overridden. If it is, remove the setting or comment it out as shown below.
platform-config/clusters/{cluster-name}/cluster-api.sh
# Docker image version, only override if you explicitly want to use a different version
#CLUSTERAPI_VERSION='[OLDER VERSION]'
Restart the service:
axual.sh restart cluster cluster-api
Step 8 - Upgrade Management API to 6.0.0
Management API 6.0.0 is incompatible with Instance API 2.1.0. Make sure all your instances are running at least Instance API 3.0.1.
Make sure the default version in the configuration is not overridden. If it is, remove the setting or comment it out as shown below.
platform-config/clusters/{cluster-name}/mgmt-api.sh
# Docker image version, only override if you explicitly want to use a different version
#MGMT_API_VERSION='[OLDER_VERSION]'
Restart the service
axual.sh restart mgmt mgmt-api
The following output is expected:
Axual Platform 2021.1
Stopping mgmt services for node [NODE NAME] in cluster [CLUSTER NAME]
Stopping mgmt-api: Stopped
Configuring mgmt services for node [NODE NAME] in cluster [CLUSTER NAME]
Testing DB connection
Connection successful
Preparing mgmt-api: Done
Starting mgmt-api: Done
After the restart, log in to Management UI and verify that the upgrade was successful. Hover over "Axual X" in the top right to see the versions of Management API and Management UI. These should be 6.0.0 and 5.7.2 respectively.
Step 9 - Upgrade Management UI to 5.10.0
Make sure the default version in the configuration is not overridden. If it is, remove the setting or comment it out as shown below.
platform-config/clusters/{cluster-name}/mgmt-ui.sh
# Docker image version, only override if you explicitly want to use a different version
#MGMT_UI_VERSION='[OLDER_VERSION]'
Restart the service
axual.sh restart mgmt mgmt-ui
The following output is expected:
Axual Platform 2021.1
Stopping mgmt services for node [NODE NAME] in cluster [CLUSTER NAME]
Stopping mgmt-ui: Stopped
Configuring mgmt services for node [NODE NAME] in cluster [CLUSTER NAME]
Starting mgmt-ui: Done
After the restart, log in to Management UI and verify that the upgrade was successful. Hover over "Axual X" in the top right to see the versions of Management API and Management UI. These should be 6.0.0 and 5.10.0 respectively.
Step 10 - Upgrade Connect to 2.2.4
Make sure the default version in the configuration is not overridden. If it is, remove the setting or comment it out as shown below.
platform-config/tenants/{tenant-name}/instances/{instance-name}/axual-connect.sh
# Docker image version, only override if you explicitly want to use a different version
#CONNECT_VERSION=[SOME_VERSION]
Restart the service
axual.sh restart client <instance-name> axual-connect
The following output is expected (for restarting all the client services):
Axual Platform 2021.1
Loading cluster definition: [Cluster A]
PREFIX for loading cluster definition is: CLUSTER_[Cluster A]_
Loading cluster definition: [Cluster B]
PREFIX for loading cluster definition is: CLUSTER_[Cluster B]_
Loading cluster definition: [Cluster C]
PREFIX for loading cluster definition is: CLUSTER_[Cluster C]_
Analyzing cluster [Cluster C]-inter-broker-listener
Analyzing cluster [Cluster C]-mgmt-api-db
Analyzing cluster [Cluster C]
Hostname: '[Worker Hostname]'. Using configuration for cluster '[Cluster C]', node ID: '1'
Loading tenant definition: [Tenant Name]
Loading tenant definition: axual
Loading instance definition: [Instance A]
Loading instance definition: [Instance B]
Loading cluster definition: [Cluster A]
PREFIX for loading cluster definition is: [Cluster A]
Loading cluster definition: [Cluster B]
PREFIX for loading cluster definition is: [Cluster B]
Loading cluster definition: [Cluster C]
PREFIX for loading cluster definition is: CLUSTER_[Cluster C]_
Analyzing cluster altair-[Cluster C]-inter-broker-listener
Analyzing cluster altair-[Cluster C]-mgmt-api-db
Analyzing cluster altair-[Cluster C]
Hostname: '[Worker Hostname]'. Using configuration for cluster '[Cluster C]', node ID: '1'
Loading tenant definition: [Tenant Name]
Loading tenant definition: axual
Loading instance definition: [Instance A]
Loading instance definition: [Instance B]
Step 11 - Updating Prometheus Targets and Alerts
Restarting Prometheus
Log in to the VM where Prometheus runs and use the following command to recreate targets.json and alerts.json:
axual.sh restart mgmt prometheus
Open the Prometheus UI to check that targets and alerts are all there.
You can search for Management-API as a target to confirm that the new versions have been deployed.
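Besides the UI, the target check can be done from the command line through the Prometheus HTTP API, which exposes target health at /api/v1/targets. The sketch below counts targets that are not "up" in such a response. It is a text-based approximation, and the host and port in the usage example are assumptions about your setup.

```shell
# Counts targets on stdin whose health is not "up". Expects the JSON body
# returned by the Prometheus HTTP API endpoint /api/v1/targets.
count_unhealthy_targets() {
  grep -o '"health"[[:space:]]*:[[:space:]]*"[a-z]*"' | grep -vc '"up"' || true
}
```

For example, `curl -s http://<prometheus-host>:9090/api/v1/targets | count_unhealthy_targets` prints 0 when every target is healthy; 9090 is the Prometheus default port and may differ in your environment.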