Performing the upgrade
Typical upgrade steps
Most of the upgrade steps can be performed without impact on your users. Basically, the deployment or upgrade of a component is split into two actions:
- Configuration changes, such as added or changed configuration parameters, including the new component’s version
- Deployment of the upgraded component, by (re)starting it
The configuration changes can be done in advance most of the time, limiting downtime for your end users.
In the following upgrade steps, platform-config refers to the location where your platform configuration is stored for that particular environment.
Verifying every step of the way
When performing the upgrade, we strongly advise you to verify that things are working every step of the way. It is pointless to continue the upgrade if, halfway through, one of the services fails to start. In general, the following tips apply to every service when performing (re)starts after an upgrade:
- Check whether the new Docker image version has been pulled successfully
- Check whether the container actually starts and stays up for more than 30 seconds, and is not in "Restarting" mode
There are also verification steps that depend on the service being upgraded. Those steps can be found in the upgrade docs themselves.
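As a quick command-line sketch of those two generic checks (the container name below is only an example; substitute the service you just restarted):
# Example only: substitute the container name of the service you just (re)started
CONTAINER=mgmt-keycloak
# 1. Confirm which image (and thus which version) the container is running
docker inspect --format '{{.Config.Image}}' "$CONTAINER"
# 2. Confirm the container is up (not "Restarting") and stays up for at least 30 seconds
docker ps --filter "name=$CONTAINER" --format 'table {{.Names}}\t{{.Status}}'
sleep 30
docker inspect --format '{{.State.Status}} (restarts: {{.RestartCount}})' "$CONTAINER"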
Performing the upgrade
Step 1 - Setting up Keycloak:11
In the steps below, we are going to set up Keycloak to use version 11.0.2.
Step 1a - Configuring Keycloak:11
In platform-config, find the following settings for the mgmt cluster (e.g. platform-config/clusters/{cluster-name}/keycloak.sh) and add/edit them with the following configuration.
# Version of keycloak to run
KEYCLOAK_VERSION=11.0.2
# Port to listen on
KEYCLOAK_PORT=[NO_CHANGE]
KEYCLOAK_ADMIN_PORT=8993
#Optional, default=false
#KEYCLOAK_IMPORT_REALM=
Step 1b - Stop Keycloak
- Log in to the node where Keycloak is running and stop Keycloak with the following command:
./axual.sh stop mgmt mgmt-keycloak
- Remove the old themes folder from your platform-config/clusters/{cluster-name}/configuration/keycloak
- Place the new axual-keycloak-theme under your platform-config/clusters/{cluster-name}/configuration/keycloak, so that the themes folder is a child of the keycloak folder
In case you have your own Keycloak themes, please make sure you have migrated your existing theme(s) to Keycloak:11.
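For reference, the resulting layout should look roughly like this (the theme directory name is illustrative and depends on the theme you deploy):
platform-config/clusters/{cluster-name}/configuration/keycloak/
└── themes/
    └── <your-theme>/   # e.g. the unpacked axual-keycloak-theme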
Step 1c - Start Keycloak
Log in to the node where Keycloak is running and restart Keycloak with the following command:
./axual.sh start mgmt mgmt-keycloak
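Check the logs to confirm the successful start of Keycloak:
docker logs -f --tail 400 mgmt-keycloak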
Step 1d - Access Admin Console on the management port
The new version of Keycloak is now exposing the admin console on a separate port to improve security. You can access it as follows:
- Using the UI, go to https://KEYCLOAK_HOSTNAME:KEYCLOAK_ADMIN_PORT/auth. You will see the Keycloak login screen.
- Click Administration Console.
- Enter the KEYCLOAK_USER and KEYCLOAK_PASSWORD stored in your platform-config, then press Log in.
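If you prefer a quick command-line check before opening the UI, the admin port should answer on the /auth context. A minimal sketch, assuming the hostname and KEYCLOAK_ADMIN_PORT (8993) configured above and a self-signed certificate (hence -k):
# Expect an HTTP 200 (or a redirect) from the Keycloak welcome page on the admin port
curl -k -s -o /dev/null -w '%{http_code}\n' https://KEYCLOAK_HOSTNAME:8993/auth/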
Step 2 - Restart components to get metrics exposed
Step 2a - Configuring Cluster-Browse’s Management Port and Open Endpoints
In platform-config, find the following settings for each cluster (e.g. platform-config/clusters/{cluster-name}/cluster-browse.sh) and add/edit them with the following configuration.
# Docker image version, only override if you explicitly want to use a different version
#CLUSTER_BROWSE_VERSION=
#########
# PORTS #
#########
# Port at which the web-server is hosted on the host machine
CLUSTER_BROWSE_PORT=[NO_CHANGE]
# Port at which the management-server is hosted on the host machine
CLUSTER_BROWSE_MGMT_PORT=9086
# At least `/actuator/health` and `/actuator/prometheus` need to be exposed
CLUSTER_BROWSE_SECURITY_OPEN_ENDPOINTS="/actuator/health,/actuator/prometheus,/actuator/info"
Step 2b - Restarting Cluster-Browse
Log in to the node where Cluster-Browse is running and restart it with the following command:
./axual.sh restart cluster cluster-browse
There is one Cluster-Browse running per cluster. Each of them needs to be restarted.
Check logs to confirm the successful restart
docker logs -f --tail 400 cluster-browse
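Optionally, verify that the newly exposed management port answers on the open endpoints. A minimal sketch, run on the node itself and assuming plain HTTP on the management port (9086 as configured above); the same check applies to the other services in this step, using their respective management ports:
# /actuator/health should report status UP; /actuator/prometheus should return metric lines
curl -s http://localhost:9086/actuator/health
curl -s http://localhost:9086/actuator/prometheus | head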
Step 2c - Configuring Stream-Browse’s Management Port and Open Endpoints
In platform-config, find the following settings for the mgmt cluster (e.g. platform-config/clusters/{cluster-name}/stream-browse.sh) and add/edit them with the following configuration.
# Docker image version, only override if you explicitly want to use a different version
#STREAM_BROWSE_VERSION=
#########
# PORTS #
#########
# Port at which the web-server is hosted on the host machine
STREAM_BROWSE_PORT=[NO_CHANGE]
# Port at which the management-server is hosted on the host machine
STREAM_BROWSE_MGMT_PORT=6980
# At least `/actuator/health` and `/actuator/prometheus` need to be exposed
STREAM_BROWSE_SECURITY_OPEN_ENDPOINTS="/actuator/health,/actuator/prometheus,/actuator/info"
Step 2d - Restarting Stream-Browse
Log in to the node where Stream-Browse is running and restart it with the following command:
./axual.sh restart mgmt stream-browse
Check logs to confirm the successful restart
docker logs -f --tail 400 stream-browse
Step 2e - Configuring Instance-API’s Management Port and Open Endpoints
In platform-config, find the following settings for each instance (e.g. platform-config/tenants/{tenant-name}/instances/{instance-name}/instance-api.sh) and add/edit them with the following configuration.
# Docker image version, only override if you explicitly want to use a different version
#INSTANCEAPI_VERSION=
#########
# PORTS #
#########
# Instance-level defined ports are comma separated "cluster-name:port" pairs
# The port on which Instance-API hosts the spring boot actuator.
INSTANCE_API_MANAGEMENT_SERVER_PORT="<cluster-1>:<cluster-1-mgmt-port>,<cluster-2>:<cluster-2-mgmt-port>"
# Comma separated list (no spaces) which contains the endpoints which will never require
# authentication, even when 2-way TLS is enforced.
INSTANCE_API_OPEN_ENDPOINTS="/actuator/health,/actuator/prometheus,/actuator/info"
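As an illustration, for an instance running on two hypothetical clusters named clusterA and clusterB, the pair list could look like this (cluster names and ports are examples only):
# Example values only - use your own cluster names and free ports
INSTANCE_API_MANAGEMENT_SERVER_PORT="clusterA:9010,clusterB:9011"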
Step 2f - Restarting Instance-API
Log in to the node where Instance-API is running and restart it with the following command:
./axual.sh restart instance [your-instance-name] instance-api
Replace [your-instance-name] with something like demo-local
Check logs to confirm the successful restart
docker logs -f --tail 400 [your-instance-name]-instance-api
Step 2g - Configuring Operation-Manager’s Management Port and Open Endpoints
In platform-config, find the following settings for the mgmt cluster (e.g. platform-config/clusters/{cluster-name}/operation-manager.sh) and add/edit them with the following configuration.
# Docker image version, only override if you explicitly want to use a different version
#OPERATION_MANAGER_VERSION=
#########
# PORTS #
#########
# Port at which the web-server is hosted on the host machine
OPERATION_MANAGER_PORT=[NO_CHANGE]
# Port at which the management-server is hosted on the host machine
OPERATION_MANAGER_MGMT_PORT=37779
# At least `/actuator/health` and `/actuator/prometheus` need to be exposed
OPERATION_MANAGER_OPEN_ENDPOINTS="/actuator/info,/actuator/health,/actuator/prometheus"
Step 2h - Restarting Operation-Manager
Log in to the node where Operation-Manager is running and restart it with the following command:
./axual.sh restart mgmt operation-manager
Check logs to confirm the successful restart
docker logs -f --tail 400 operation-manager
Step 2i - Configuring Management-API’s Management Port and Open Endpoints
In platform-config, find the following settings for the mgmt cluster (e.g. platform-config/clusters/{cluster-name}/mgmt-api.sh) and add/edit them with the following configuration.
# Docker image version, only override if you explicitly want to use a different version
#MGMT_API_VERSION=
#########
# PORTS #
#########
# Port at which the web-server is hosted on the host machine
MGMT_API_PORT=[NO_CHANGE]
# Port at which the management-server is hosted on the host machine
MGMT_API_MGMT_PORT=8096
# At least `/actuator/health` and `/actuator/prometheus` need to be exposed
MGMT_API_OPEN_ENDPOINTS="/actuator/info,/actuator/health,/actuator/prometheus"
Step 3 - Upgrading to Broker 5.4.0
In this step we are going to set up Broker to use the 5.4.0 version (Apache Kafka 2.6.0).
Step 3a - Configuring Broker
Make sure you didn’t override the default value before. If you did, please remove the following line or comment it out as shown below.
# Docker image version, only override if you explicitly want to use a different version
#BROKER_VERSION=[OLDER VERSION]
Step 3b - Restarting Brokers
To limit the impact on users, restart the brokers in a rolling fashion (one node at a time).
- Restart the Broker from your platform-deploy location:
./axual.sh restart cluster broker
The following output is expected (for a 3-cluster setup):
Axual Platform 2020.3
Loading cluster definition: [Cluster A]
PREFIX for loading cluster definition is: CLUSTER_[Tenant]_[Cluster A]_
Loading cluster definition: [Cluster B]
PREFIX for loading cluster definition is: CLUSTER_[Tenant]_[Cluster B]_
Loading cluster definition: [Cluster C]
PREFIX for loading cluster definition is: CLUSTER_[Tenant]_[Cluster C]_
Analyzing cluster [Cluster B]-inter-broker-listener
Analyzing cluster [Cluster B]-mgmt-api-db
Analyzing cluster [Cluster B]
Analyzing cluster [Cluster A]-inter-broker-listener
Analyzing cluster [Cluster A]
Analyzing cluster [Cluster C]-inter-broker-listener
Analyzing cluster [Cluster C]
Hostname: '[Worker Hostname]'. Using configuration for cluster '[Cluster C]', node ID: '1'
Loading tenant definition: axual
Loading tenant definition: [Tenant]
Loading instance definition: [Tenant]-[Instance A]
Loading instance definition: [Tenant]-[Instance B]
Stopping cluster services for node [Worker Hostname] in cluster [Cluster C] ...
Stopped
Loading cluster definition: [Cluster A]
PREFIX for loading cluster definition is: CLUSTER_[Tenant]_[Cluster A]_
Loading cluster definition: [Cluster B]
PREFIX for loading cluster definition is: CLUSTER_[Tenant]_[Cluster B]_
Loading cluster definition: [Cluster C]
PREFIX for loading cluster definition is: CLUSTER_[Tenant]_[Cluster C]_
Analyzing cluster [Cluster B]-inter-broker-listener
Analyzing cluster [Cluster B]-mgmt-api-db
Analyzing cluster [Cluster B]
Analyzing cluster [Cluster A]-inter-broker-listener
Analyzing cluster [Cluster A]
Analyzing cluster [Cluster C]-inter-broker-listener
Analyzing cluster [Cluster C]
Hostname: '[Worker Hostname]'. Using configuration for cluster '[Cluster C]', node ID: '1'
Loading tenant definition: axual
Loading tenant definition: [Tenant]
Loading instance definition: [Tenant]-[Instance A]
Loading instance definition: [Tenant]-[Instance B]
Configuring cluster services for node [Worker Hostname] in cluster [Cluster C] ...
Preparing broker: Done
Param 1: run
Param 2: broker
Param 3: axual/broker:5.4.0
Param 4: Starting broker
Param 5: Done
Param 6: -d --tty=false --restart=always -v /appl/kafka/config/broker:/config/broker ...
Param 7: ...
Done
- Verification after Broker restart - use the Grafana dashboards and don’t continue before all of the following criteria are met:
  - On the Cluster Overview dashboard, verify Under Replicated Partitions has reached 0.
  - On the Broker stats per node dashboard, verify all metrics are pulled and visible.
An under-replicated partition means that one or more of its replicas are not available.
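If you want to cross-check from the command line as well, Kafka’s topic tool can list only the under-replicated partitions. A sketch under the assumption that you can exec into the broker container and that the kafka-topics.sh script and a reachable bootstrap address are available there (container name, script location and port are illustrative):
# An empty result means there are no under-replicated partitions
docker exec -it broker kafka-topics.sh --bootstrap-server localhost:9092 --describe --under-replicated-partitions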
Step 4 - Upgrading to Distributor 3.6.3
In the steps below, we are going to set up Distributor to use version 3.6.3.
Step 4a - Configuring Distributor
Make sure you didn’t override the default value before. If you did, please remove the following line or comment it out as shown below.
# Docker image version, only override if you explicitly want to use a different version
#DISTRIBUTOR_VERSION=[OLDER VERSION]
Step 4b - Restarting Distributor
To limit the impact on users, restart Distributor in a rolling fashion (one node at a time).
- Restart the Distributor from your platform-deploy location:
./axual.sh restart cluster distributor
The following output is expected:
Axual Platform 2020.3
Stopping cluster services for node [Worker Hostname] in cluster [Cluster A]
Stopping distributor: Stopped
Configuring cluster services for node [Worker Hostname] in cluster [Cluster A]
Done, cluster-api is available
Preparing distributor security: Done
Preparing distributor topics:
Deploying topic _distributor-config: Done
Deploying topic _distributor-offset: Done
Deploying topic _distributor-status: Done
Done
Starting distributor: Done
- Verification after Distributor restart - use the Grafana dashboards and don’t continue before all of the following criteria are met:
  - On the Distributor Overview dashboard, verify all connectors are in RUNNING status.
Step 5 - Upgrading to Connect 2.2.3
Connect is located in the new client services. These are applications that can connect to any cluster in the instance.
Step 5a - Configuring Connect
Make sure you didn’t override the default value before. If you did, please remove the following line or comment it out as shown below.
# Uncomment and set a value to load a different version of Axual Connect
#CONNECT_VERSION=[OLDER VERSION]
Step 5b - Restarting Connect
- Run the following command for each instance:
./axual.sh restart client <instance-name> axual-connect
Or simply use the following command to restart all client services for all instances on the machine:
./axual.sh restart client
The following output is expected (for restarting all the client services):
Axual Platform 2020.3
Loading cluster definition: [Cluster A]
PREFIX for loading cluster definition is: CLUSTER_[Cluster A]_
Loading cluster definition: [Cluster B]
PREFIX for loading cluster definition is: CLUSTER_[Cluster B]_
Loading cluster definition: [Cluster C]
PREFIX for loading cluster definition is: CLUSTER_[Cluster C]_
Analyzing cluster [Cluster C]-inter-broker-listener
Analyzing cluster [Cluster C]-mgmt-api-db
Analyzing cluster [Cluster C]
Hostname: '[Worker Hostname]'. Using configuration for cluster '[Cluster C]', node ID: '1'
Loading tenant definition: [Tenant Name]
Loading tenant definition: axual
Loading instance definition: [Instance A]
Loading instance definition: [Instance B]
Loading cluster definition: [Cluster A]
PREFIX for loading cluster definition is: [Cluster A]
Loading cluster definition: [Cluster B]
PREFIX for loading cluster definition is: [Cluster B]
Loading cluster definition: [Cluster C]
PREFIX for loading cluster definition is: CLUSTER_[Cluster C]_
Analyzing cluster altair-[Cluster C]-inter-broker-listener
Analyzing cluster altair-[Cluster C]-mgmt-api-db
Analyzing cluster altair-[Cluster C]
Hostname: '[Worker Hostname]'. Using configuration for cluster '[Cluster C]', node ID: '1'
Loading tenant definition: [Tenant Name]
Loading tenant definition: axual
Loading instance definition: [Instance A]
Loading instance definition: [Instance B]
- Verification after Connect restart - use the Grafana dashboards and don’t continue before all of the following criteria are met:
  - On the Axual Connect dashboard, verify the metrics per instance/connector.
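If your Axual Connect setup exposes the standard Kafka Connect REST interface, you can also list connector statuses directly; the port below is illustrative and depends on your configuration:
# Lists all connectors with their status; expect every connector and task to be RUNNING
curl -s "http://localhost:8083/connectors?expand=status"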
Step 6 - Upgrading to REST Proxy 1.2.2
In the steps below, we are going to set up REST Proxy to use version 1.2.2.
Step 6a - Configuring REST Proxy
Make sure you didn’t override the default value before. If you did, please remove the following line or comment it out as shown below.
# Docker image version, only override if you explicitly want to use a different version
#RESTPROXY_VERSION="[OLDER VERSION]"
Step 6b - Restarting REST Proxy
- Run the following command for each instance:
./axual.sh restart instance <instance-name> rest-proxy
The following output is expected:
Axual Platform 2020.3
Stopping instance services for [Instance Name] in cluster [Cluster Name]
Stopping [INSTANCE NAME]-rest-proxy: Stopped
Done, cluster-api is available
Deploying topic _[Instance Name]-schemas: Done
Deploying topic _[Instance Name]-consumer-timestamps: Done
Done, cluster-api is available
Done, cluster-api is available
Applying ACLs :... Done
...
Applying ACLs :... Done
Configuring instance services for [Instance Name] in cluster [Cluster Name]
Preparing [INSTANCE NAME]-rest-proxy: Done
Warning: 'ADVERTISED_DEBUG_PORT_REST_PROXY' is not an instance variable and a default value is not configured as 'DEFAULT_INSTANCE_ADVERTISED_DEBUG_PORT_REST_PROXY'
Starting [Instance Name]-rest-proxy: Done
- Verification after REST Proxy restart - use the Grafana dashboards and don’t continue before all of the following criteria are met:
  - On the Rest-Proxy Detailed Overview dashboard, verify per instance that Uptime and Start time are correct.
Step 7 - Upgrading to Discovery API 2.1.0
In the steps below, we are going to set up Discovery API to use version 2.1.0.
Step 7a - Configuring Discovery API
Make sure you didn’t override the default value before. If you did, please remove the following line or comment it out as shown below.
# Docker image version, only override if you explicitly want to use a different version
#DISCOVERYAPI_VERSION="[OLDER VERSION]"
Step 7b - Restarting Discovery API
- Run the following command on every node that Discovery API is running on:
./axual.sh restart instance <instance-name> discovery-api
The following output is expected:
Axual Platform 2020.3
Stopping instance services for [Instance Name] in cluster [Cluster Name]
Stopping [Instance Name]-discovery-api: Stopped
Done, cluster-api is available
Deploying topic _[Instance Name]-schemas: Done
Deploying topic _[Instance Name]-consumer-timestamps: Done
Done, cluster-api is available
Done, cluster-api is available
Applying ACLs : {...} Done
...
Done, cluster-api is available
Applying ACLs : {...} Done
Configuring instance services for [Instance Name] in cluster [Cluster Name]
Running copy-config-[Instance Name]-discovery-api: Done
Preparing [Instance Name]-discovery-api: Done
Starting [Instance Name]-discovery-api: Done
- Verification after Discovery API restart - use the Grafana dashboards and don’t continue before all of the following criteria are met:
  - On the Discovery status dashboard, verify per node that Status is Active.
Step 8 - Updating Prometheus Targets and Alerts
Restarting Prometheus
Log in to the VM where prometheus runs and use the following command to recreate targets.json and alerts.json:
./axual.sh restart mgmt prometheus
Open the Prometheus UI to check that targets and alerts are all there.
You can search for Management-API as a target to confirm the new versions have been deployed.
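Instead of (or in addition to) the UI, the Prometheus HTTP API can confirm that targets are present and healthy. A minimal sketch, assuming Prometheus listens on its default port 9090 on that VM:
# Summarize the health of all active scrape targets; all of them should report "up"
curl -s http://localhost:9090/api/v1/targets | grep -o '"health":"[^"]*"' | sort | uniq -c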
Rollback of Keycloak
In case something went wrong with the upgrade to Keycloak:11, follow these steps to roll back to Keycloak:6.
Before you start the rollback procedure, please make sure:
- You have stopped the mgmt-keycloak container that was failing. Use docker ps to check the currently running containers.
Run ./axual.sh stop mgmt mgmt-keycloak to stop the container.
- You have transferred the Keycloak DB backup to the node where mgmt-db runs; it will be used for importing the data and structure of the DB.
If you are using a remote_db, be sure you perform the following procedure from a node that has connectivity to the remote DB.
Rollback story: Keycloak:11 is not working
Clean up the Keycloak DB
- Access the Keycloak DB with a phpMyAdmin instance
- Truncate all the Keycloak tables; the privileges for the Keycloak DB must be maintained (see the command-line sketch after this list if you prefer not to use phpMyAdmin)
- Import the Keycloak:6 DB backup
- Change your platform-config/clusters/{cluster-name}/keycloak.sh to the following version:
# Version of keycloak to run
KEYCLOAK_VERSION=6.0.1
- Remove the new themes folder from your platform-config/clusters/{cluster-name}/configuration/keycloak
- Place the old axual-keycloak-theme under your platform-config/clusters/{cluster-name}/configuration/keycloak, so that the themes folder is a child of the keycloak folder
- Start the Keycloak 6.0.1 container:
./axual.sh start mgmt mgmt-keycloak
- Check the logs to confirm the successful rollback of Keycloak:
docker logs -f --tail 400 mgmt-keycloak
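If you prefer the command line over phpMyAdmin for the truncate step above, here is a minimal sketch. It assumes the same credential and database placeholders as used elsewhere in this guide; review the generated statements before piping them back into MySQL:
# Generate one TRUNCATE statement per table in the Keycloak schema and execute them,
# with foreign key checks disabled so the truncation order does not matter.
docker exec -i mgmt-db mysql -uKEYCLOAK_DB_USER -pKEYCLOAK_DB_PASSWORD -N -e \
  "SELECT CONCAT('TRUNCATE TABLE ', table_name, ';') FROM information_schema.tables WHERE table_schema = 'KEYCLOAK_DB_DATABASE';" \
  | docker exec -i mgmt-db mysql -uKEYCLOAK_DB_USER -pKEYCLOAK_DB_PASSWORD \
      --init-command="SET FOREIGN_KEY_CHECKS=0" KEYCLOAK_DB_DATABASE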
Rollback story: Keycloak:11 is not working and your DB is corrupted
Your keycloak-db has become corrupted, so you have to recreate it from scratch.
- Revert your KEYCLOAK_VERSION in platform-config/clusters/{cluster-name}/keycloak.sh:
# Version of keycloak to run
KEYCLOAK_VERSION=6.0.1
- Remove the new themes folder from your platform-config/clusters/{cluster-name}/configuration/keycloak
- Place the old axual-keycloak-theme under your platform-config/clusters/{cluster-name}/configuration/keycloak, so that the themes folder is a child of the keycloak folder
- Edit your platform-config/{cluster-name}/nodes.sh by adding the service keycloak-populate-db to your node’s MGMT_SERVICES, like this:
NODE1_MGMT_SERVICES=localhost:mgmt-db,keycloak-populate-db
This will re-create a clean keycloak-db to be used for importing the exported Keycloak:6 DB.
- Execute the axual.sh command to recreate the DB:
./axual.sh start mgmt keycloak-populate-db
- Import the Keycloak:6 DB with all data and structure via mysql:
docker exec -i mgmt-db mysql -uKEYCLOAK_DB_USER -pKEYCLOAK_DB_PASSWORD KEYCLOAK_DB_DATABASE < [path/to/sql/backup]
If there are no errors, you can now start the rolled-back Keycloak:
./axual.sh start mgmt mgmt-keycloak
- Check the logs to confirm the successful rollback of Keycloak:
docker logs -f --tail 400 mgmt-keycloak