KSML Provisioner 0.5.0 Readme

Overview

This is a REST application used to provision KSML applications in Kubernetes.

When a deployment is requested, the provisioner fetches the KSML Helm chart from the configured Helm registry, generates a values.yaml from the requested configuration, and deploys the chart into the same Kubernetes cluster in which the provisioner itself runs.

The provisioner requires certain privileges, scoped to the namespace in Kubernetes, to deploy the KSML pod, including a ServiceMonitor resource for Prometheus monitoring.

Limitations

  1. The Provisioner can only deploy KSML applications in the same Kubernetes cluster where it is deployed.

  2. The Provisioner can pull charts from OCI-compatible registries only. Nexus registry is not supported.

Required Configuration

Environment Variables Description

NAMESPACE

The Kubernetes namespace where KSML applications will be deployed.

REGISTRY_URL

The Helm Chart Registry URL where KSML Helm Charts are pulled from.

CHART_NAME

The name of the KSML Helm Chart.

CHART_VERSION

The version of the KSML Helm Chart.
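The four required variables can be checked before the provisioner is started. The snippet below is a minimal pre-flight sketch, not part of the provisioner itself; the function names missing_required and check_environment are hypothetical and only the variable names come from the table above.

```python
import os

# Variable names are taken from the "Required Configuration" table above.
REQUIRED_VARS = ("NAMESPACE", "REGISTRY_URL", "CHART_NAME", "CHART_VERSION")


def missing_required(env):
    """Return the required variables that are unset or blank in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def check_environment(env=None):
    """Raise ValueError naming every missing required variable."""
    env = os.environ if env is None else env
    missing = missing_required(env)
    if missing:
        raise ValueError("missing required configuration: " + ", ".join(missing))
```

Calling check_environment() with no argument inspects the current process environment; passing a dict makes the check easy to exercise in isolation.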

Optional Configuration

Environment Variables Description

REGISTRY_AUTH_ENABLED

Set to true if Helm Chart registry requires authentication. Default false.

REGISTRY_USERNAME

Username of the Helm Chart Registry defined in REGISTRY_URL.

REGISTRY_PASSWORD

Password of the Helm Chart Registry defined in REGISTRY_URL.

CUSTOM_VALUES_FILE

Path to a custom values file containing extra configuration for the KSML deployment. This is useful for passing settings such as resources, securityContext or topologySpreadConstraints.

CLIENT_CA_FILE

Path to a base64-encoded PEM file containing one or more CA certificates. It is used to validate the Helm Registry’s server certificate.

INSECURE_SKIP_TLS_VERIFY

If true, skips validation of the Helm Registry’s server certificate. This can lead to man-in-the-middle attacks and should not be used in production.

DISTRIBUTED_TRACING_ENABLED

If true, enables distributed tracing with OpenTelemetry. To configure the exporter, see section “Enable Distributed Tracing.” Default false.

BASIC_AUTH_ENABLED

Set to true if HTTP server should have basic authentication. Default false.

BASIC_AUTH_USERNAME

Username of the basic authentication of HTTP server.

BASIC_AUTH_PASSWORD

Password of the basic authentication of HTTP server.

SERVER_TLS_CERTIFICATE

PEM encoded x509 certificate for HTTP server.

SERVER_TLS_PRIVATE_KEY

PEM encoded private key for HTTP server.

Required Kubernetes Permissions

KSML Provisioner requires a certain set of permissions on the service account to deploy KSML applications. These permissions are described below.

The required service account, role and role binding are automatically created by the Provisioner Helm Chart. No special configurations are required.

Kubernetes Resource       API Group                Permissions

configmaps                (core)                   GET, LIST, WATCH, CREATE, PATCH, UPDATE, DELETE
pods                      (core)                   GET, LIST, WATCH, CREATE, PATCH, UPDATE, DELETE
pods                      metrics.k8s.io           GET, LIST
pods/log                  (core)                   GET, LIST, WATCH, CREATE, PATCH, UPDATE, DELETE
secrets                   (core)                   GET, LIST, WATCH, CREATE, PATCH, UPDATE, DELETE
serviceaccounts           (core)                   GET, LIST, WATCH, CREATE, PATCH, UPDATE, DELETE
services                  (core)                   GET, LIST, WATCH, CREATE, PATCH, UPDATE, DELETE
statefulsets              apps                     GET, LIST, WATCH, CREATE, PATCH, UPDATE, DELETE
jobs                      batch                    GET, LIST, WATCH, CREATE, PATCH, UPDATE, DELETE
prometheusrules           monitoring.coreos.com    GET, LIST, WATCH, CREATE, PATCH, UPDATE, DELETE
servicemonitors           monitoring.coreos.com    GET, LIST, WATCH, CREATE, PATCH, UPDATE, DELETE
ingresses                 networking.k8s.io        GET, LIST, WATCH, CREATE, PATCH, UPDATE, DELETE
networkpolicies           networking.k8s.io        GET, LIST, WATCH, CREATE, PATCH, UPDATE, DELETE
persistentvolumeclaims    storage.k8s.io           GET, LIST, WATCH, CREATE, PATCH, UPDATE, DELETE

Provisioning KSML applications in different namespace

By default, KSML applications are provisioned in the same namespace where the provisioner is deployed. If the applications need to be deployed in a different namespace (e.g. ksml), pass the values below.

env:
  - name: NAMESPACE
    value: "ksml"

rbac:
  namespace: "ksml"

The NAMESPACE environment variable informs the provisioner about where to deploy KSML applications. The rbac.namespace ensures that the provisioner has the necessary permissions to deploy KSML applications in the target namespace.

Bring your own ServiceAccount and RBAC

By default, the Helm chart deploys a service account and the associated RBAC resources with the correct permissions. If you want to use your own service account and RBAC resources instead, set the configuration below:

serviceAccount:
  create: false
  name: my-custom-sa
rbac:
  create: false

A custom service account can be passed via serviceAccount.name. It is your responsibility to ensure this service account has correct permissions to deploy KSML applications (see previous section for exact permissions). The easiest way to do this is to create a Role with relevant permissions and a RoleBinding to associate the role with the service account.
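As an illustration of the Role and RoleBinding approach, the sketch below generates both manifests as JSON, which kubectl apply -f accepts as well as YAML. The rules are copied from the permissions table in "Required Kubernetes Permissions"; the function name build_rbac, the default resource name ksml-provisioner and the parameter names are hypothetical.

```python
import json

VERBS = ["get", "list", "watch", "create", "patch", "update", "delete"]

# (API group, resources, verbs), copied as listed in the permissions table.
# "" is the Kubernetes core API group. Confirm the group of each resource on
# your cluster with `kubectl api-resources` before relying on these rules.
PERMISSIONS = [
    ("", ["configmaps", "pods", "pods/log", "secrets", "serviceaccounts", "services"], VERBS),
    ("metrics.k8s.io", ["pods"], ["get", "list"]),
    ("apps", ["statefulsets"], VERBS),
    ("batch", ["jobs"], VERBS),
    ("monitoring.coreos.com", ["prometheusrules", "servicemonitors"], VERBS),
    ("networking.k8s.io", ["ingresses", "networkpolicies"], VERBS),
    ("storage.k8s.io", ["persistentvolumeclaims"], VERBS),
]


def build_rbac(namespace, service_account, sa_namespace=None, name="ksml-provisioner"):
    """Return a List holding a Role and a RoleBinding for the service account.

    `namespace` is where KSML applications are deployed; `sa_namespace` is
    where the service account lives (defaults to the same namespace).
    """
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {"apiGroups": [group], "resources": resources, "verbs": verbs}
            for group, resources, verbs in PERMISSIONS
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": name, "namespace": namespace},
        "roleRef": {"apiGroup": "rbac.authorization.k8s.io", "kind": "Role", "name": name},
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": service_account,
                "namespace": sa_namespace or namespace,
            }
        ],
    }
    return {"apiVersion": "v1", "kind": "List", "items": [role, binding]}
```

Writing json.dumps(build_rbac("ksml", "my-custom-sa"), indent=2) to a file gives a manifest that can be reviewed before it is applied.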

In most cases, though, you don’t need this and can keep life simple by leaving rbac.create: true and letting the Helm chart take care of the complexity.

Enable Distributed Tracing

KSML Provisioner supports distributed tracing with OpenTelemetry. To enable, set environment variable DISTRIBUTED_TRACING_ENABLED to true.

To export traces to an OTel backend, configure the environment variables below.

Environment Variable Description

OTEL_EXPORTER_OTLP_ENDPOINT

Endpoint of the OTel collector. Must start with http:// or https:// and include the port number.

OTEL_EXPORTER_OTLP_HEADERS

Optional headers.

OTEL_SERVICE_NAME

A unique name to recognize the service.
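The endpoint rule above (http or https scheme plus an explicit port) can be checked before the value is handed to the provisioner. This is a minimal sketch using only the Python standard library; the function name is_valid_otlp_endpoint is hypothetical and the provisioner may apply its own, different validation.

```python
from urllib.parse import urlsplit


def is_valid_otlp_endpoint(endpoint):
    """True if the endpoint has an http/https scheme, a host and an explicit port."""
    try:
        parts = urlsplit(endpoint)
        port = parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.hostname) and port is not None
```

For example, a value such as http://otel-collector:4317 (a hypothetical collector address) passes, while otel-collector:4317 and https://otel-collector do not.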

Enable Basic Authentication on the HTTP server

Basic authentication can be enabled by supplying the below configuration:

Environment Variable Description

BASIC_AUTH_ENABLED

Set to “true” to enable Basic authentication. Default is “false”.

BASIC_AUTH_USERNAME

Username for Basic authentication. Required when BASIC_AUTH_ENABLED is “true”.

BASIC_AUTH_PASSWORD

Password for Basic authentication. Required when BASIC_AUTH_ENABLED is “true”.

Below is an example configuration:

env:
  - name: BASIC_AUTH_ENABLED
    value: "true"
  - name: BASIC_AUTH_USERNAME
    value: "username"
  - name: BASIC_AUTH_PASSWORD
    value: "password"
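Once Basic authentication is enabled, clients have to send the credentials with every request. The sketch below shows one way to build the Authorization header with the Python standard library; the helper names are hypothetical and the request is only prepared, not sent.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Build the Authorization header value for HTTP Basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def authenticated_request(url, username, password):
    """Prepare (but do not send) a request that carries Basic credentials."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request
```

The prepared request can be passed to urllib.request.urlopen. Because Basic credentials are only base64-encoded, not encrypted, it is advisable to combine this with TLS on the HTTP server.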

Enable TLS on the HTTP server

TLS can be enabled by supplying the below configuration:

Environment Variable Description

SERVER_TLS_CERTIFICATE

Certificate in PEM format.

SERVER_TLS_PRIVATE_KEY

Private key in PEM format.

Below is an example configuration:

env:
  - name: SERVER_TLS_CERTIFICATE
    valueFrom:
      secretKeyRef:
        key: tls.crt
        name: ksml-provisioner
  - name: SERVER_TLS_PRIVATE_KEY
    valueFrom:
      secretKeyRef:
        key: tls.key
        name: ksml-provisioner

Note that in the above example, the secret ksml-provisioner must already exist, with the certificate stored under the key tls.crt and the private key under the key tls.key.

Setting fsGroup for KSML PersistentVolume Access

When using a PersistentVolume (PV) for the KSML stateful feature, the volume is mounted at /ksml-store, which is owned by the root user. Without additional configuration, the KSML process running in the Pod cannot create the required subfolder /ksml-store/data.

To ensure the KSML process can write to the volume, set podSecurityContext.fsGroup so that the files in the volume are owned by the group of the user running the KSML process. The default user is part of group 0 so set fsGroup to 0 unless you are using a different user. For example:

ksml-provisioner:
  customValues:
    podSecurityContext:
      fsGroup: 0

This ensures that the KSML Pod can create and manage the /ksml-store/data folder inside the PV automatically.

REST API endpoints

POST /deploy

Request Body:

{
  "tenant": "dizzl",
  "instance": "sandbox",
  "environment": "dev",
  "application": "sensor-inspector",
  "replicas": 1,
  "definition": "{streams: {sensor_source_avro: {topic: ksml_sensordata_avro, keyType: string, valueType: 'avro:SensorData'}}, functions: {log_message: {type: forEach, parameters: [{name: format, type: string}], code: 'log.info(\"Consumed {} message - key={}, value={}\", format, key, value)'}}, pipelines: {consume_avro: {from: sensor_source_avro, forEach: {code: 'log_message(key, value, format=\"AVRO\")'}}}}",
  "kafkaProperties": {
    "bootstrap.servers": "broker:9093",
    "schema.registry.url": "http://schemaregistry:8081",
    "application.id": "io.axual.sensor-inspector"
  },
  "schemas": [
    {
      "schemaFileName": "SensorData.avsc",
      "schemaBody": "..."
    }
  ]
}

Returns 200 OK if deployment is successful.
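A deploy call can be prepared from a client as sketched below. The base URL http://ksml-provisioner:8080 and the Content-Type header are assumptions, as is the function name build_deploy_request; the field names follow the request body shown above.

```python
import json
import urllib.request

# Hypothetical address of the provisioner; replace with your own service URL.
BASE_URL = "http://ksml-provisioner:8080"


def build_deploy_request(tenant, instance, environment, application, definition,
                         kafka_properties, schemas=(), replicas=1, base_url=BASE_URL):
    """Prepare (but do not send) a POST /deploy request."""
    payload = {
        "tenant": tenant,
        "instance": instance,
        "environment": environment,
        "application": application,
        "replicas": replicas,
        "definition": definition,  # KSML definition passed as a single string
        "kafkaProperties": dict(kafka_properties),
        "schemas": list(schemas),  # items: {"schemaFileName": ..., "schemaBody": ...}
    }
    return urllib.request.Request(
        f"{base_url}/deploy",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed, not stated above
        method="POST",
    )
```

Passing the result to urllib.request.urlopen sends the request; urlopen raises urllib.error.HTTPError for error responses, so a normal return corresponds to the 200 OK case.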

GET /status?tenant=dizzl&instance=sandbox&environment=dev&application=sensor-inspector

Returns the status of KSML deployment if available.

The status can be one of Running, Starting or Failing.

{
  "helm": {
    "name": "dizzl-sandbox-dev-sensor-inspector-ksml",
    "chart": "ksml",
    "version": "0.0.0-snapshot",
    "appVersion": "snapshot",
    "namespace": "default",
    "revision": 1,
    "status": "deployed",
    "deployedAt": "2024-08-26T13:04:20Z"
  },
  "kubernetes": {
    "name": "dizzl-sandbox-dev-sensor-inspector-ksml",
    "status": "Starting",
    "replicas": 1,
    "readyReplicas": 0
  }
}
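A client will typically build the status URL from the four identifiers and then inspect the kubernetes block of the response. The sketch below does both; the helper names are hypothetical, and treating "Running with all replicas ready" as the readiness condition is an assumption rather than a rule stated by the provisioner.

```python
from urllib.parse import urlencode


def status_url(base_url, tenant, instance, environment, application):
    """Build the GET /status URL with properly encoded query parameters."""
    query = urlencode({
        "tenant": tenant,
        "instance": instance,
        "environment": environment,
        "application": application,
    })
    return f"{base_url}/status?{query}"


def is_ready(status):
    """Assumed readiness rule: status is Running and every replica is ready."""
    k8s = status.get("kubernetes", {})
    return (
        k8s.get("status") == "Running"
        and k8s.get("readyReplicas", 0) >= k8s.get("replicas", 0)
    )
```

Applied to the sample response above, is_ready returns False because the status is Starting and no replica is ready yet.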

GET /logs?tenant=dizzl&instance=sandbox&environment=dev&application=sensor-inspector&tailLines=10

Returns the last N lines of logs for the KSML deployment, if available, where N is the value of the tailLines query parameter.

Sample response

{
  "tailLines": 1,
  "logs": [
    {
      "replicaName": "0",
      "lines": "2024-07-16T13:06:39,394Z [io.axual.sensor-inspector-6a395bc1-f3bc-4eb2-9331-c3d1b4e45f8b-StreamThread-1] INFO  ksml.function.log_message - Consumed AVRO message - key=sensor27, value={'city': 'Xanten'}\n"
    },
    {
      "replicaName": "1",
      "lines": "2024-07-16T13:06:39,841Z [io.axual.sensor-inspector-1393ae07-9321-424d-9453-b86383c816b3-StreamThread-1] INFO  ksml.function.log_message - Consumed AVRO message - key=sensor28, value={'city': 'Amsterdam'}\n"
    }
  ]
}
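In the sample above, each replica's log lines arrive as one newline-separated string. Assuming that shape holds in general, the response can be split into a list of lines per replica as sketched below; the function name lines_by_replica is hypothetical.

```python
def lines_by_replica(response):
    """Map each replicaName to the list of its log lines.

    Assumes "lines" is a single newline-separated string, as in the sample.
    """
    return {
        entry["replicaName"]: entry["lines"].splitlines()
        for entry in response.get("logs", [])
    }
```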

GET /metrics?tenant=dizzl&instance=sandbox&environment=dev&application=sensor-inspector&deploymentMode=StatefulSet

Returns the current CPU and memory usage metrics for the KSML deployment. This endpoint requires the Kubernetes Metrics Server to be installed in the cluster.

Query Parameters:

  - tenant (required): The tenant name
  - instance (required): The instance name
  - environment (required): The environment name
  - application (required): The application name
  - deploymentMode (optional): Either StatefulSet or Job

Sample response:

{
  "cpu": {
    "usedMillicores": "250"
  },
  "memory": {
    "usedMb": "512"
  }
}

Note: If the Metrics Server is not installed or metrics are not yet available for a newly started pod, the endpoint will return a 404 error with an appropriate message.
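The sample response carries the metric values as strings, and a 404 signals that metrics are not available. A client could handle both as sketched below; the function names are hypothetical, and the sketch assumes the values are always integer-formatted strings as in the sample.

```python
import json
import urllib.error
import urllib.request


def parse_metrics(body):
    """Convert the string values of a /metrics response into integers."""
    return {
        "cpu_millicores": int(body["cpu"]["usedMillicores"]),
        "memory_mb": int(body["memory"]["usedMb"]),
    }


def fetch_metrics(url):
    """Return parsed metrics, or None when the endpoint answers 404."""
    try:
        with urllib.request.urlopen(url) as response:
            return parse_metrics(json.load(response))
    except urllib.error.HTTPError as err:
        if err.code == 404:  # Metrics Server missing or metrics not ready yet
            return None
        raise
```

Returning None for a 404 lets a caller retry later for a newly started pod, while other HTTP errors still surface as exceptions.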

POST /undeploy

Request Body:

{
  "tenant": "dizzl",
  "instance": "sandbox",
  "environment": "dev",
  "application": "sensor-inspector"
}

Returns 200 OK if the deployment is removed successfully.