Debezium - PostgreSQL source-connector

If you already have a publicly accessible PostgreSQL deployment, you can configure and install a connector using the Axual Self-Service UI. Use the configuration instructions below.

If you don’t have one available, follow the next section to deploy a publicly available instance of PostgreSQL.

Deploying a PostgreSQL instance

We’ll deploy a PostgreSQL instance using Google Cloud Services.

If you already have a Google account, you can sign up for a free trial of Google Cloud here.

You can read more about PostgreSQL on Google Cloud here.

Let’s get started.

  1. Create a new PostgreSQL instance here.

  2. The instance ID is irrelevant for Connect, but it will be displayed in your Google overview. You will also set the password for the postgres root user; remember it, though it can be reset later.

  3. Select database version "PostgreSQL 14" (likely the default option).

  4. The region is irrelevant, too, but usually you would select a region geographically closest to the Connect-cluster.

  5. Zone availability: "Single" is enough. We’ll aim to have a very lightweight deployment.

  6. Customize your instance: Click "Show configuration options"

    • Machine type: click the dropdown menu and select "Lightweight"; 1 vCPU is enough.

    • Storage: go for the least amount of storage.

    • Connections: leave only "Public IP" selected. Under "Authorized networks", click "Add network"; use any name and "0.0.0.0/0" as the CIDR notation, then click "Done". This opens the database to the entire internet, which is acceptable only because we’ll delete the instance shortly.

    • Backups: Disable backups and point-in-time-recovery.

    • Flags: set cloudsql.logical_decoding to on. The effect of this flag is to set "wal_level" to "logical", which Debezium requires in order to stream changes.

  7. Click "Create instance". Wait for the operation to complete.

  8. While the database server is getting deployed, let’s create a Cloud Storage bucket.

    • Give it any name (e.g. "postgresql-init-axual-connect"). Click Continue.

    • Where to store your data: Region (we’re using a single region). Choose any geographic location and click Continue.

    • Default storage class: Standard. Click Continue.

    • Control access to objects: uniform. Click Continue.

    • Protect data: None

  9. Click "Create" to crete the bucket.

  10. The bucket page will open.

  11. Save the following file locally, then click "Upload files" to upload it into the bucket:

-- Allow the postgres user to use logical replication (needed by Debezium)
ALTER USER postgres REPLICATION;

-- The table the connector will watch
CREATE TABLE CUSTOMERS
(
  user_id        serial        PRIMARY KEY,
  customer_name  VARCHAR(100)  NOT NULL,
  email          VARCHAR(250)
);

-- Seed data
INSERT INTO CUSTOMERS (customer_name, email) VALUES
('alin', null),
('frenk', null),
('doxxed', null);
  12. You can close the bucket page. Go back to the SQL instances and select your PostgreSQL instance to view it. Note down the public IP address; this is the database.hostname you’ll use in your configuration.

  13. On the left-side menu, click "Users". Click the dots next to the postgres user and change its password. We’ll use these credentials to connect to the database.

  14. On the left-side menu, click "Databases". Click "Create Database". Use inventory as the name, as this is the name used throughout this example document: it is the database the SQL file will be imported into, and we’ll reference it again in the connector configuration. Leave the rest of the configuration as it is and click "Create".

  15. On the left-side menu, click "Overview". Click the "Import" tab.

    • Source: click "Browse" and select the SQL file we saved earlier.

    • File format: SQL

    • Destination: inventory

    • Click "Import"

Configuring a new connector

Follow the Creating Streams section of the stream management documentation in order to create one stream.

The name of the stream will be <server_nickname>.<schemaName>.<tableName>, where <server_nickname> is the database.server.name you’ll configure below and <schemaName>.<tableName> identify the table we’ll be watching. In our example the schema is public (the PostgreSQL default) and the table is customers, so that name turns out to be postgresql_nickname.public.customers. The Stream will have Key/Value of type String (configured at stream creation time).

When you deploy the stream onto an environment, you will use the following configuration:

  • a retention time and segment.ms of at least 86400000 ms (1 day).

  • a single (one) partition.

Follow the Configuring Connector-Applications documentation to set up a new Connect-Application. Let’s call it my_customers_app. The plugin name is "io.debezium.connector.postgresql.PostgresConnector". Configure the security certificate as instructed. The values you will need to supply as configuration will be listed in this section.

For advanced configuration, see the official connector documentation.

database.hostname

Example value:

123.123.123.123

database.port

5432

database.user

postgres

database.password

Example value:

password

database.server.name

This is the <server_nickname> we used as a prefix for the stream we created.

postgresql_nickname

database.dbname

inventory

table.include.list

public.customers

We need to supply additional (custom) properties whose keys are not automatically available in the Self-Service UI. These disable automatic topic creation; the first property should be enough on its own. That the following two properties are also required is a bug in the connect plugin itself.

topic.creation.enable

false

topic.creation.default.partitions

1

topic.creation.default.replication.factor

3
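Putting the above together, the values entered in Self-Service correspond to this connector configuration, shown here as plain key/value pairs for reference (the IP address and password are placeholders for your own values):

```properties
connector.class=io.debezium.connector.postgresql.PostgresConnector
database.hostname=123.123.123.123
database.port=5432
database.user=postgres
database.password=password
database.server.name=postgresql_nickname
database.dbname=inventory
table.include.list=public.customers
topic.creation.enable=false
topic.creation.default.partitions=1
topic.creation.default.replication.factor=3
```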

Authorize the Connector-Application to produce to the postgresql_nickname.public.customers topic.

You can now start the Connector-Application.
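Once the connector is running, you can confirm it is capturing changes by inserting a new row into the table created earlier; a change event for it should appear on the postgresql_nickname.public.customers topic. A sketch, with a hypothetical customer:

```sql
-- Insert a row; Debezium should emit a corresponding change event
INSERT INTO customers (customer_name, email)
VALUES ('new_customer', 'new_customer@example.com');
```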

Cleanup

Once you are done, stop the connector application and clean up the unused Axual resources.

In case you deployed your resources via Google Cloud, don’t forget to delete your bucket and your database once you are done.