Size in Records

The metric reports the number of messages (records) stored in a partition of the selected stream on each Kafka broker at a given point in time.

Use cases

Is data evenly distributed between partitions?

If data is not evenly distributed, the parallelization benefits of Kafka are not being fully used.

Make a request for every partition, compare the values, and check whether any partition holds noticeably more messages than the others. If one does, consider how to improve the producer's message key so that messages spread more evenly.
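
The comparison described above can be sketched in Python. This is a minimal sketch, not part of the API: the function name find_skewed_partitions, the 20% tolerance, and the sample sizes are assumptions made for illustration. The sizes would come from the latest data point of one request per partition.

```python
from statistics import mean


def find_skewed_partitions(sizes, tolerance=0.2):
    """Return the partitions whose size exceeds the mean by more than `tolerance`.

    `sizes` maps a partition id to its latest size in records
    (hypothetical input, collected with one request per partition).
    """
    if not sizes:
        return []
    average = mean(sizes.values())
    if average == 0:
        return []  # every partition is empty, so nothing is ahead
    limit = average * (1 + tolerance)
    return sorted(p for p, size in sizes.items() if size > limit)


# Hypothetical sizes for a four-partition stream.
sizes = {"0": 630, "1": 610, "2": 1900, "3": 640}
print(find_skewed_partitions(sizes))  # prints ['2']
```

A relative tolerance is used here so the same check works for small and large streams; an absolute threshold would be an equally reasonable choice.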

Basic usage

Refer to the example for the partition size in records metric provided in the API docs.

The request below asks for the size, in messages, of partition 0 of stream payment-events-stream on environment dev between 2022-10-18T12:20:00Z and 2022-10-18T13:00:00Z, with a step size of 10 minutes.

Basic Request
{
  "metric": "io.axual.partition/size_records",
  "stepSize": "PT10M",
  "timeWindow": "2022-10-18T12:20:00Z/2022-10-18T13:00:00Z",
  "filter": {
    "type": "AND",
    "filters": [
      {
        "type": "FIELD",
        "field": "environment",
        "operation": "EQUALS",
        "value": "dev"
      },
      {
        "type": "FIELD",
        "field": "stream",
        "operation": "EQUALS",
        "value": "payment-events-stream"
      },
      {
        "type": "FIELD",
        "field": "partition",
        "operation": "EQUALS",
        "value": "0"
      }
    ]
  }
}
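
When the same request is needed for several partitions, the body can be built programmatically. The following is a minimal sketch in Python: the helper name build_size_records_request and the partition count of 3 are assumptions, and only the JSON body is built because the endpoint is not covered here. The field names are taken from the request above.

```python
import json


def build_size_records_request(environment, stream, partition, time_window,
                               step_size="PT10M"):
    """Build the request body for the size-in-records metric (hypothetical helper)."""

    def field_filter(field, value):
        return {"type": "FIELD", "field": field, "operation": "EQUALS", "value": value}

    return {
        "metric": "io.axual.partition/size_records",
        "stepSize": step_size,
        "timeWindow": time_window,
        "filter": {
            "type": "AND",
            "filters": [
                field_filter("environment", environment),
                field_filter("stream", stream),
                # The sample request passes the partition as a string.
                field_filter("partition", str(partition)),
            ],
        },
    }


window = "2022-10-18T12:20:00Z/2022-10-18T13:00:00Z"
# Assumption: the stream has 3 partitions, numbered 0 to 2.
bodies = [
    build_size_records_request("dev", "payment-events-stream", p, window)
    for p in range(3)
]
print(json.dumps(bodies[0], indent=2))
```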

The part of the sample response below shows the size, in messages, of partition 0 on each Kafka broker (pod).

Basic Response
{
  "type": "UNGROUPED",
  "dataPoints": [
    {
      "timestamp": "2022-10-18T12:30:00",
      "value": 0,
      "labels": {
        "axual_cluster": "jupiter",
        "partition": "0",
        "pod": "jupiter-kafka-0",
        "topic": "axual-qa-dev-payment-events-stream"
      },
      "unit": null
    },
    {
      "timestamp": "2022-10-18T12:40:00",
      "value": 0,
      "labels": {
        "axual_cluster": "jupiter",
        "partition": "0",
        "pod": "jupiter-kafka-0",
        "topic": "axual-qa-dev-payment-events-stream"
      },
      "unit": null
    },
    {
      "timestamp": "2022-10-18T12:50:00",
      "value": 270,
      "labels": {
        "axual_cluster": "jupiter",
        "partition": "0",
        "pod": "jupiter-kafka-0",
        "topic": "axual-qa-dev-payment-events-stream"
      },
      "unit": null
    },
    {
      "timestamp": "2022-10-18T13:00:00",
      "value": 630,
      "labels": {
        "axual_cluster": "jupiter",
        "partition": "0",
        "pod": "jupiter-kafka-0",
        "topic": "axual-qa-dev-payment-events-stream"
      },
      "unit": null
    },
    {
      "timestamp": "2022-10-18T12:30:00",
      "value": 0,
      "labels": {
        "axual_cluster": "jupiter",
        "partition": "0",
        "pod": "jupiter-kafka-2",
        "topic": "axual-qa-dev-payment-events-stream"
      },
      "unit": null
    },
    {
      "timestamp": "2022-10-18T12:40:00",
      "value": 0,
      "labels": {
        "axual_cluster": "jupiter",
        "partition": "0",
        "pod": "jupiter-kafka-2",
        "topic": "axual-qa-dev-payment-events-stream"
      },
      "unit": null
    },
    {
      "timestamp": "2022-10-18T12:50:00",
      "value": 270,
      "labels": {
        "axual_cluster": "jupiter",
        "partition": "0",
        "pod": "jupiter-kafka-2",
        "topic": "axual-qa-dev-payment-events-stream"
      },
      "unit": null
    },
    {
      "timestamp": "2022-10-18T13:00:00",
      "value": 630,
      "labels": {
        "axual_cluster": "jupiter",
        "partition": "0",
        "pod": "jupiter-kafka-2",
        "topic": "axual-qa-dev-payment-events-stream"
      },
      "unit": null
    }
  ]
}

This metric can be used to find out how many messages are stored in each partition, how they are distributed among partitions, and how many messages are produced per unit of time.
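
The number of messages produced per unit of time can be estimated from the difference between consecutive data points. The sketch below uses the values returned for pod jupiter-kafka-0 in the sample response. The function name records_added_per_step is an assumption, and the result is the net growth of the partition: it equals the number of produced messages only if no messages were removed from the partition during the window.

```python
from datetime import datetime


def records_added_per_step(data_points):
    """Return (timestamp, records added, records per minute) for each step.

    `data_points` must belong to a single pod and partition.
    """
    ordered = sorted(data_points, key=lambda p: p["timestamp"])
    result = []
    for prev, curr in zip(ordered, ordered[1:]):
        elapsed = (
            datetime.fromisoformat(curr["timestamp"])
            - datetime.fromisoformat(prev["timestamp"])
        ).total_seconds()
        added = curr["value"] - prev["value"]
        result.append((curr["timestamp"], added, added * 60 / elapsed))
    return result


# Data points of pod jupiter-kafka-0 from the sample response (labels omitted).
points = [
    {"timestamp": "2022-10-18T12:30:00", "value": 0},
    {"timestamp": "2022-10-18T12:40:00", "value": 0},
    {"timestamp": "2022-10-18T12:50:00", "value": 270},
    {"timestamp": "2022-10-18T13:00:00", "value": 630},
]
for timestamp, added, per_minute in records_added_per_step(points):
    print(timestamp, added, per_minute)
```

For the sample data this gives 0, 270 and 360 messages added in the three 10-minute steps, that is 0, 27 and 36 messages per minute.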

Advanced usage

Using aggregator

Adding an aggregator to the request aggregates the size of the partition over all Kafka broker replicas.

For instance, the max or avg aggregation function returns the total size of the partition in messages, with each message counted once rather than once per replica.

Request using max aggregator
{
  "metric": "io.axual.partition/size_records",
  "stepSize": "PT10M",
  "timeWindow": "2022-10-18T12:20:00Z/2022-10-18T13:00:00Z",
  "aggregator": "max",
  "filter": {
    "type": "AND",
    "filters": [
      {
        "type": "FIELD",
        "field": "environment",
        "operation": "EQUALS",
        "value": "dev"
      },
      {
        "type": "FIELD",
        "field": "stream",
        "operation": "EQUALS",
        "value": "payment-events-stream"
      },
      {
        "type": "FIELD",
        "field": "partition",
        "operation": "EQUALS",
        "value": "0"
      }
    ]
  }
}

The response below shows the size of the partition in messages on the Kafka cluster, with replicas not counted separately.

Response using max aggregator
{
  "type": "UNGROUPED",
  "dataPoints": [
    {
      "timestamp": "2022-10-18T12:30:00",
      "value": 0,
      "labels": {},
      "unit": null
    },
    {
      "timestamp": "2022-10-18T12:40:00",
      "value": 0,
      "labels": {},
      "unit": null
    },
    {
      "timestamp": "2022-10-18T12:50:00",
      "value": 270,
      "labels": {},
      "unit": null
    },
    {
      "timestamp": "2022-10-18T13:00:00",
      "value": 630,
      "labels": {},
      "unit": null
    }
  ]
}
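
To illustrate what the max aggregator does, the same reduction can be reproduced on the client from the ungrouped data points of the basic response. This is a sketch under assumptions: the function name max_over_replicas is hypothetical, and the data points are rebuilt from the sample values with only the pod label kept.

```python
from collections import defaultdict


def max_over_replicas(data_points):
    """Collapse per-pod data points into one value per timestamp using max."""
    by_timestamp = defaultdict(list)
    for point in data_points:
        by_timestamp[point["timestamp"]].append(point["value"])
    return {ts: max(values) for ts, values in sorted(by_timestamp.items())}


# Sample values from the basic response: both pods report the same sizes.
timestamps = [
    "2022-10-18T12:30:00",
    "2022-10-18T12:40:00",
    "2022-10-18T12:50:00",
    "2022-10-18T13:00:00",
]
values = [0, 0, 270, 630]
data_points = [
    {"timestamp": ts, "value": value, "labels": {"pod": pod}}
    for pod in ("jupiter-kafka-0", "jupiter-kafka-2")
    for ts, value in zip(timestamps, values)
]
print(max_over_replicas(data_points))
```

The result holds 0, 0, 270 and 630 for the four timestamps, matching the aggregated response above. Summing instead would count every replica and give 1260 for the last timestamp.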

Using groupBy

To get the response grouped by a label, use groupBy.

Request using groupBy
{
  "metric": "io.axual.partition/size_records",
  "stepSize": "PT10M",
  "timeWindow": "2022-10-18T12:20:00Z/2022-10-18T13:00:00Z",
  "groupBy": [
    "pod"
  ],
  "filter": {
    "type": "AND",
    "filters": [
      {
        "type": "FIELD",
        "field": "environment",
        "operation": "EQUALS",
        "value": "dev"
      },
      {
        "type": "FIELD",
        "field": "stream",
        "operation": "EQUALS",
        "value": "payment-events-stream"
      },
      {
        "type": "FIELD",
        "field": "partition",
        "operation": "EQUALS",
        "value": "0"
      }
    ]
  }
}

The response below shows the size in messages of the partition on each Kafka broker (pod), with data grouped by pod.

Response using groupBy
{
  "type": "GROUPED",
  "groups": [
    {
      "labels": {
        "pod": "jupiter-kafka-0"
      },
      "dataPoints": [
        {
          "timestamp": "2022-10-18T12:30:00",
          "value": 0,
          "labels": {
            "axual_cluster": "jupiter",
            "partition": "0",
            "pod": "jupiter-kafka-0",
            "topic": "axual-qa-dev-payment-events-stream"
          },
          "unit": null
        },
        {
          "timestamp": "2022-10-18T12:40:00",
          "value": 0,
          "labels": {
            "axual_cluster": "jupiter",
            "partition": "0",
            "pod": "jupiter-kafka-0",
            "topic": "axual-qa-dev-payment-events-stream"
          },
          "unit": null
        },
        {
          "timestamp": "2022-10-18T12:50:00",
          "value": 270,
          "labels": {
            "axual_cluster": "jupiter",
            "partition": "0",
            "pod": "jupiter-kafka-0",
            "topic": "axual-qa-dev-payment-events-stream"
          },
          "unit": null
        },
        {
          "timestamp": "2022-10-18T13:00:00",
          "value": 630,
          "labels": {
            "axual_cluster": "jupiter",
            "partition": "0",
            "pod": "jupiter-kafka-0",
            "topic": "axual-qa-dev-payment-events-stream"
          },
          "unit": null
        }
      ]
    },
    {
      "labels": {
        "pod": "jupiter-kafka-2"
      },
      "dataPoints": [
        {
          "timestamp": "2022-10-18T12:30:00",
          "value": 0,
          "labels": {
            "axual_cluster": "jupiter",
            "partition": "0",
            "pod": "jupiter-kafka-2",
            "topic": "axual-qa-dev-payment-events-stream"
          },
          "unit": null
        },
        {
          "timestamp": "2022-10-18T12:40:00",
          "value": 0,
          "labels": {
            "axual_cluster": "jupiter",
            "partition": "0",
            "pod": "jupiter-kafka-2",
            "topic": "axual-qa-dev-payment-events-stream"
          },
          "unit": null
        },
        {
          "timestamp": "2022-10-18T12:50:00",
          "value": 270,
          "labels": {
            "axual_cluster": "jupiter",
            "partition": "0",
            "pod": "jupiter-kafka-2",
            "topic": "axual-qa-dev-payment-events-stream"
          },
          "unit": null
        },
        {
          "timestamp": "2022-10-18T13:00:00",
          "value": 630,
          "labels": {
            "axual_cluster": "jupiter",
            "partition": "0",
            "pod": "jupiter-kafka-2",
            "topic": "axual-qa-dev-payment-events-stream"
          },
          "unit": null
        }
      ]
    }
  ]
}
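
A grouped response can be turned into one series per group with a few lines of code. The sketch below assumes the response has already been parsed into a Python dictionary; the function name series_per_group is hypothetical, and the sample is shortened to two data points per pod.

```python
def series_per_group(grouped_response, label="pod"):
    """Map each group's label value to its list of (timestamp, value) pairs."""
    return {
        group["labels"][label]: [
            (point["timestamp"], point["value"]) for point in group["dataPoints"]
        ]
        for group in grouped_response["groups"]
    }


# Shortened version of the grouped response (last two data points per pod).
response = {
    "type": "GROUPED",
    "groups": [
        {
            "labels": {"pod": pod},
            "dataPoints": [
                {"timestamp": "2022-10-18T12:50:00", "value": 270},
                {"timestamp": "2022-10-18T13:00:00", "value": 630},
            ],
        }
        for pod in ("jupiter-kafka-0", "jupiter-kafka-2")
    ],
}
for pod, series in series_per_group(response).items():
    print(pod, series)
```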