Version: 12.10.0

Strimzi Operator based Kafka

Apache Kafka is a distributed streaming platform that enables you to build real-time data pipelines and streaming applications. Strimzi provides a way to run an Apache Kafka cluster on Kubernetes in various deployment configurations. It simplifies the process of running Kafka on Kubernetes by using the Operator pattern.

Strimzi provides container images and operators for running Kafka on Kubernetes. Strimzi operators are purpose-built with specialist operational knowledge to effectively manage Kafka on Kubernetes.

Advantages of Strimzi Operator-Based Kafka

Simplified Kafka Management: Strimzi Operators extend Kubernetes to automate the management, installation, upgrades, and configuration of Kafka clusters.

Kubernetes Native: Strimzi is built to be Kubernetes native, which means it leverages Kubernetes features and follows best practices.

Scalability: Easily scale Kafka clusters up or down to meet demand without manual intervention.

Fault Tolerance: Strimzi ensures high availability of Kafka clusters, handling failures gracefully.

Automated Updates: Operators can automate the Kafka software update process, reducing the risk of human error.

Custom Resource Definitions (CRDs): Define Kafka clusters, topics, users, and more as code, which can be versioned and managed like any other Kubernetes object.

Security: Strimzi supports Kafka's native security features, such as TLS encryption, authentication, and authorization.

Monitoring and Metrics: Integration with Prometheus for monitoring Kafka clusters and exporting metrics for analysis.

Why Use Strimzi Operator-Based Kafka?

Ease of Use: Strimzi makes it easier to deploy and manage Kafka clusters on Kubernetes, even for users with limited Kafka expertise.

Consistency: Define the desired state of your Kafka clusters, and the Strimzi Operator ensures that the actual state matches the desired state.

DevOps Friendly: Fits well into a DevOps model, with support for infrastructure as code and automated CI/CD pipelines.

Community Support: Strimzi is an open-source project with a growing community that contributes to its development and provides support.

Operators simplify the process of:

  • Deploying and running Kafka clusters

  • Deploying and running Kafka components

  • Configuring access to Kafka

  • Securing access to Kafka

  • Upgrading Kafka

  • Managing brokers

  • Creating and managing topics

  • Creating and managing users

Strimzi deployment of Kafka

Apache Kafka components are provided for deployment to Kubernetes with the Strimzi distribution. The Kafka components are generally run as clusters for availability.

A typical deployment incorporating Kafka components might include:

  • Kafka cluster of broker nodes

  • ZooKeeper cluster of replicated ZooKeeper instances

  • Kafka Connect cluster for external data connections

  • Kafka MirrorMaker cluster to mirror the Kafka cluster in a secondary cluster

  • Kafka Exporter to extract additional Kafka metrics data for monitoring

  • Kafka Bridge to make HTTP-based requests to the Kafka cluster

  • Cruise Control to rebalance topic partitions across broker nodes

Not all of these components are mandatory. As a minimum you need Kafka, plus ZooKeeper unless the cluster runs in KRaft mode.

Kafka component architecture

A Kafka cluster comprises the brokers responsible for message delivery.

ZooKeeper is used for cluster management. When deploying Kafka in KRaft (Kafka Raft metadata) mode, cluster management is simplified by integrating broker and controller roles within Kafka nodes, eliminating the need for ZooKeeper. Kafka nodes take on the roles of brokers, controllers, or both. Roles are configured in Strimzi using node pools.
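As a rough illustration of how node pools assign roles, the sketch below builds KafkaNodePool manifests as Python dictionaries and prints one as JSON. The cluster name, pool names, replica counts, and ephemeral storage are assumptions made for the example, and the field names follow the `kafka.strimzi.io/v1beta2` schema as commonly documented, so check them against the Strimzi version you run.

```python
import json

CLUSTER_NAME = "my-cluster"  # hypothetical cluster name


def node_pool(name, roles, replicas):
    """Build a KafkaNodePool manifest as a plain dict (illustrative only)."""
    return {
        "apiVersion": "kafka.strimzi.io/v1beta2",
        "kind": "KafkaNodePool",
        "metadata": {
            "name": name,
            # The label ties the pool to the Kafka resource of the same name.
            "labels": {"strimzi.io/cluster": CLUSTER_NAME},
        },
        "spec": {
            "replicas": replicas,
            "roles": roles,  # "broker", "controller", or both
            "storage": {"type": "ephemeral"},  # assumption: no persistence
        },
    }


# Separate pools for each role, or a single pool whose nodes take both roles.
controllers = node_pool("controllers", ["controller"], 3)
brokers = node_pool("brokers", ["broker"], 3)
dual_role = node_pool("dual-role", ["controller", "broker"], 3)

print(json.dumps(dual_role, indent=2))
```

Splitting controllers and brokers into separate pools lets each role be scaled independently, while a dual-role pool keeps a small cluster compact.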

Each of the other Kafka components interacts with the Kafka cluster to perform specific roles.

Kafka component interaction

[Figure: Kafka component interaction]

Kafka brokers and topics

[Figure: Kafka brokers and topics]

Broker

A broker orchestrates the storage and passing of messages.

Topic

A topic provides a destination for the storage of data. Each topic is split into one or more partitions.

Cluster

A group of broker instances.

Partition

The number of topic partitions is defined by a topic partition count.

Partition leader

A partition leader handles all producer requests for its partition.

Partition follower

A partition follower replicates the partition data of a partition leader, optionally handling consumer requests.

Topics use a replication factor to configure the number of replicas of each partition within the cluster. A topic comprises at least one partition.

An in-sync replica has the same number of messages as the leader. Configuration defines how many replicas must be in-sync to be able to produce messages, ensuring that a message is committed only after it has been successfully copied to the replica partition. In this way, if the leader fails the message is not lost.
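The in-sync requirement can be pictured with a toy model. The sketch below is not Kafka code. It assumes a producer that waits for all in-sync replicas (the behaviour associated with `acks=all`) and a minimum in-sync setting in the spirit of `min.insync.replicas`. The class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Toy model of one replicated partition (not real Kafka code)."""

    replication_factor: int
    min_insync_replicas: int
    in_sync: int  # replicas currently in sync, leader included
    log: list = field(default_factory=list)

    def produce(self, message):
        """Commit the message only if enough replicas are in sync."""
        if self.in_sync < self.min_insync_replicas:
            raise RuntimeError("not enough in-sync replicas to commit")
        self.log.append(message)
        return len(self.log) - 1  # offset of the committed message


partition = Partition(replication_factor=3, min_insync_replicas=2, in_sync=3)
offset = partition.produce("order-created")  # accepted: 3 >= 2

partition.in_sync = 1  # two followers have fallen behind
try:
    partition.produce("order-paid")
except RuntimeError as err:
    print("rejected:", err)
```

With a replication factor of 3 and a minimum of 2 in-sync replicas, the partition tolerates the loss of one replica while still accepting writes.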

In the Kafka brokers and topics diagram, we can see each numbered partition has a leader and two followers in replicated topics.

Producers and consumers

[Figure: Producers and consumers]

Producer

A producer sends messages to a broker topic to be written to the end offset of a partition. Messages are written to partitions by a producer on a round robin basis, or to a specific partition based on the message key.
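A minimal sketch of the two partitioning modes described above follows. It uses `zlib.crc32` as a stand-in hash purely for illustration. Real Kafka clients use their own hashing, and some batch keyless records rather than strictly alternating, so treat this as a model of the idea and not the client's algorithm. The class name is hypothetical.

```python
import itertools
import zlib


class ToyPartitioner:
    """Illustrative partition chooser: round robin without a key, hash with one."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition_for(self, key=None):
        if key is None:
            # No key: spread messages evenly across partitions.
            return next(self._round_robin)
        # Keyed: the same key always maps to the same partition.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = ToyPartitioner(num_partitions=3)
keyless = [partitioner.partition_for() for _ in range(4)]  # cycles 0, 1, 2, 0
keyed = partitioner.partition_for("customer-42")
```

Because a key always resolves to the same partition, all messages for that key keep their relative order within the partition.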

Consumer

A consumer subscribes to a topic and reads messages according to topic, partition and offset.

Consumer group

Consumer groups are used to share a typically large data stream generated by multiple producers from a given topic. Consumers are grouped using a group.id, allowing messages to be spread across the members. Consumers within a group do not read data from the same partition, but can receive data from one or more partitions.
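The rule that no two members of a group read the same partition can be sketched with a simple round-robin assignment. This is a simplified stand-in for the assignment strategies Kafka actually ships, and the function and consumer names are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer in the group."""
    if not consumers:
        raise ValueError("a consumer group needs at least one member")
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        owner = members[index % len(members)]
        assignment[owner].append(partition)
    return assignment


# Five partitions shared by two consumers in the same group.
print(assign_partitions(range(5), ["c1", "c2"]))
# → {'c1': [0, 2, 4], 'c2': [1, 3]}

# More consumers than partitions: the extra member stays idle.
print(assign_partitions([0, 1], ["c1", "c2", "c3"]))
# → {'c1': [0], 'c2': [1], 'c3': []}
```

The second call shows why adding consumers beyond the partition count does not increase parallelism.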

Offsets

Offsets describe the position of messages within a partition. Each message in a given partition has a unique offset, which helps identify the position of a consumer within the partition to track the number of records that have been consumed.

Committed offsets are written to an offset commit log. A __consumer_offsets topic stores information on committed offsets, including the position of the last and next offset, for each consumer group.
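To make the bookkeeping concrete, here is a toy in-memory model of committed offsets keyed by consumer group, topic, and partition. It follows the Kafka convention that the committed value is the offset of the next message to read. The class name is hypothetical, and starting from offset 0 when nothing is committed is an assumption standing in for an "earliest" reset policy.

```python
class OffsetStore:
    """Toy stand-in for the committed-offset log (not real Kafka code)."""

    def __init__(self):
        self._committed = {}

    def commit(self, group_id, topic, partition, last_consumed_offset):
        # Store the position of the next message to read.
        key = (group_id, topic, partition)
        self._committed[key] = last_consumed_offset + 1

    def next_offset(self, group_id, topic, partition):
        # Assumption: a group with no commit starts from the beginning.
        return self._committed.get((group_id, topic, partition), 0)


store = OffsetStore()
store.commit("billing", "orders", 0, last_consumed_offset=41)

resume_at = store.next_offset("billing", "orders", 0)    # 42
other_group = store.next_offset("audit", "orders", 0)    # 0, tracked separately
```

Each group tracks its own position, so two groups can read the same partition at different speeds without affecting each other.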

Strimzi Operator

Operators are a method of packaging, deploying, and managing Kubernetes applications. They provide a way to extend the Kubernetes API and simplify the administration tasks associated with specific applications.

Strimzi operators support tasks related to a Kafka deployment. Strimzi custom resources provide the deployment configuration. This includes configuration for Kafka clusters, topics, users, and other components. Leveraging custom resource configuration, Strimzi operators create, configure, and manage Kafka components within a Kubernetes environment. Using operators reduces the need for manual intervention and streamlines the process of managing Kafka in a Kubernetes cluster.
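As an illustration of configuration held in custom resources, the sketch below builds a KafkaTopic and a KafkaUser manifest as Python dictionaries and prints the topic as JSON. The resource names, cluster name, and configuration values are assumptions chosen for the example, and the fields follow the `kafka.strimzi.io/v1beta2` schema as commonly documented, so verify them against your Strimzi version.

```python
import json

CLUSTER_NAME = "my-cluster"  # hypothetical cluster name

kafka_topic = {
    "apiVersion": "kafka.strimzi.io/v1beta2",
    "kind": "KafkaTopic",
    "metadata": {
        "name": "orders",  # hypothetical topic name
        "labels": {"strimzi.io/cluster": CLUSTER_NAME},
    },
    "spec": {
        "partitions": 3,
        "replicas": 3,  # replication factor
        "config": {"min.insync.replicas": 2},
    },
}

kafka_user = {
    "apiVersion": "kafka.strimzi.io/v1beta2",
    "kind": "KafkaUser",
    "metadata": {
        "name": "orders-service",  # hypothetical user name
        "labels": {"strimzi.io/cluster": CLUSTER_NAME},
    },
    "spec": {
        # Assumption: the user authenticates with a TLS client certificate.
        "authentication": {"type": "tls"},
    },
}

print(json.dumps(kafka_topic, indent=2))
```

Because the manifests are plain data, they can be kept in version control and applied through the same tooling as any other Kubernetes object.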

Strimzi provides the following operators for managing a Kafka cluster running within a Kubernetes cluster.

Cluster Operator

Deploys and manages Apache Kafka clusters, Kafka Connect, Kafka MirrorMaker, Kafka Bridge, Kafka Exporter, Cruise Control, and the Entity Operator

Entity Operator

Comprises the Topic Operator and User Operator

Topic Operator

Manages Kafka topics

User Operator

Manages Kafka users

The Cluster Operator can deploy the Topic Operator and User Operator as part of an Entity Operator configuration at the same time as a Kafka cluster.
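A hedged sketch of what such a configuration might look like follows: a fragment of a Kafka resource, built as a Python dictionary, in which the `entityOperator` section enables both operators. The cluster name and listener settings are assumptions, and the fragment is deliberately incomplete, since the broker replica and storage settings (or node pools) that a real deployment needs are omitted.

```python
import json

kafka_cluster = {
    "apiVersion": "kafka.strimzi.io/v1beta2",
    "kind": "Kafka",
    "metadata": {"name": "my-cluster"},  # hypothetical cluster name
    "spec": {
        "kafka": {
            # Assumption: one plain internal listener, for illustration only.
            "listeners": [
                {"name": "plain", "port": 9092, "type": "internal", "tls": False}
            ],
        },
        # An empty section requests the operator with default settings.
        "entityOperator": {
            "topicOperator": {},
            "userOperator": {},
        },
    },
}


def enabled_entity_operators(kafka_resource):
    """List which entity operators the resource asks for."""
    section = kafka_resource["spec"].get("entityOperator", {})
    return sorted(name for name in ("topicOperator", "userOperator") if name in section)


print(enabled_entity_operators(kafka_cluster))
# → ['topicOperator', 'userOperator']
print(json.dumps(kafka_cluster["spec"]["entityOperator"]))
```

Leaving out either key from `entityOperator` would request only the remaining operator.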
