Day 2 - Kafka Brokers, Clusters, and Retention
Learning about Kafka brokers, cluster architecture, leaders and followers, and message retention.
Continuing from where I left off yesterday, diving deeper into Chapter 1 of Kafka: The Definitive Guide.
Brokers
- A single instance of a Kafka server is called a broker.
- The broker handles:
  - Receiving messages from producers
  - Assigning offsets to messages
  - Writing messages to disk
  - Responding to consumer requests for partitions
- Each broker manages multiple partitions.
- Kafka brokers are designed to operate as part of a cluster.
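To make the broker's responsibilities concrete, here is a minimal sketch in Python. It is a toy in-memory model, not how Kafka is actually implemented: `ToyBroker`, `produce`, and `fetch` are names I made up for illustration, and a plain list stands in for the on-disk log.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker (hypothetical, for illustration only)."""

    def __init__(self):
        # (topic, partition) -> list of messages; the list index is the offset
        self.logs = defaultdict(list)

    def produce(self, topic, partition, message):
        """Append a message and return the offset assigned to it."""
        log = self.logs[(topic, partition)]
        offset = len(log)      # the next offset is the current log length
        log.append(message)    # a real broker would write this to disk
        return offset

    def fetch(self, topic, partition, offset, max_messages=10):
        """Return (offset, message) pairs starting at the requested offset."""
        log = self.logs[(topic, partition)]
        return list(enumerate(log[offset:offset + max_messages], start=offset))


broker = ToyBroker()
first = broker.produce("orders", 0, b"created")
second = broker.produce("orders", 0, b"paid")
records = broker.fetch("orders", 0, offset=1)
print(first, second, records)
```

The point of the sketch is that offsets are assigned by the broker on append, per partition, and that consumers ask for data by offset rather than having it pushed to them.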
Cluster
- A cluster is a group of Kafka brokers working together.
- One broker in the cluster acts as the controller.
- The controller is automatically elected from live brokers.
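- Responsibilities include:
  - Assigning partitions to brokers
  - Monitoring for broker failures
- Before KRaft (ZooKeeper mode, removed in Kafka 4.0):
  - Apache ZooKeeper was used to manage metadata and broker coordination.
- Now (in KRaft mode, production-ready since Kafka 3.3 and the only mode from 4.0 onward):
  - Kafka manages its own metadata through a quorum of controller nodes, with no ZooKeeper dependency.
  - This is done using Raft, a consensus algorithm for distributed state replication.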
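The controller's two jobs listed above (being elected from live brokers, and assigning partitions) can be sketched roughly as follows. This is a simplification under loud assumptions: picking the lowest broker id is only a placeholder for the real election (which goes through ZooKeeper or the Raft quorum), the round-robin placement ignores racks and existing load, and `elect_controller` and `assign_partitions` are hypothetical helper names, not Kafka APIs.

```python
def elect_controller(live_brokers):
    """Pick one live broker as controller (lowest id is a placeholder rule)."""
    return min(live_brokers)


def assign_partitions(num_partitions, live_brokers, replication_factor):
    """Spread partition replicas over the live brokers round-robin."""
    brokers = sorted(live_brokers)
    if replication_factor > len(brokers):
        raise ValueError("replication factor exceeds number of live brokers")
    assignment = {}
    for partition in range(num_partitions):
        # the first broker in each replica list is treated as the leader
        assignment[partition] = [
            brokers[(partition + r) % len(brokers)]
            for r in range(replication_factor)
        ]
    return assignment


live = {1, 2, 3}
controller = elect_controller(live)
assignment = assign_partitions(num_partitions=4, live_brokers=live, replication_factor=2)

# if the controller fails, another live broker takes over the role
new_controller = elect_controller(live - {controller})
print(controller, assignment, new_controller)
```

Whatever the real election mechanism is, the property that matters for these notes is the same: exactly one controller at a time, chosen from brokers that are still alive.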
Leaders and Followers
- Each partition has a leader: a single broker that owns that partition.
- Additional brokers are assigned as followers of that partition and replicate its data.
- Replication enables:
  - Redundancy of data
  - Failover support: a follower can take over as leader if the current leader fails
- Producers:
  - Must send data to the leader of the partition
- Consumers:
  - Can read from either the leader or a follower (by default they fetch from the leader; reading from followers has to be enabled)
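Here is a small sketch of the leader/follower relationship and what failover looks like for one partition. It is a toy model: `PartitionReplicas` and its methods are hypothetical names, and it assumes every follower is fully caught up, whereas a real cluster only promotes replicas that are in sync.

```python
class PartitionReplicas:
    """Tracks which broker leads one partition (hypothetical toy model)."""

    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = list(followers)

    def produce_target(self):
        """Producers always write to the current leader."""
        return self.leader

    def handle_broker_failure(self, broker_id):
        """Remove a failed broker, promoting a follower if it was the leader."""
        if broker_id in self.followers:
            self.followers.remove(broker_id)
        elif broker_id == self.leader:
            if not self.followers:
                raise RuntimeError("no follower available to take over")
            # promote the first remaining follower (assumed to be in sync)
            self.leader = self.followers.pop(0)


partition = PartitionReplicas(leader=1, followers=[2, 3])
before = partition.produce_target()
partition.handle_broker_failure(1)   # the leader goes down
after = partition.produce_target()
print(before, after, partition.followers)
```

After the failure, producers simply start writing to the new leader, which is why clients need up-to-date metadata about who leads each partition.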
Data Retention
- A key feature of Kafka is retention: the durable storage of messages for a period of time.
- Brokers are configured with default retention policies:
  - Retention based on time (e.g., 7 days)
  - Retention based on size (e.g., 1 GB per partition)
- Once retention limits are hit:
  - The oldest messages are expired and deleted
- Individual topics can override the broker-level retention settings.
- Special case: Log Compaction
  - Kafka retains only the latest message for each unique key; older messages with the same key are removed.
  - Useful for changelog-style data, where only the most recent update per key matters.
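The retention rules above can be sketched in a few lines of Python. The config keys are the real Kafka names as far as I know (`log.retention.ms` and `log.retention.bytes` on the broker, `retention.ms` and `retention.bytes` on a topic), but everything else is an assumption for illustration: the helper functions are hypothetical, and this works record by record, while a real broker deletes whole log segments.

```python
DAY_MS = 24 * 60 * 60 * 1000


def effective_retention(broker_defaults, topic_overrides):
    """Topic-level settings win over broker-level defaults."""
    return {
        "retention_ms": topic_overrides.get(
            "retention.ms", broker_defaults["log.retention.ms"]),
        "retention_bytes": topic_overrides.get(
            "retention.bytes", broker_defaults["log.retention.bytes"]),
    }


def apply_retention(log, now_ms, retention_ms, retention_bytes):
    """Drop records that are too old, then the oldest ones over the size limit.

    log is a list of (timestamp_ms, key, value) tuples, oldest first.
    A retention_bytes of -1 means no size limit.
    """
    kept = [rec for rec in log if now_ms - rec[0] <= retention_ms]
    if retention_bytes >= 0:
        while kept and sum(len(rec[2]) for rec in kept) > retention_bytes:
            kept.pop(0)
    return kept


def compact(log):
    """Keep only the most recent record for each key, preserving log order."""
    latest_index = {}
    for index, record in enumerate(log):
        latest_index[record[1]] = index
    return [rec for i, rec in enumerate(log) if latest_index[rec[1]] == i]


broker_defaults = {"log.retention.ms": 7 * DAY_MS, "log.retention.bytes": -1}
topic_overrides = {"retention.ms": 1 * DAY_MS}   # this topic keeps only 1 day
limits = effective_retention(broker_defaults, topic_overrides)

now = 10 * DAY_MS
log = [
    (now - 3 * DAY_MS, "user-1", b"a"),
    (now - 2 * DAY_MS, "user-2", b"b"),
    (now - 1000, "user-1", b"c"),
    (now - 500, "user-2", b"d"),
    (now - 100, "user-1", b"e"),
]
retained = apply_retention(log, now, limits["retention_ms"], limits["retention_bytes"])
compacted = compact(log)
print([rec[2] for rec in retained], [rec[2] for rec in compacted])
```

Time and size retention throw data away based on age or volume regardless of content, while compaction keeps one record per key no matter how old it is. That difference is what makes compaction a fit for changelog topics.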