Day 2 - Kafka Brokers, Clusters, and Retention

Learning about Kafka brokers, cluster architecture, leaders and followers, and message retention.

Author: Shashank Hosahalli Shivamurthy

Published: July 8, 2025

Continuing from where I left off yesterday, I am diving deeper into Chapter 1 of Kafka: The Definitive Guide.

Brokers

  • A single instance of a Kafka server is called a broker.
  • The broker handles:
    • Receiving messages from producers
    • Assigning offsets to messages
    • Writing messages to disk
    • Responding to consumer requests for partitions
  • Each broker manages multiple partitions.
  • Kafka brokers are designed to operate as part of a cluster.
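
The broker responsibilities above can be sketched with a small in-memory model. This is only an illustration: `ToyBroker` and its methods are hypothetical names, and a real broker writes to log files on disk rather than Python lists.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker; not Kafka's real storage layer."""

    def __init__(self):
        # One append-only list per (topic, partition) pair.
        self.logs = defaultdict(list)

    def append(self, topic, partition, message):
        """Receive a message, assign the next offset, and store it."""
        log = self.logs[(topic, partition)]
        offset = len(log)  # offsets increase by one within a partition
        log.append(message)
        return offset

    def fetch(self, topic, partition, offset, max_messages=10):
        """Answer a consumer request: (offset, message) pairs from `offset` on."""
        log = self.logs[(topic, partition)]
        return list(enumerate(log))[offset:offset + max_messages]


broker = ToyBroker()
broker.append("orders", 0, "created")
broker.append("orders", 0, "paid")
broker.append("orders", 1, "created")
print(broker.fetch("orders", 0, offset=1))  # [(1, 'paid')]
```

Note that offsets are tracked per partition, so partition 1 of "orders" starts again at offset 0.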

Cluster

  • A cluster is a group of Kafka brokers working together.
  • One broker in the cluster acts as the controller.
    • The controller is automatically elected from live brokers.
    • Responsibilities include:
      • Assigning partitions to brokers
      • Monitoring broker failures
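  • Before Kafka 4.0:
    • Apache ZooKeeper could be used to manage metadata and broker coordination (in older releases it was the only option).
  • Now (in KRaft mode, production-ready since Kafka 3.3 and the only mode from 4.0):
    • Kafka manages its own metadata through a quorum of controller nodes
    • This is done using a Raft-based protocol; Raft is a consensus algorithm for distributed state replication.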
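
To make "assigning partitions to brokers" concrete, here is a minimal sketch of a round-robin placement. The function `assign_partitions` and the idea of a fixed replica count per partition are assumptions for illustration; Kafka's actual assignment logic is more involved (for example, it can take racks into account).

```python
def assign_partitions(broker_ids, num_partitions, replication_factor):
    """Round-robin replica placement; the first broker listed acts as leader."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed broker count")
    brokers = sorted(broker_ids)
    assignment = {}
    for partition in range(num_partitions):
        # Start at a different broker for each partition to spread the load.
        assignment[partition] = [
            brokers[(partition + i) % len(brokers)]
            for i in range(replication_factor)
        ]
    return assignment


print(assign_partitions([1, 2, 3], num_partitions=4, replication_factor=2))
# {0: [1, 2], 1: [2, 3], 2: [3, 1], 3: [1, 2]}
```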

Leaders and Followers

  • Each partition has a leader: the single broker that owns the partition.
  • When a partition is replicated, additional brokers are assigned as followers and copy its data from the leader.
  • Replication enables:
    • Redundancy of data
    • Failover support — followers can take over as leader if needed
  • Producers:
    • Must send data only to the leader of the partition
  • Consumers:
    • Read from the leader by default; since Kafka 2.4 they can also fetch from a follower when the cluster is configured for it
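
The failover idea can be sketched as follows. This is a simplified model, not Kafka's election code: `elect_leaders` is a hypothetical helper, and it treats every live replica as eligible, whereas Kafka normally promotes only replicas that are in sync with the old leader.

```python
def elect_leaders(replicas, leaders, live_brokers):
    """Keep each live leader; otherwise promote the first live replica."""
    new_leaders = {}
    for partition, replica_list in replicas.items():
        current = leaders.get(partition)
        if current in live_brokers:
            new_leaders[partition] = current
            continue
        candidates = [b for b in replica_list if b in live_brokers]
        # None marks a partition with no live replica (offline).
        new_leaders[partition] = candidates[0] if candidates else None
    return new_leaders


replicas = {0: [1, 2], 1: [2, 3], 2: [3, 1]}
leaders = {0: 1, 1: 2, 2: 3}
# Broker 2 fails, so partition 1 needs a new leader.
print(elect_leaders(replicas, leaders, live_brokers={1, 3}))
# {0: 1, 1: 3, 2: 3}
```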

Data Retention

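  • A key feature of Kafka is retention: the durable storage of messages for some period of time.
  • Brokers are configured with default retention policies:
    • Retention based on time (e.g., 7 days, which is the default for log.retention.hours)
    • Retention based on size (e.g., 1 GB per partition, set with log.retention.bytes; unlimited by default)
  • Once either limit is reached:
    • The oldest messages are expired and deleted (Kafka removes whole log segments, not individual messages)
  • Individual topics can override the broker-level retention settings (retention.ms and retention.bytes).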
  • Special case: Log Compaction
    • Kafka retains at least the latest message for each unique key and may discard older messages with the same key.
    • Useful for changelog or stateful updates.
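
The two cleanup styles can be compared with a small sketch. Both functions are hypothetical and work on single messages in memory; in Kafka itself, retention removes whole log segments and compaction runs in the background on older segments.

```python
def apply_retention(log, now, retention_seconds=None, retention_bytes=None):
    """Drop the oldest messages that exceed the time or size limit.

    `log` is a list of (timestamp, key, value) tuples, oldest first.
    """
    kept = list(log)
    if retention_seconds is not None:
        kept = [m for m in kept if now - m[0] <= retention_seconds]
    if retention_bytes is not None:
        # Remove from the old end until the remaining values fit the budget.
        while kept and sum(len(m[2]) for m in kept) > retention_bytes:
            kept.pop(0)
    return kept


def compact(log):
    """Keep only the newest message per key, preserving log order."""
    latest = {}
    for index, (_, key, _) in enumerate(log):
        latest[key] = index  # later entries overwrite earlier ones
    return [m for i, m in enumerate(log) if latest[m[1]] == i]


log = [
    (100, "user-1", b"alice"),
    (200, "user-2", b"bob"),
    (300, "user-1", b"alicia"),
    (350, "user-2", b"bobby"),
]
print(apply_retention(log, now=400, retention_seconds=250))
# [(200, 'user-2', b'bob'), (300, 'user-1', b'alicia'), (350, 'user-2', b'bobby')]
print(compact(log))
# [(300, 'user-1', b'alicia'), (350, 'user-2', b'bobby')]
```

Time-based retention drops a message because of its age regardless of key, while compaction keeps the newest value per key regardless of age, which is why it suits changelog-style topics.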