CONCEPT Cited by 1 source
Kafka partition¶
Definition¶
A Kafka partition is the unit of parallelism, ordering, and replication in Apache Kafka. Every topic is split into N partitions, and each partition is an ordered, append-only log (concepts/distributed-log). Each partition is stored as R replicas, where R is the replication factor, placed on different brokers. Records within a partition are ordered by a monotonically increasing offset; records in different partitions have no defined order relative to each other.
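The per-partition offset behaviour can be illustrated with a toy in-memory model. This is only a sketch: `PartitionLog` and the two-partition `topic` list are hypothetical names invented here, not part of any Kafka client API, and a real partition lives in segment files on a broker rather than in a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy stand-in for one partition: an append-only list of records."""
    records: list = field(default_factory=list)

    def append(self, value):
        # The offset is the record's position in this log, fixed at append time.
        offset = len(self.records)
        self.records.append(value)
        return offset

    def read(self, offset):
        # Records are addressed by offset within this one partition.
        return self.records[offset]


# Hypothetical topic with N = 2 partitions; each keeps its own offset sequence.
topic = [PartitionLog(), PartitionLog()]
first = topic[0].append("a")
second = topic[0].append("b")
other = topic[1].append("c")
```

Offsets grow independently in each partition, so an offset alone says nothing about how a record in partition 0 relates in time to a record in partition 1.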
Physical layout — replica is the storage unit¶
Kozlovski's Kafka-101 framing:
"A simple analogy is that just how the basic storage unit in an operating system is a file, the basic storage unit in Kafka is a replica (of a partition). Each replica is nothing more than a few files itself, each of which embody the log data structure and sequentially form a larger log. Each record in the log is denoted by a specific offset, which is simply a monotonically-increasing number." (Source: sources/2024-05-09-highscalability-kafka-101)
So for a topic with partition count P and replication factor R,
Kafka stores P × R replicas across the broker fleet. Each replica
is a sequence of segment files on one broker's disk, and each
record is addressed by (topic, partition, offset).
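The arithmetic and the addressing scheme can be sketched as follows. `RecordAddress`, `replica_count`, and `place_replicas` are hypothetical helpers, and the round-robin placement is an illustrative assumption, not Kafka's actual replica assignment algorithm. The only Kafka rule relied on is that a partition's replicas sit on distinct brokers, so R cannot exceed the broker count.

```python
from typing import NamedTuple


class RecordAddress(NamedTuple):
    """The (topic, partition, offset) triple that identifies one record."""
    topic: str
    partition: int
    offset: int


def replica_count(partitions, replication_factor):
    # P partitions, each stored R times, gives P * R replicas in total.
    return partitions * replication_factor


def place_replicas(partitions, replication_factor, brokers):
    """Illustrative round-robin placement (an assumption, not Kafka's algorithm)."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    placement = {}
    for p in range(partitions):
        # Consecutive brokers starting at p, so replicas of p land on distinct brokers.
        placement[p] = [
            brokers[(p + r) % len(brokers)] for r in range(replication_factor)
        ]
    return placement


placement = place_replicas(partitions=6, replication_factor=3, brokers=[0, 1, 2, 3])
total = sum(len(replicas) for replicas in placement.values())
address = RecordAddress(topic="orders", partition=2, offset=41)  # hypothetical values
```

With 6 partitions and a replication factor of 3, `total` comes to 18 replicas spread over the four hypothetical brokers.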
What the partition primitive buys¶
- Parallelism — producers distribute records across partitions (by key-hash or round-robin); consumers in a group divide partitions among themselves. Throughput scales with partition count.
- Ordering semantics — ordering is per-partition only. Total ordering across a topic requires partition count 1; any horizontal scale requires giving up that ordering. This is the core trade-off.
- Replication unit — replication is scoped to the partition, not to the topic — different partitions of the same topic can have different leaders on different brokers.
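The consumer side of the parallelism point can be sketched with a simplified assignment function. `assign_round_robin` is a hypothetical helper; Kafka's real assignors (range, round-robin, sticky, cooperative sticky) handle multiple topics, subscriptions, and rebalancing, none of which is modelled here. The sketch only shows that each partition goes to exactly one member of the group.

```python
def assign_round_robin(partitions, consumers):
    """Deal partitions out to group members one at a time (simplified sketch)."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(sorted(partitions)):
        # Each partition is owned by exactly one consumer in the group.
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Six partitions shared by four hypothetical group members.
balanced = assign_round_robin(range(6), ["c0", "c1", "c2", "c3"])

# Two partitions, three members: one member receives nothing.
oversubscribed = assign_round_robin(range(2), ["c0", "c1", "c2"])
```

In the second call `c2` ends up with an empty list, which is why adding consumers beyond the partition count does not add read parallelism within a group.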
Record-key → partition assignment¶
Keyed records hash to a partition index (hash(key) % partition_count),
so records with the same key land on the same partition for as long as
the partition count stays the same. This is the foundation on which
sub-topology-scoped colocation in Kafka Streams is built (see
sources/2025-11-11-expedia-kafka-streams-sub-topology-partition-colocation).
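A minimal sketch of the hash-modulo mapping follows. `partition_for` is a hypothetical function, and `zlib.crc32` is used purely as a stand-in stable hash. Kafka's Java client hashes the serialized key bytes with murmur2 in its default partitioner, so the partition numbers produced here will not match a real cluster.

```python
import zlib


def partition_for(key: bytes, partition_count: int) -> int:
    """Map a key to a partition index via hash(key) % partition_count."""
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    # crc32 is an assumption standing in for the client's real hash function.
    return zlib.crc32(key) % partition_count


# The same key always maps to the same partition for a fixed partition count.
p1 = partition_for(b"customer-42", 12)
p2 = partition_for(b"customer-42", 12)

# Two topics with equal partition counts route a shared key to the same index,
# which is the property that partition-level colocation depends on.
orders_partition = partition_for(b"customer-42", 12)
payments_partition = partition_for(b"customer-42", 12)

# With a different partition count, the index is computed against a new modulus
# and may differ from the earlier one.
resized = partition_for(b"customer-42", 16)
```

Because the index depends on the modulus, adding partitions to an existing topic can send new records for a key to a different partition than the one holding its older records.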
Seen in¶
- sources/2024-05-09-highscalability-kafka-101 — canonical statement of partition = parallelism-unit + ordering-unit + replication-unit + storage-unit.
- sources/2025-11-11-expedia-kafka-streams-sub-topology-partition-colocation — partition-level cross-topic colocation consequences.
Related¶
- systems/kafka
- concepts/distributed-log — each partition is a log.
- concepts/in-sync-replica-set — the per-partition set of replicas that are caught up with the leader.
- concepts/leader-follower-replication — the partition-level replication shape.
- concepts/consumer-group — the consumer-side primitive that divides partitions among members.
- patterns/leader-based-partition-replication — the architectural pattern partitions participate in.