
Log recovery time (broker disk rebuild)

Definition

When a Kafka broker restarts after an ungraceful shutdown, it must rebuild the local log-index files associated with each partition replica it hosts. This process is called log recovery. Its runtime scales with local disk capacity — on typical ~10TB broker disks, log recovery can take "hours if not days in certain cases" (Kozlovski, Kafka-101).
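A back-of-envelope sketch of why ~10TB lands in the "hours" range: even if recovery needed only one sequential pass over the local partition data, the disk's scan rate bounds the runtime from below. The 200 MB/s and 100 MB/s scan rates below are illustrative assumptions, not measured figures.

```python
def recovery_hours(disk_bytes: float, scan_bytes_per_sec: float) -> float:
    """Lower-bound recovery estimate: at minimum, the unflushed data
    must be re-read once to rebuild index entries."""
    return disk_bytes / scan_bytes_per_sec / 3600

# 10 TB of partition data at an assumed 200 MB/s sequential scan:
print(round(recovery_hours(10e12, 200e6), 1))  # ~13.9 hours
# Same data at 100 MB/s (slower disk, or contention with other I/O):
print(round(recovery_hours(10e12, 100e6), 1))  # ~27.8 hours
```

Real recovery only rescans segments past the last flush checkpoint, but after a crash that can approach the whole local log, which is why the figure tracks disk capacity.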

Why it scales with disk capacity

Kafka's per-replica storage is a sequence of segment files plus index files pointing into those segments (offset → file position, timestamp → offset). On a clean shutdown, indexes are fsync'd and marked consistent; on an ungraceful shutdown, Kafka must walk the tail of every partition's segment files to reconstruct index entries that may not have been persisted. The amount of data to walk scales with how much partition data lives on this broker's local disk.
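The tail-walk above can be sketched as a sparse-index rebuild: scan a segment's records in log order and re-emit an index entry every N bytes, the way Kafka's offset index spaces entries by `index.interval.bytes` (default 4096). The in-memory record format and function names here are illustrative, not Kafka's actual code.

```python
INDEX_INTERVAL_BYTES = 4096  # assumed spacing, mirroring index.interval.bytes

def rebuild_offset_index(segment_records):
    """Rebuild a sparse offset -> file-position index by scanning a segment.

    segment_records: iterable of (offset, payload_bytes) in log order.
    Every record must be read to know its size, which is why recovery
    time scales with how much partition data lives on local disk.
    """
    index, pos, last_indexed_pos = [], 0, 0
    for offset, payload in segment_records:
        if pos - last_indexed_pos >= INDEX_INTERVAL_BYTES:
            index.append((offset, pos))  # sparse entry: offset -> byte position
            last_indexed_pos = pos
        pos += len(payload)
    return index

# Ten 1000-byte records: the first sparse entry lands at byte 5000 (offset 5).
print(rebuild_offset_index([(i, b"x" * 1000) for i in range(10)]))  # [(5, 5000)]
```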

Why this motivated Tiered Storage

Kozlovski's Kafka-101 structural-walls argument:

"When a broker has close to 10TB of data locally, you start to see issues when things go wrong. One evident problem is handling ungraceful shutdowns — when a broker recovers from an ungraceful shutdown, it has to rebuild all the local log index files associated with its partitions in a process called log recovery. With a 10TB disk, this can take hours if not days in certain cases." (Source: sources/2024-05-09-highscalability-kafka-101)

This is the first of four structural walls that hit when storage is co-located with the broker at scale.

All four walls motivate Tiered Storage — offload cold segments to an object store so the local log (and the index structures it anchors) are much smaller, making log recovery fast again.
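As a concrete illustration, tiered storage (KIP-405, generally available since Kafka 3.6) is enabled roughly as below. The broker-level and topic-level config keys are real KIP-405 settings, but the plugin class name and retention values are placeholders; the actual manager class comes from your remote-storage plugin.

```properties
# broker config: turn on the remote log subsystem (KIP-405)
remote.log.storage.system.enable=true
# placeholder -- the class is supplied by your remote storage plugin
remote.log.storage.manager.class.name=com.example.S3RemoteStorageManager

# per-topic config: tier this topic and keep only a small local hot set
remote.storage.enable=true
local.retention.ms=3600000     # illustrative: ~1 hour kept on local disk
retention.ms=604800000         # illustrative: 7 days total retention
```

With `local.retention.ms` small, the local log a crashed broker must walk during log recovery shrinks from terabytes to the hot tail, which is the mechanism behind "making log recovery fast again."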
