Skip to content

CONCEPT Cited by 1 source

Segmented Changelog (algorithm)

Segmented Changelog is an algorithmic representation of a VCS commit graph that partitions commits into maximal linear segments at merge points and stores the inter-segment merge structure as a compact index. The representation enables fast ancestry and log queries without iterating over all commits.

Originated in Meta's Sapling source control system; canonical wiki reference: systems/meta-segmented-changelog.

Core properties

Graph shape is cheap. Per the 2022-11-15 Sapling post, the graph-shape index downloads in "just a few megabytes" regardless of the repo's commit count.

Ancestry query is O(number-of-merges). With only the segment index and two commit positions:

"This enables querying the graph relationship between any two commits in O(number-of-merges) time, with nothing but the segments and the position of the two commits in the segments."

— Sapling announcement post

Log / blame is O(log n) via bisection. Segmented Changelog bisects the segment graph for file-history queries — see concepts/commit-graph-bisection.

Why segments

The classical graph representation stores every commit with edges to its parent(s) and treats the whole thing uniformly. For most real-world repos, the commit graph is sparse in merges — long linear chains punctuated by occasional merges. Segmented Changelog exploits this: each maximal linear run is a single "segment" in the index; only merges create new segment boundaries. Queries become operations on the (much smaller) segment graph.

Complexity: O(number-of-merges) is a dramatic improvement over O(number-of-commits) when n_merges ≪ n_commits, which is the typical monorepo case.

Pairing with lazy history

Segmented Changelog alone doesn't give you scale — you still need the per-commit data (metadata, trees, files) to do most operations. The pairing is lazy history download: download the segment index up front (cheap), fetch per-commit data on demand as queries need it. Together they enable monorepo-scale VCS without the upfront cost of downloading everything.

Seen in

Last updated · 319 distilled / 1,201 read