CONCEPT Cited by 1 source

Sparse checkout

Sparse checkout is a VCS feature that materializes only a declared subset of the repository's paths to the working copy on disk. The full commit graph and history are still present; the checkout just doesn't expand every file into the filesystem.
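The definition above can be sketched as a filter over the repository's path list. This is an illustrative model only, assuming simple directory-prefix matching; the function name `materialized_paths` and the sample paths are hypothetical, and real tools support richer pattern rules.

```python
from pathlib import PurePosixPath


def materialized_paths(tracked, sparse_dirs):
    """Return the tracked paths that a sparse checkout would write to disk.

    tracked     -- every file path recorded in the commit (hypothetical sample)
    sparse_dirs -- directories the user declared as their working subset
    """
    wanted = [PurePosixPath(d) for d in sparse_dirs]
    selected = []
    for path in tracked:
        p = PurePosixPath(path)
        # Keep a file if it sits inside (at any depth) a declared directory.
        if any(d in p.parents for d in wanted):
            selected.append(path)
    return selected


tracked = [
    "services/payments/api.py",
    "services/payments/tests/test_api.py",
    "services/search/index.py",
    "mobile/ios/App.swift",
    "README.md",
]

on_disk = materialized_paths(tracked, ["services/payments"])
print(on_disk)
```

In this model the commit still records all five paths; only the two under `services/payments` are expanded into files.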

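Available in Git since 2010 (originally through the core.sparseCheckout setting; the dedicated git sparse-checkout command arrived in Git 2.25 in 2020), Mercurial via hg sparse, and as a first-class primitive in Sapling.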

Why it matters in a monorepo

At monorepo scale (millions of files), a full checkout is prohibitively expensive in disk space, filesystem traversal, and tooling overhead:

  • Disk: terabytes of files the engineer never opens.
  • find / grep / git status: every invocation scans the whole tree.
  • IDE index: the indexer tries to load the entire codebase into memory.

Sparse checkout lets an engineer work on "their" subset, typically ~1% of a big monorepo, while the rest of the repo stays recorded in the history and index but is never written out as files.

Sapling's treatment

Per the 2022-11-15 Sapling announcement:

"Without the virtual file system… we have special support for sparse checkouts to allow checking out only part of the repository."

— Sapling announcement post

Sapling pairs sparse checkout with sparse profiles: named configurations that are checked into the repository and owned by the organization. Shared, versioned profiles are the architectural move that makes sparse checkout operationally viable for thousands of engineers (see patterns/organization-owned-sparse-profile).
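As a rough illustration of why a named, checked-in profile helps, the sketch below resolves a team profile into include and exclude sets. It does not reproduce Sapling's actual profile file format: the `PROFILES` dictionary, its `include` / `exclude` / `extends` fields, and the profile names are all hypothetical stand-ins for files an organization would keep in the repository.

```python
# Hypothetical profiles, standing in for files checked into the repository.
PROFILES = {
    "base": {"include": ["tools/build"], "exclude": [], "extends": []},
    "payments-team": {
        "include": ["services/payments", "libs/ledger"],
        "exclude": ["services/payments/fixtures"],
        "extends": ["base"],
    },
}


def resolve_profile(name, profiles, _stack=()):
    """Flatten a named profile and everything it extends into two sets."""
    if name in _stack:
        raise ValueError(f"profile cycle at {name!r}")
    profile = profiles[name]
    include = set(profile["include"])
    exclude = set(profile["exclude"])
    for parent in profile["extends"]:
        parent_inc, parent_exc = resolve_profile(parent, profiles, _stack + (name,))
        include |= parent_inc
        exclude |= parent_exc
    return include, exclude


def _under(path, dirs):
    # True if path is one of dirs or lives somewhere beneath one of them.
    return any(path == d or path.startswith(d + "/") for d in dirs)


def is_materialized(path, include, exclude):
    """A path is written to disk if it is included and not excluded."""
    return _under(path, include) and not _under(path, exclude)


include, exclude = resolve_profile("payments-team", PROFILES)
print(sorted(include))
print(is_materialized("services/payments/api.py", include, exclude))
print(is_materialized("services/payments/fixtures/big.json", include, exclude))
```

An engineer enables one profile name instead of maintaining a personal path list, so a change to the team's dependencies is a single reviewed edit to the profile.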

Relationship to VFS

A virtual file system is the heavier-weight alternative: present the full repo shape but fetch files lazily on access. Sparse checkout is the simpler, no-kernel-extension alternative: narrow the declared shape so the materialized footprint is small.

Sapling deploys the VFS where available and sparse checkout as the fallback — they solve overlapping but distinct problems.
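The contrast between the two approaches can be sketched with two toy working-copy classes. This is a conceptual model under stated assumptions, not how Sapling or any VFS is implemented: the class names, the in-memory `store` dictionary standing in for the repository's object store, and the `footprint` measure are all hypothetical.

```python
class LazyWorkingCopy:
    """VFS-style: every path is visible, contents are fetched on first read."""

    def __init__(self, store):
        self._store = store    # stands in for the repository's object store
        self._fetched = {}     # contents pulled to local disk so far

    def list_paths(self):
        return sorted(self._store)  # the full repo shape is always visible

    def read(self, path):
        if path not in self._fetched:
            self._fetched[path] = self._store[path]  # lazy fetch on access
        return self._fetched[path]

    def footprint(self):
        return len(self._fetched)


class SparseWorkingCopy:
    """Sparse-style: only declared directories exist, materialized up front."""

    def __init__(self, store, sparse_dirs):
        prefixes = tuple(d.rstrip("/") + "/" for d in sparse_dirs)
        self._files = {p: c for p, c in store.items() if p.startswith(prefixes)}

    def list_paths(self):
        return sorted(self._files)  # the visible shape is the narrowed one

    def read(self, path):
        return self._files[path]  # raises KeyError outside the sparse set

    def footprint(self):
        return len(self._files)


store = {
    "services/payments/api.py": "pay",
    "services/search/index.py": "search",
    "mobile/ios/App.swift": "app",
}

lazy = LazyWorkingCopy(store)
sparse = SparseWorkingCopy(store, ["services/payments"])

print(len(lazy.list_paths()), lazy.footprint())
lazy.read("services/search/index.py")
print(lazy.footprint())
print(sparse.list_paths(), sparse.footprint())
```

The lazy copy shows all three paths while its footprint grows only as files are read; the sparse copy shows and stores just the declared subset, with no interception of file access required.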
