
CONCEPT Cited by 1 source

Cache granularity

Definition

Cache granularity is the size of the unit whose inputs form a cache key. At code level it's the function wrapped in a memoisation decorator. At build-system level it's the action with declared srcs/outs. At CDN level it's the file, URL, or cache variant. Smaller units (fewer inputs per key) → higher hit rate for a fixed change pattern; larger units (more inputs per key) → lower hit rate.

The "100 parameters, 2-3 always change" failure mode

The canonical articulation comes from Slack's Quip/Canvas build retrospective:

Our build was so interconnected that our cache hit rate was zero. Imagine that every cached "function" we tried to call had 100 parameters, 2-3 of which always changed.

— Slack, Build better software to build software better

Two structural problems compound into this state:

  1. Granularity too coarse: the cached unit of work takes more inputs than it needs, so small changes invalidate large caches.
  2. Transitive input leakage: inputs that shouldn't belong in the cache key sneak in via indirect dependencies — in Slack's case, every Python source file was a transitive input to every frontend bundle, so any Python change invalidated every bundle.

When hit rate is structurally zero (the cache key cannot stay stable across commits), neither cache size nor eviction policy can help. The fix is to shrink the key — either by splitting the unit of work finer (fewer inputs) or by cutting transitive edges (fewer inputs leak in).
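
The arithmetic behind "100 parameters, 2-3 always change" can be made concrete with a small simulation. This is a sketch with illustrative numbers (100 inputs, 3 touched per commit); the coarse cache keys on all inputs at once, the fine cache keys on each input separately:

```python
import random

def hit_rates(num_inputs=100, changed_per_commit=3, commits=1000, seed=0):
    """Compare hit rates when the same work is cached as one coarse
    unit (key = all inputs) vs. fine units (one key per input)."""
    rng = random.Random(seed)
    versions = [0] * num_inputs            # stand-in content hash per input
    coarse_seen, fine_seen = set(), set()
    coarse_hits, fine_hits = 0, 0

    for _ in range(commits):
        for i in rng.sample(range(num_inputs), changed_per_commit):
            versions[i] += 1               # each commit touches a few inputs

        key = tuple(versions)              # coarse: every input in one key
        coarse_hits += key in coarse_seen
        coarse_seen.add(key)

        for pair in enumerate(versions):   # fine: one key per input
            fine_hits += pair in fine_seen
            fine_seen.add(pair)

    return coarse_hits / commits, fine_hits / (commits * num_inputs)

coarse, fine = hit_rates()
# coarse is 0.0: the combined key never repeats across commits
# fine is ~0.97: only the few changed inputs miss on each commit
```

The coarse key literally cannot repeat, because every commit bumps some input's version; the fine keys stay stable for the ~97 untouched inputs.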

Worked example: image processing at code level

The canonical code-level example from the Slack post:

import functools

# Too coarse — whole list of images + transforms is the cache key.
# (functools.cache also requires hashable arguments, so the lists
# would in practice need to be tuples.)
@functools.cache
def process_images(images, transforms): ...

Any added image or transform invalidates the entire memoised result. Refactored:

import functools
from functools import reduce

# Fine-grained — one image-transform pair per cache key.
def process_images(images, transforms):
    return [
        reduce(process_image, transforms, img) for img in images
    ]

@functools.cache
def process_image(image, transform): ...

The higher-level API is preserved, but caching is done at the leaf unit. Only new (image, transform) pairs miss the cache; everything else is served from it. Exactly the same principle applies to Bazel targets (see systems/bazel and concepts/build-graph).
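
A runnable version of the refactored sketch, using string stand-ins for images and hypothetical transform names, shows the hit pattern directly via `cache_info()`:

```python
from functools import cache, reduce

@cache
def process_image(image, transform):
    # Stand-in "work": record the transform applied to the image token.
    return f"{transform}({image})"

def process_images(images, transforms):
    return [reduce(process_image, transforms, img) for img in images]

process_images(["a", "b"], ["crop", "resize"])
print(process_image.cache_info())   # 4 misses (2 images x 2 steps), 0 hits

process_images(["a", "b", "c"], ["crop", "resize"])
print(process_image.cache_info())   # hits=4: a and b served from cache;
                                    # only c's 2 new pairs miss
```

Adding an image costs exactly that image's (image, transform) pairs; nothing already computed is recomputed.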

Worked example: build system level

Slack's frontend bundler originally took "all TypeScript sources plus all CSS/LESS sources" in and produced "all deployable bundles" out — one Bazel action, one cache entry, many inputs. A single .ts file change invalidated every bundle's cache.

The refactor made each bundle its own action, with TypeScript and CSS compiled independently in parallel sub-actions. Bazel can now (a) cache each bundle's TypeScript output separately from its CSS output, (b) parallelise all bundle builds across workers, and (c) recompute only the bundles whose direct inputs changed.

This is the same code-level principle at a different altitude: smaller keys → higher hit rate → faster builds.
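
The before/after can be modelled by treating a build action's cache key as a hash over its declared inputs' contents. This is a sketch, not Bazel's actual key derivation, and the file and bundle names are hypothetical:

```python
import hashlib

def action_key(declared_inputs, contents):
    """Cache key for a build action: a hash over the contents of
    every input the action declares."""
    h = hashlib.sha256()
    for path in sorted(declared_inputs):
        h.update(path.encode())
        h.update(contents[path].encode())
    return h.hexdigest()

contents = {"a.ts": "v1", "b.ts": "v1", "a.css": "v1", "b.css": "v1"}

# Before: one action declares every source, so all bundles share a key.
coarse = action_key(contents, contents)

# After: each bundle declares only its own sources.
fine = {name: action_key(srcs, contents)
        for name, srcs in [("bundle_a", ["a.ts", "a.css"]),
                           ("bundle_b", ["b.ts", "b.css"])]}

contents["a.ts"] = "v2"   # a single TypeScript edit
# The coarse key changes (full rebuild); of the fine keys, only
# bundle_a's changes, and bundle_b is still a cache hit.
```

The edit invalidates exactly the actions whose declared inputs include the changed file, which is the whole point of shrinking the key.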

The granularity trade-off

Finer granularity has costs:

  • Orchestration overhead: each cached unit has book-keeping (input hashing, cache lookup, cache-miss execution). Past some point, the overhead per unit exceeds the saved work.
  • API complexity: decomposing a coarse API into fine-grained ones may burden callers with more boilerplate.
  • Cache storage: finer units produce more cache entries, which costs storage and lookup time.

The right granularity is the smallest size at which orchestration overhead is justified by the change-pattern savings. In practice this often means: granular enough that the typical change touches only a small fraction of cached units.
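
Under simple assumptions (fixed per-unit overhead, work split evenly across units, a commit touching a fixed number of files), the break-even point can be sketched numerically. All constants here are illustrative:

```python
def build_cost(units, changed_units, overhead_per_unit, total_work):
    """Cost of one incremental build: every unit pays cache bookkeeping;
    only the changed units redo their share of the work."""
    work_per_unit = total_work / units
    return units * overhead_per_unit + changed_units * work_per_unit

# Illustrative numbers: 1000 s of total work, 0.05 s of hashing and
# lookup per cached unit, and a typical commit touching ~3 files.
costs = {
    units: build_cost(units, min(3, units), 0.05, 1000.0)
    for units in (1, 10, 100, 1000, 10_000)
}
# One giant unit rebuilds everything (~1000 s); ten thousand tiny units
# drown in lookup overhead (~500 s); the sweet spot sits in between.
```

The minimum moves with the change pattern: if commits touched 50 units instead of 3, coarser units would win.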

Anti-patterns

  • Caching at the orchestrator layer: wrapping a cache around a coarse API (e.g. process_images above) gives near-zero hit rate when inputs change often.
  • Hidden transitive inputs: a cache key that looks small on the surface but has implicit dependencies (e.g. a build action whose srcs includes a directory glob that captures unrelated files). The Slack Python→TypeScript coupling is this pattern at build-system altitude.
  • Claiming "caching doesn't help": before concluding a workload is uncacheable, verify that the cache granularity matches the change pattern. A different decomposition often turns a zero-hit-rate workload into a high-hit-rate one.
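
The directory-glob variant of hidden transitive inputs can be demonstrated directly. In this sketch (file names hypothetical), the cache key hashes everything the glob matches, so a file the action never reads still invalidates it:

```python
import hashlib
import pathlib
import tempfile

def glob_key(root, pattern):
    """Cache key over everything the glob matches -- including files
    the action never actually reads."""
    h = hashlib.sha256()
    for path in sorted(pathlib.Path(root).glob(pattern)):
        h.update(path.name.encode())
        h.update(path.read_bytes())
    return h.hexdigest()

root = tempfile.mkdtemp()
(pathlib.Path(root) / "app.ts").write_text("export const x = 1;")

before = glob_key(root, "*")       # key while only app.ts exists

# An unrelated file lands in the globbed directory...
(pathlib.Path(root) / "notes.txt").write_text("unrelated")

after = glob_key(root, "*")        # ...and the key changes anyway
explicit = glob_key(root, "*.ts")  # an explicit srcs list stays stable
```

Declaring inputs explicitly (or narrowing the glob) keeps the key a function of what the action actually consumes.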

Seen in

  • sources/2025-11-06-slack-build-better-software-to-build-software-better — canonical articulation of the granularity lever at both code and build-system altitudes. Slack's process_image vs process_images example is the canonical code-level illustration; the frontend-bundler refactor (from all-bundles-one-action to one-bundle-one-action) is the build-system analogue.