PATTERN Cited by 1 source

Deterministic task caching

Description

Cache the outputs of workflow tasks, keyed on each task's module version, its parameters, and fingerprints (hashes) of its input data. When a subsequent run invokes a task with identical inputs, return the cached result instead of re-executing the task. This is the ML-workflow analog of build-system caching (Bazel's action cache, Buck's rule keys).

Mechanics

  1. Before executing a task, compute a cache key from: module version + parameters + hash of input data
  2. Look up the cache key in the cache store
  3. If hit: return stored outputs, skip execution
  4. If miss: execute the task, store outputs keyed by the computed key
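
The four steps above can be sketched in Python as follows. This is a minimal illustration, not the source's implementation: the names `cache_key`, `run_cached`, and `count_bytes` are hypothetical, the in-memory dict stands in for whatever cache store a real platform uses, and SHA-256 over a JSON payload is just one reasonable way to build the key.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List

# Hypothetical in-memory cache store (assumption); a real platform would
# likely use a shared, persistent store instead.
_CACHE: Dict[str, Any] = {}


def cache_key(module_version: str, params: dict, input_data: bytes) -> str:
    """Step 1: key = module version + parameters + hash of input data."""
    data_hash = hashlib.sha256(input_data).hexdigest()
    payload = json.dumps(
        {"version": module_version, "params": params, "data": data_hash},
        sort_keys=True,  # stable ordering so equal params yield equal keys
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(
    task: Callable[[dict, bytes], Any],
    module_version: str,
    params: dict,
    input_data: bytes,
) -> Any:
    key = cache_key(module_version, params, input_data)
    if key in _CACHE:                  # steps 2-3: lookup, return on hit
        return _CACHE[key]
    result = task(params, input_data)  # step 4: execute on miss...
    _CACHE[key] = result               # ...and store under the computed key
    return result


# Usage: a toy deterministic task that records how often it really runs.
calls: List[int] = []


def count_bytes(params: dict, data: bytes) -> int:
    calls.append(1)
    return len(data) * params["scale"]


first = run_cached(count_bytes, "1.0.0", {"scale": 2}, b"abc")   # miss: executes
second = run_cached(count_bytes, "1.0.0", {"scale": 2}, b"abc")  # hit: skipped
third = run_cached(count_bytes, "1.0.1", {"scale": 2}, b"abc")   # new version: miss
```

Including the module version in the key means a code change invalidates old entries without any explicit purge, which is why the third call executes again even though the parameters and data are unchanged.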

Applicability

Most effective when:

  - Workflows are long-running with many independent stages
  - Users iterate on late-stage components while early stages are stable
  - Tasks are deterministic (same inputs → same outputs)

Less effective for:

  - Tasks with side effects (e.g. external API calls) or non-deterministic behavior (e.g. unseeded randomness)
  - Tasks whose input data changes frequently, so cache keys rarely repeat
  - Cases where cache invalidation correctness is hard to guarantee
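
One way a task that uses randomness can still satisfy the determinism condition is to make the seed an explicit parameter, so that it becomes part of the cache key. The sketch below assumes this approach; it is not described in the source, and `params_key` and `sample_rows` are hypothetical names.

```python
import hashlib
import json
import random


def params_key(params: dict) -> str:
    """Fingerprint of the parameters only (hypothetical helper)."""
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def sample_rows(params: dict) -> list:
    # A local, seeded RNG: the output depends only on the parameters,
    # never on global random state.
    rng = random.Random(params["seed"])
    return [rng.randrange(params["n_rows"]) for _ in range(params["k"])]


p1 = {"seed": 7, "n_rows": 1000, "k": 5}
p2 = {"seed": 8, "n_rows": 1000, "k": 5}

same_output = sample_rows(p1) == sample_rows(p1)  # identical inputs, identical result
distinct_keys = params_key(p1) != params_key(p2)  # a new seed is a new cache entry
```

A task that drew from the global, unseeded `random` module instead would return a stale sample on every cache hit, which is the failure mode the list above warns about.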

Seen In

(Source: sources/2026-06-10-atlassian-architecting-scalable-ml-platforms)
