PATTERN Cited by 1 source
Deterministic task caching¶
Description¶
Cache the outputs of workflow tasks, keyed on each task's module version, parameters, and input data fingerprints. When a subsequent run invokes a task with identical inputs, return the cached result instead of re-executing the task. This is the ML-workflow analog of build-system caching (Bazel's action cache, Buck's rule keys).
Mechanics¶
- Before executing a task, compute a cache key from: module version + parameters + hash of input data
- Look up the cache key in the cache store
- If hit: return stored outputs, skip execution
- If miss: execute the task, store outputs keyed by the computed key
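The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any particular platform: the names `cache_key`, `run_cached`, and `count_tokens` are hypothetical, the cache store is an in-memory dict standing in for durable storage, and SHA-256 over a canonical JSON payload is just one reasonable way to build the key.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Tuple

# Hypothetical in-memory cache store (assumption); a real platform would
# use durable, shared storage.
_cache: Dict[str, Any] = {}


def cache_key(module_version: str, params: Dict[str, Any], input_data: bytes) -> str:
    """Derive a key from module version + parameters + hash of input data."""
    payload = json.dumps(
        {
            "module_version": module_version,
            "params": params,
            "input_hash": hashlib.sha256(input_data).hexdigest(),
        },
        sort_keys=True,  # makes the key independent of parameter ordering
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(
    task: Callable[[Dict[str, Any], bytes], Any],
    module_version: str,
    params: Dict[str, Any],
    input_data: bytes,
) -> Tuple[Any, bool]:
    """Return (result, cache_hit). Executes the task only on a miss."""
    key = cache_key(module_version, params, input_data)
    if key in _cache:
        return _cache[key], True   # hit: return stored output, skip execution
    result = task(params, input_data)
    _cache[key] = result           # miss: execute, then store under the key
    return result, False


# Hypothetical deterministic task used only for illustration.
def count_tokens(params: Dict[str, Any], data: bytes) -> int:
    return len(data.decode("utf-8").split(params["sep"]))


out1, hit1 = run_cached(count_tokens, "1.0.0", {"sep": " "}, b"a b c")  # miss
out2, hit2 = run_cached(count_tokens, "1.0.0", {"sep": " "}, b"a b c")  # hit
out3, hit3 = run_cached(count_tokens, "1.0.1", {"sep": " "}, b"a b c")  # miss: new version
```

Including the module version in the key means a code change invalidates old entries without any explicit purge, at the cost of recomputing results that the change did not actually affect.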
Applicability¶
Most effective when:
- Workflows are long-running with many independent stages
- Users iterate on late-stage components while early stages are stable
- Tasks are deterministic (same inputs → same outputs)
Less effective for:
- Tasks with side effects or non-deterministic behavior (external API calls, unseeded randomness)
- Tasks where input data changes frequently
- Cases where cache invalidation correctness is hard to guarantee
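To make the "iterate on late stages while early stages are stable" case concrete, here is a small hedged sketch of a two-stage pipeline. Everything in it is assumed for illustration: the stage names, the `cached_stage` helper, the `executed` log, and the in-memory `store` are hypothetical and do not come from the cited source. Changing only the late-stage parameter should re-run only the late stage.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List

store: Dict[str, Any] = {}   # hypothetical cache store (assumption)
executed: List[str] = []     # records which stages actually ran


def fingerprint(value: Any) -> str:
    """Stable hash of any JSON-serializable value."""
    return hashlib.sha256(
        json.dumps(value, sort_keys=True).encode("utf-8")
    ).hexdigest()


def cached_stage(name: str, fn: Callable[..., Any], inputs: Any,
                 version: str = "1", **params: Any) -> Any:
    """Run fn(inputs, **params) unless an identical invocation is cached."""
    key = fingerprint({
        "stage": name,
        "version": version,
        "params": params,
        "inputs": fingerprint(inputs),
    })
    if key not in store:
        executed.append(name)
        store[key] = fn(inputs, **params)
    return store[key]


# Hypothetical deterministic stages.
def preprocess(rows: List[str], lowercase: bool) -> List[str]:
    return [r.lower() if lowercase else r for r in rows]


def select(rows: List[str], min_len: int) -> List[str]:
    return sorted(r for r in rows if len(r) >= min_len)


def pipeline(raw: List[str], min_len: int) -> List[str]:
    clean = cached_stage("preprocess", preprocess, raw, lowercase=True)
    return cached_stage("select", select, clean, min_len=min_len)


raw = ["Alpha", "Be", "Gamma"]
first = pipeline(raw, min_len=3)   # both stages execute
second = pipeline(raw, min_len=2)  # only "select" re-executes
```

After both runs, `executed` holds `preprocess` once and `select` twice: the early stage was served from the cache on the second run. If `preprocess` depended on an external API or unseeded randomness, the cached value could silently diverge from what a fresh execution would produce, which is why such tasks are poor candidates.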
Seen In¶
- sources/2026-06-10-atlassian-architecting-scalable-ml-platforms — ML Studio: ~80% of workflows leverage caching daily; 1,000+ hours of execution time saved per month; "developers only re-run what's changed"
(Source: sources/2026-06-10-atlassian-architecting-scalable-ml-platforms)