PATTERN Cited by 1 source

Tiered state management — Memcache plus DB

Pattern

When a workflow or batch process produces both short-lived intermediate state (needed only for the duration of one job) and long-lived persistent state (needed across many job runs), use two distinct storage tiers:

  • Distributed in-memory cache (Memcached / Redis) for the intra-job multi-step shared state — fast, ephemeral, evictable; no durability obligation.
  • Relational database for the persistent pre-computed reports — durable, queryable, long-retention.

The 2026-05-14 Atlassian post is the first canonical wiki home for this pattern in the pre-computation-framework context. Verbatim:

"Optimisation tools need to store pre-computed 'usage' data for many Jira entity types (fields, options, roles, schemes, etc.) at scale. This data is: Large (up to ~1M records per entity type, per tenant), Read-heavy (queried often in reports and UIs), Refreshed in batch (monthly/periodically, not in real time). Naively adding one dedicated table per entity doesn't scale ... So we used two layers: For short-lived, multi-step sharing inside a job are memcached. For persistent pre-computation, we store data in the Jira relational database using a small set of generic tables (polymorphic 'usage' tables) instead of one table per entity type."

(Source: sources/2026-05-14-atlassian-optimisation-tools-for-jira-reducing-configuration-bloat)

Problem

A pre-computation framework processing tenant data in batches needs storage for two distinct kinds of state with opposing requirements:

  • Intermediate state that scan-step workers share during a single job's lifecycle (Initialise → Scan-steps → Finalise). Examples: per-batch contributions to a running aggregate, deduplication bitmaps, in-flight scope partitions. Required: fast read/write, multi-worker visibility. Not required: durability past job completion.
  • Persistent reports that downstream consumers (admin UIs, remediation tools) read after the job completes, potentially many times before the next refresh. Required: durability, indexable querying, long retention. Not required: extreme write throughput (refreshes are batch / periodic).

Conflating the two — putting intermediate state in the DB, or report state in cache — costs:

  • Intermediate state in DB: high write throughput during scan-steps amplifies row-count and write-pressure on the DB unnecessarily; transient state pollutes the durable schema.
  • Report state in cache: report durability is lost on cache eviction; cache size becomes proportional to report retention rather than working-set size.

Solution

Split state into the two tiers along the durability boundary:

function scan_step(scope_id, job_id, batch) {
  // Intra-job state → Memcache (fast, ephemeral); key is scoped to this job
  for entity in batch {
    contribution = compute_contribution(entity, scope_id)
    memcache.upsert_merge((scope_id, job_id), contribution)
  }
}

function finalise(scope_id, job_id) {
  // Aggregate from Memcache, persist to DB
  intermediate = memcache.read_all((scope_id, job_id))
  report = build_report(intermediate)
  // Persistent state → DB (durable, queryable)
  db.upsert_polymorphic_usage(scope_id, report)
  // Optionally clear intermediate state
  memcache.expire((scope_id, job_id))
}

The Memcache tier holds state during the workflow; the DB tier holds state across workflows. The Finalise phase is the commit point that promotes (transformed) intermediate state to persistent state.
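
The split can also be sketched as runnable Python. This is a minimal illustration under stated assumptions, not Atlassian's implementation: `InMemoryCache` is a hypothetical in-process stand-in for a Memcached/Redis client, `sqlite3` stands in for the relational database, and the `usage` table layout and `run_job` helper are invented for the example.

```python
import sqlite3
import threading
from collections import Counter


class InMemoryCache:
    """Hypothetical stand-in for a distributed cache client (not a real client API)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def merge_counts(self, key, counts):
        # Merge one batch's contribution into the running aggregate under a lock.
        with self._lock:
            self._data.setdefault(key, Counter()).update(counts)

    def get(self, key):
        with self._lock:
            return dict(self._data.get(key, {}))

    def delete(self, key):
        with self._lock:
            self._data.pop(key, None)


def scan_step(cache, scope_id, job_id, batch):
    # Intra-job state goes to the cache tier only; the key is scoped to the job.
    contribution = Counter(batch)  # each item is the id of an entity that was used
    cache.merge_counts((scope_id, job_id), contribution)


def finalise(cache, db, scope_id, job_id):
    # Commit point: read the aggregate, persist the report, drop intermediate state.
    intermediate = cache.get((scope_id, job_id))
    with db:  # sqlite3 connection as context manager: one transaction
        db.execute("DELETE FROM usage WHERE scope_id = ?", (scope_id,))
        db.executemany(
            "INSERT INTO usage (scope_id, entity_id, usage_count) VALUES (?, ?, ?)",
            [(scope_id, entity, n) for entity, n in intermediate.items()],
        )
    cache.delete((scope_id, job_id))  # optional; a TTL could do this instead


def run_job(batches, scope_id="tenant-1", job_id="job-1"):
    cache = InMemoryCache()
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE usage (scope_id TEXT, entity_id TEXT, usage_count INTEGER)"
    )
    for batch in batches:
        scan_step(cache, scope_id, job_id, batch)
    finalise(cache, db, scope_id, job_id)
    rows = db.execute(
        "SELECT entity_id, usage_count FROM usage ORDER BY entity_id"
    ).fetchall()
    return rows, cache.get((scope_id, job_id))
```

In this sketch the database sees writes only inside `finalise`, while every scan-step touches only the cache. A real distributed cache has no built-in merge operation of this shape, so the merge would need a compare-and-swap loop or an equivalent atomic primitive.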

What goes where

| State | Tier | Lifetime |
|---|---|---|
| Per-batch in-flight contributions | Memcache | Single job |
| Per-scope intermediate aggregates | Memcache | Single job |
| Deduplication bitmaps for the job | Memcache | Single job |
| Final per-scope reports | DB | Until next refresh |
| Last-refresh timestamps | DB | Permanent |
| Aggregate metrics for monitoring | DB | Permanent (or archived) |

The discriminator: does anything outside this job need to read the data after the job finishes? If yes, DB. If no, Memcache.
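
The discriminator reduces to a single boolean. A minimal sketch, where `Tier`, `StateKind`, and `choose_tier` are hypothetical names introduced only for illustration:

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    MEMCACHE = "memcache"
    DB = "db"


@dataclass(frozen=True)
class StateKind:
    name: str
    read_after_job: bool  # does anything outside the job read it after the job ends?


def choose_tier(kind: StateKind) -> Tier:
    # State read after the job finishes needs durability; everything else stays in cache.
    return Tier.DB if kind.read_after_job else Tier.MEMCACHE
```

For example, `choose_tier(StateKind("Deduplication bitmaps for the job", False))` yields `Tier.MEMCACHE`, while a final per-scope report with `read_after_job=True` yields `Tier.DB`.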

Trade-offs

| Property | Tiered (Memcache + DB) | DB-only | Cache-only |
|---|---|---|---|
| Intra-job read/write latency | Fast (Memcache) | Slow (DB round-trips) | Fast |
| Persistent report durability | Yes (DB) | Yes | No (eviction loss) |
| DB write pressure | Low (only Finalise commits) | High (every scan-step writes) | None |
| Cache eviction risk | Bounded (state lives for one job only) | None | High (any eviction loses report) |
| Operational complexity | Two systems to monitor | One | One (but unsuitable) |
| Cost per read of report | Low (DB index scan) | Low | Low if cached, otherwise lost |

Implementation discipline

  • Memcache is a soft commitment. Eviction during a job is a possibility. Either tolerate the cost (job restart with idempotent scan-steps; see concepts/idempotent-thread-safe-scan-step) or size Memcache to comfortably fit the working set.
  • Memcache keys must be scoped to the job. Use (scope_id, job_id) as the key prefix so concurrent jobs on the same scope don't collide.
  • Set the Memcache TTL to outlast job completion. Even after Finalise, keep the intermediate state for some buffer (minutes to hours) to support inspection / debugging if Finalise itself fails.
  • Choose the DB schema for read-heavy report query patterns. See patterns/polymorphic-usage-tables-for-multi-tenant-scale for the multi-tenant DB-design decision Atlassian pairs with this tiering.
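
The key-scoping and TTL rules above can be sketched as two small helpers. The `precompute:` prefix, the function names, and the one-hour default buffer are assumptions for illustration. The two constants reflect Memcached's documented behaviour: keys are limited to 250 bytes with no whitespace or control characters, and expiration values above 30 days are interpreted as absolute Unix timestamps.

```python
MAX_KEY_BYTES = 250                    # Memcached key length limit
MAX_RELATIVE_TTL = 60 * 60 * 24 * 30   # above this, Memcached reads the value as a Unix timestamp


def job_key(scope_id: str, job_id: str, name: str) -> str:
    """Build a cache key scoped to one job so concurrent jobs on a scope don't collide."""
    key = f"precompute:{scope_id}:{job_id}:{name}"  # hypothetical key layout
    if len(key.encode("utf-8")) > MAX_KEY_BYTES:
        raise ValueError("key exceeds Memcached's 250-byte limit")
    if any(c.isspace() or ord(c) < 32 for c in key):
        raise ValueError("key contains whitespace or control characters")
    return key


def intermediate_ttl(expected_job_seconds: int, debug_buffer_seconds: int = 3600) -> int:
    """TTL that outlasts the job by a debugging buffer (the buffer size is an assumption)."""
    ttl = expected_job_seconds + debug_buffer_seconds
    if ttl > MAX_RELATIVE_TTL:
        raise ValueError("TTL would be interpreted as an absolute timestamp")
    return ttl
```

Two jobs on the same scope then write to disjoint keys, for example `job_key("tenant-1", "job-41", "agg")` and `job_key("tenant-1", "job-42", "agg")`.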

Adjacent patterns

Adjacent at other altitudes

  • CPU cache + main memory: the same fast/ephemeral vs slow/persistent boundary at the hardware altitude.
  • Redis + Postgres: the most common SaaS instance of this pattern, with Redis as session store/cache and Postgres as source of truth. Memcached fills the same role for Atlassian.
  • OS page cache + disk: a kernel-level instance of the same idea.
  • Cloudflare KV (eventual) + R2 / D1 (durable): an edge-altitude instance.

Seen in
