CONCEPT Cited by 1 source

Computed Pattern

Definition

Computed Pattern — MongoDB's named schema-design pattern where values that would otherwise be computed at read time (sums, counts, averages, rollups) are pre-aggregated at write time and stored directly in the document. Reads become field lookups; writes become $inc / $set operations against the pre-aggregated fields.

The pattern shifts work from the read path to the write path. It is justified when reads outnumber writes by enough that the read-path savings outweigh the write-path cost of maintaining the pre-aggregate.
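The read-path versus write-path shift can be illustrated with a minimal in-memory sketch. It is plain JavaScript with no database involved; the event shape and the function names (countAtReadTime, recordAtWriteTime) are assumptions made for illustration, not taken from the case study.

```javascript
// Hypothetical raw events; the field names are illustrative only.
const events = [
  { date: "2022-06-05", status: "a" },
  { date: "2022-06-05", status: "a" },
  { date: "2022-06-05", status: "n" },
  { date: "2022-06-16", status: "p" },
];

// Read-time computation: every report scans all raw events.
function countAtReadTime(rawEvents) {
  const totals = {};
  for (const e of rawEvents) {
    totals[e.status] = (totals[e.status] || 0) + 1;
  }
  return totals;
}

// Write-time computation: each write bumps a stored counter,
// so a later read is a plain lookup of the stored totals.
function recordAtWriteTime(storedTotals, event) {
  storedTotals[event.status] = (storedTotals[event.status] || 0) + 1;
  return storedTotals;
}

const stored = {};
for (const e of events) recordAtWriteTime(stored, e);

const readTimeTotals = countAtReadTime(events);
console.log(readTimeTotals, stored);
```

Both approaches produce the same totals. The difference is where the loop runs: once per report in the first case, once per event in the second.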

Canonical form (with Bucket Pattern)

In MongoDB's Cost of Not Knowing event-counter case study, the Computed Pattern is applied on top of concepts/bucket-pattern: each bucket stores pre-aggregated per-day status totals rather than raw events.

// appV5R3 document (Part 2 winner: quarter-bucketed, per-day computed)
// Status counters: a = approved, n = noFunds, p = pending, r = rejected
{
  _id: <key+year+quarter>,
  items: [
    { date: ISODate("2022-06-05"), a: 10, n: 3 },
    { date: ISODate("2022-06-16"), p: 1, r: 1 },
    { date: ISODate("2022-06-27"), a: 5, r: 1 },
    ...
  ]
}

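A new event is not appended as a raw row; instead, the existing items entry for its date is incremented via $inc. $inc treats a missing counter field as zero and creates it, so the first event of a given status on an existing day needs no special handling. A missing bucket or a missing entry for the date is a different case: the write needs an upsert for the bucket and must add the per-date entry itself, because an $inc that targets the matching items element has nothing to match when that element is absent. By the time a report runs, the sums already exist: the aggregation pipeline only filters and sums per-bucket totals and never touches per-event raw data.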
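The write and read paths described above can be sketched as an in-memory simulation. This is not the case study's update code: a Map stands in for the collection, dates are ISO strings rather than BSON dates, and the helper names (bucketId, applyEvent, report) and the "key-year-Qn" id format are assumptions made for illustration.

```javascript
// Hypothetical bucket id built from key + year + quarter, e.g. "k1-2022-Q2".
// The separator and the quarter encoding are assumptions.
function bucketId(key, isoDate) {
  const [year, month] = isoDate.split("-").map(Number);
  const quarter = Math.floor((month - 1) / 3) + 1;
  return `${key}-${year}-Q${quarter}`;
}

// Write path: find (or create) the bucket and the per-day entry,
// then increment the status counter instead of storing the raw event.
function applyEvent(collection, event) {
  const id = bucketId(event.key, event.date);
  let bucket = collection.get(id);
  if (!bucket) {
    bucket = { _id: id, items: [] }; // plays the role of an upsert
    collection.set(id, bucket);
  }
  let entry = bucket.items.find((item) => item.date === event.date);
  if (!entry) {
    entry = { date: event.date };
    bucket.items.push(entry);
  }
  // Same effect as $inc on a counter: a missing field counts as zero.
  entry[event.status] = (entry[event.status] || 0) + 1;
  return bucket;
}

// Read path: filter per-day entries by date range and sum the stored totals.
function report(collection, key, fromDate, toDate) {
  const totals = { a: 0, n: 0, p: 0, r: 0 };
  for (const bucket of collection.values()) {
    if (!bucket._id.startsWith(`${key}-`)) continue;
    for (const item of bucket.items) {
      // ISO date strings compare correctly as plain strings.
      if (item.date < fromDate || item.date > toDate) continue;
      for (const status of Object.keys(totals)) {
        totals[status] += item[status] || 0;
      }
    }
  }
  return totals;
}

const collection = new Map();
const incoming = [
  { key: "k1", date: "2022-06-05", status: "a" },
  { key: "k1", date: "2022-06-05", status: "a" },
  { key: "k1", date: "2022-06-05", status: "n" },
  { key: "k1", date: "2022-06-16", status: "r" },
  { key: "k1", date: "2022-07-01", status: "p" },
];
for (const e of incoming) applyEvent(collection, e);

const juneTotals = report(collection, "k1", "2022-06-01", "2022-06-30");
console.log(juneTotals); // { a: 2, n: 1, p: 0, r: 1 }
```

Five events end up as three per-day entries spread over two quarter buckets, and the report never iterates over individual events.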

Trade-offs

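  • Write amplification. Every event triggers an upsert + $inc against the bucket document. Concurrent writes to the same bucket contend under WiredTiger's document-level concurrency control: conflicting updates to one document are retried, so writes to a hot bucket effectively serialize.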
  • Field cardinality. The Computed Pattern commits to the set of aggregates at schema design time (a, n, p, r in the case study). Adding a new status category (e.g. x) means a schema change.
  • Double-counting risk on replays. If an event processor retries a batch, naïve $inc double-counts. Deduplication responsibility shifts to the application (idempotency keys, client-side de-duplication before increment, or a "seen events" collection).
  • Lossy for individual-event queries. The raw {date, amount, user, metadata} tuple is not stored; you can't answer "which event caused this increment?" from the computed document alone.
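The double-counting trade-off can be illustrated with a minimal idempotency sketch. The eventId field, the makeCounter helper, and the in-memory Set are assumptions standing in for an idempotency key and a "seen events" collection; a real deployment would also need the seen-check and the increment to be applied atomically, which this sketch does not address.

```javascript
// Counter that ignores events it has already applied.
function makeCounter() {
  const seen = new Set(); // stands in for a "seen events" collection
  const totals = {};
  return {
    totals,
    apply(event) {
      if (seen.has(event.eventId)) return false; // replayed event: skip
      seen.add(event.eventId);
      totals[event.status] = (totals[event.status] || 0) + 1;
      return true;
    },
  };
}

const batch = [
  { eventId: "e1", status: "a" },
  { eventId: "e2", status: "a" },
  { eventId: "e3", status: "r" },
];

const counter = makeCounter();
batch.forEach((e) => counter.apply(e)); // first delivery
batch.forEach((e) => counter.apply(e)); // retried batch: no effect
console.log(counter.totals); // { a: 2, r: 1 }
```

Without the seen-check, the retried batch would leave the totals at a: 4, r: 2.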

Relationship to other MongoDB schema patterns

  • concepts/bucket-pattern — natural companion. Bucket groups events into windows; Computed pre-aggregates inside the bucket.
  • patterns/dynamic-schema-field-name-encoding — further shrinks the Computed structure: instead of items: [{ date, a, n, p, r }], store items: { "0605": {a, n, p, r}, "0616": ... } using the date as a field name. Same pre-aggregation, smaller on-disk form.
  • Materialized views / $merge aggregation. Periodic computed outputs written to a separate collection are a batch version of the same idea; the Computed Pattern is the online, per-write version.
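The reshaping from the array form to the field-name form mentioned above can be sketched as a small conversion. The function name toFieldNameEncoding and the ISO-string dates are assumptions for illustration; only the "MMDD" key shape comes from the text.

```javascript
// Convert items: [{ date, ...counters }] into items: { "MMDD": { ...counters } }.
function toFieldNameEncoding(items) {
  const encoded = {};
  for (const { date, ...counters } of items) {
    const [, month, day] = date.split("-"); // "2022-06-05" -> "06", "05"
    encoded[month + day] = counters;
  }
  return encoded;
}

const items = [
  { date: "2022-06-05", a: 10, n: 3 },
  { date: "2022-06-16", p: 1, r: 1 },
];

const encodedItems = toFieldNameEncoding(items);
console.log(encodedItems);
```

The counters are unchanged; the date moves out of each entry and into the field name, which drops the repeated date key and the array wrapper.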

Seen in

  • sources/2025-10-09-mongodb-cost-of-not-knowing-mongodb-part-3-appv6r0-to-appv6r4: the pattern is load-bearing across the entire Part 2 appV5RX family, which Part 3 builds on. Part 2's best result (appV5R3) combined Bucket (quarter-bucketed, _id = key + year + quarter) with Computed (per-day {a, n, p, r} status totals pre-aggregated at write time via $inc). For 500 M input events this produced 33 M documents, a 385 B average document size, 11.96 GB of data, and a 1.11 GB index. Part 3 inherits this design and reshapes only the inner structure to a dynamic schema.