PATTERN Cited by 1 source

Embedded decode recipe in frame

Problem

Format-aware compression produces per-use-case configurations (Plans, dictionaries, graphs). If the decoder needs out-of-band knowledge of which config produced a given frame, three things break:

  1. Frames stop being self-contained. Consumers need a catalog / config-ID → config-body lookup.
  2. Version skew is everyone's problem. Producers and consumers have to negotiate config versioning.
  3. Binaries and configs become tightly coupled. Once each config effectively needs its own binary, a specialized decompressor per format becomes the rational choice.

Solution

Embed the resolved decode recipe in the frame itself. Each frame carries enough structural data (an enumeration of the transforms used, plus their parameters) for the decoder to execute the inverse sequence without any lookup.
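A minimal sketch of the idea, not OpenZL's actual frame format: the frame layout (4-byte length, JSON recipe, payload), the transform names `delta` and `deflate`, and the functions `encode_frame` / `decode_frame` are all assumptions made for illustration.

```python
import json
import struct
import zlib

# Hypothetical transform library. The decoder only needs the inverse table.
def delta_encode(data: bytes, params: dict) -> bytes:
    prev, out = 0, bytearray()
    for b in data:
        out.append((b - prev) & 0xFF)  # store difference from previous byte
        prev = b
    return bytes(out)

def delta_decode(data: bytes, params: dict) -> bytes:
    prev, out = 0, bytearray()
    for d in data:
        prev = (prev + d) & 0xFF       # running sum restores the original
        out.append(prev)
    return bytes(out)

def deflate(data: bytes, params: dict) -> bytes:
    return zlib.compress(data, params.get("level", 6))

def inflate(data: bytes, params: dict) -> bytes:
    return zlib.decompress(data)

FORWARD = {"delta": delta_encode, "deflate": deflate}
INVERSE = {"delta": delta_decode, "deflate": inflate}

def encode_frame(data: bytes, recipe: list) -> bytes:
    """Apply each step in order, then prepend the recipe to the payload."""
    payload = data
    for step in recipe:
        payload = FORWARD[step["op"]](payload, step.get("params", {}))
    header = json.dumps(recipe, separators=(",", ":")).encode("utf-8")
    return struct.pack(">I", len(header)) + header + payload

def decode_frame(frame: bytes) -> bytes:
    """Read the embedded recipe and run the inverse steps in reverse order."""
    (header_len,) = struct.unpack_from(">I", frame, 0)
    recipe = json.loads(frame[4:4 + header_len])
    payload = frame[4 + header_len:]
    for step in reversed(recipe):
        payload = INVERSE[step["op"]](payload, step.get("params", {}))
    return payload

data = bytes(range(200)) * 3
recipe = [{"op": "delta"}, {"op": "deflate", "params": {"level": 9}}]
frame = encode_frame(data, recipe)
restored = decode_frame(frame)  # no config store or config ID involved
```

Note that `decode_frame` takes only the frame: everything it needs to pick and order the inverse transforms travels inside it.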

In OpenZL (Source: sources/2025-10-06-meta-openzl-an-open-source-format-aware-compression-framework):

"While compressing, the encoder turns the Plan into a concrete recipe — the Resolved Graph. If the Plan has control points, it picks the branch that fits the data and records that choice into the frame. Each frame chunk carries its own resolved graph. The single decoder checks it, enforces limits, and runs the steps in order."

The Plan is the configuration object used to produce frames; the Resolved Graph is what actually ships in the frame. Decoders only see Resolved Graphs — they're Plan-agnostic.
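The Plan versus Resolved Graph split can be sketched as follows. This is an illustration under assumptions, not OpenZL's API: the `Plan` class, its two branches, the monotonicity test used as the control point, and the `resolve` function are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Plan:
    """Hypothetical Plan with a single control point choosing between two branches."""
    sorted_branch: list     # steps to use when the chunk is non-decreasing
    fallback_branch: list   # steps to use otherwise

def resolve(plan: Plan, chunk: bytes) -> list:
    """Pick the branch that fits this chunk and return a concrete step list.

    The returned list is what would be recorded in the frame chunk. It
    contains no reference back to the Plan or to the branch condition.
    """
    is_sorted = all(a <= b for a, b in zip(chunk, chunk[1:]))
    branch = plan.sorted_branch if is_sorted else plan.fallback_branch
    return [dict(step) for step in branch]  # copy so frames never alias the Plan

plan = Plan(
    sorted_branch=[{"op": "delta"}, {"op": "deflate"}],
    fallback_branch=[{"op": "deflate"}],
)
chunks = [bytes([1, 2, 3, 10, 20]), bytes([9, 1, 7, 3])]
resolved = [resolve(plan, c) for c in chunks]
```

Two chunks compressed under the same Plan end up with different resolved step lists, and a decoder handed either list never needs to know which Plan or which branch condition produced it.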

What this pattern buys

  • Self-contained frames. A frame is sufficient to decode itself. Useful for long-term archives, cross-organization transfer, and any scenario where the producer's config store isn't reachable by the consumer.
  • Universal decoder. The decoder implements a fixed library of inverse transforms; frames specify how to compose them. No per-Plan decoder.
  • Rollout decoupling. A new Plan can produce frames that decode immediately on any current-binary decoder, because the Resolved Graph is expressed in transforms the decoder already knows.
  • Security posture. Safety limits live in the decoder, not the frame. Corrupted or malicious frames can't escape the decoder's resource bounds — "the single decoder checks it, enforces limits" (Source: sources/2025-10-06-meta-openzl-an-open-source-format-aware-compression-framework).
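The "checks it, enforces limits" step might look like the sketch below. The limit values, the `KNOWN_OPS` set, the `FrameRejected` exception, and both function names are assumptions for illustration; the source does not describe which limits OpenZL enforces or how.

```python
import zlib

# Assumed decoder-side limits. They live in the decoder, never in the frame.
MAX_STEPS = 16
MAX_OUTPUT_BYTES = 1 << 20  # 1 MiB

KNOWN_OPS = {"delta", "deflate"}  # hypothetical fixed inverse-transform library

class FrameRejected(Exception):
    """Raised when a frame's recipe or payload violates decoder limits."""

def validate_recipe(recipe) -> None:
    """Reject recipes that are malformed, too long, or name unknown transforms."""
    if not isinstance(recipe, list) or len(recipe) > MAX_STEPS:
        raise FrameRejected("recipe is not a list or has too many steps")
    for step in recipe:
        op = step.get("op") if isinstance(step, dict) else None
        if not isinstance(op, str) or op not in KNOWN_OPS:
            raise FrameRejected(f"unknown transform in step: {step!r}")

def bounded_inflate(data: bytes, limit: int = MAX_OUTPUT_BYTES) -> bytes:
    """Decompress, but never produce more than `limit` bytes of output."""
    d = zlib.decompressobj()
    out = d.decompress(data, limit + 1)  # ask for one byte past the limit
    if len(out) > limit:
        raise FrameRejected("output exceeds decoder limit")
    if not d.eof:
        raise FrameRejected("truncated or corrupt stream")
    return out
```

Because the bounds are constants of the decoder, a hostile frame cannot raise them by declaring a larger size or a longer recipe.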

Costs

  • Per-frame overhead. The Resolved Graph has to fit in the frame, which implies a per-frame header cost. OpenZL's post doesn't disclose the size; the implication is that it is small relative to the compressed payload for typical frame sizes.
  • Decoder coverage. The decoder's primitive set has to be a superset of the transforms used by every Plan. Adding a new transform requires a decoder update before frames that use it can be decoded. This is a one-time cost per new primitive, not a per-format cost.
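To get a feel for how a fixed-size header amortizes over the payload, here is a small calculation. It assumes a hypothetical header made of a 4-byte length field plus a compact JSON recipe; it says nothing about OpenZL's actual header encoding or size, which the source does not disclose.

```python
import json

def header_overhead(recipe: list, payload_len: int) -> float:
    """Fraction of the frame occupied by the (hypothetical) recipe header."""
    recipe_bytes = json.dumps(recipe, separators=(",", ":")).encode("utf-8")
    header_len = 4 + len(recipe_bytes)  # assumed 4-byte length prefix
    return header_len / (header_len + payload_len)

recipe = [{"op": "delta"}, {"op": "deflate", "params": {"level": 9}}]

# Overhead for a small, medium, and large compressed payload (sizes in bytes).
overheads = {n: header_overhead(recipe, n) for n in (100, 10_000, 1_000_000)}
```

The header size depends only on the recipe, so its share of the frame shrinks as the payload grows; the cost matters mostly for very small frames.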

Canonical instances

  • OpenZL (Meta, 2025) — canonical wiki instance. Resolved Graph embedded per frame chunk.

Contrast: version-ID + out-of-band config

Many systems ship a config-ID (a short identifier) in the frame, and expect the decoder to look up the config body out of band (a distributed config service, a sidecar file, a negotiated value). That isn't this pattern — it saves frame bytes but re-introduces the "frame ↔ config version skew" problem this pattern was designed to avoid. Meta's OpenZL architecture chose the self-contained variant.
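For comparison, a sketch of the config-ID variant and how it fails under version skew. The frame layout (a 4-byte config ID prefix), the registries, and the names `lookup_recipe` and `UnknownConfig` are hypothetical.

```python
import struct

class UnknownConfig(Exception):
    """The frame names a config this decoder has never been told about."""

def lookup_recipe(frame: bytes, registry: dict) -> list:
    """Resolve the frame's config ID against an out-of-band registry."""
    (config_id,) = struct.unpack_from(">I", frame, 0)
    try:
        return registry[config_id]
    except KeyError:
        raise UnknownConfig(f"config {config_id} not in local registry") from None

# The producer has rolled out config 2; the consumer's registry is stale.
producer_registry = {1: [{"op": "deflate"}], 2: [{"op": "delta"}, {"op": "deflate"}]}
consumer_registry = {1: [{"op": "deflate"}]}

frame = struct.pack(">I", 2) + b"...compressed payload..."

try:
    lookup_recipe(frame, consumer_registry)
    decodable = True
except UnknownConfig:
    decodable = False  # the bytes are intact, but the frame cannot be decoded
```

The frame is a few bytes smaller, but decodability now depends on the state of the consumer's registry rather than on the frame alone.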

Seen in
