PATTERN

Graceful upgrade via monoversion decoder

Problem

Compression configs evolve — new Plans beat old ones as data drifts, new transforms get added to the library, schemas shift. Traditional approaches require producer + consumer coordination on format versions: the frame header declares a version, decoders branch on it, and upgrading the format means shipping a new decoder binary and waiting for it to roll out across the entire consumer fleet before any producer can adopt it.

For Meta's data-center use cases, that coordination is prohibitive.

Solution

Keep the decoder binary version-stable across config evolution. New Plans produce frames that decode correctly with the same decoder binary that decodes old frames, because the frame carries its resolved graph expressed in transform primitives that are already in the decoder's library.

Two ingredients:

  1. Universal decoder — one binary executes any Resolved Graph it's given.
  2. Embedded decode recipe — frames are self-contained.

Together they mean: old Plan frames ↔ any decoder. New Plan frames ↔ any decoder. Plan upgrades don't need decoder upgrades.
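Both ingredients can be sketched together. The names below (FORWARD, INVERSE, encode, decode, the frame dict shape) are illustrative assumptions, not OpenZL's API; the point is that a new Plan is just a new recipe over primitives the decoder already ships, so one decode function handles frames from any Plan, old or new.

```python
# Illustrative sketch of a monoversion decoder. The frame is self-contained:
# it names the transforms that produced it, and the decoder inverts them
# using a fixed primitive library. No version branch anywhere.

FORWARD = {
    # transform name -> forward (encode) direction
    "delta":   lambda xs: [xs[0]] + [b - a for a, b in zip(xs, xs[1:])],
    "reverse": lambda xs: list(reversed(xs)),
}
INVERSE = {
    # transform name -> inverse (decode) direction
    "delta":   lambda xs: [sum(xs[:i + 1]) for i in range(len(xs))],
    "reverse": lambda xs: list(reversed(xs)),
}

def encode(data, recipe):
    """A 'Plan' here is just a recipe: an ordered composition of primitives."""
    for step in recipe:
        data = FORWARD[step](data)
    return {"recipe": recipe, "payload": data}

def decode(frame):
    """Run the frame's embedded recipe backwards through the library."""
    data = frame["payload"]
    for step in reversed(frame["recipe"]):
        data = INVERSE[step](data)
    return data
```

Shipping a new Plan means shipping a new recipe to producers; consumers keep running the same decode, because the frame carries everything the decoder needs beyond its primitive library.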

Canonical instance

OpenZL (Meta, 2025):

"Train a plan offline, try it on a small slice, then roll it out like any other config change. Backward compatibility is built-in — old frames still decode while new frames get better." (Source: sources/2025-10-06-meta-openzl-an-open-source-format-aware-compression-framework.)

And:

"A decoder update (security or performance — SIMD kernels, memory bounds, scheduling) benefits every compressed file, even those that predate the change."

The same property runs in reverse: a new decoder binary benefits every frame, including old ones. Decoder improvements (SIMD kernels, fuzz-found bounds fixes, scheduler improvements) retroactively improve the entire historical corpus.

Operational properties this gives you

Described in the post:

  • Roll out new Plans without waiting on consumer fleets. Plan rollout is a config change in Managed Compression; consumers transparently decode the new frames.
  • Security + performance decoder changes pay back across history. A SIMD optimization written today makes every frame ever produced faster.
  • Patching + rollout are uneventful by design. Same binary, same CLI, same metrics and dashboards across datasets and Plans.
  • Continuous training. With one decoder + many Plans, Meta can keep improving compression while the system is live.

Contrast: version-branched decoders

The alternative most formats use:

frame_header { version: u16, … }

decoder(frame) {
    match frame.version {
        1 => decode_v1(frame),
        2 => decode_v2(frame),
        3 => decode_v3(frame),
        _ => error("unsupported version"),  // a newer version forces a decoder release
    }
}

Each new version ships a new decoder code path. Old frames stay decodable only as long as the matching old code path is kept around. Adding a new version requires a new decoder release.

The monoversion-decoder pattern doesn't have this — there's no match version branch at the top. The frame says "here is the DAG of transforms to run in reverse," and the decoder runs it.

Preconditions

This pattern works when:

  • The decoder's primitive library is expressive enough that new Plans can be expressed as compositions of existing primitives. When that's false (a genuinely new transform is needed), the decoder binary does have to be updated — but that's a capability add, not a per-format add.
  • Safety limits are decoder-enforced, not frame-declared. Trust boundaries live in the binary, not in the data. The post is explicit: "The single decoder checks it, enforces limits, and runs the steps in order."
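A minimal sketch of the second precondition, assuming a registry of known primitive names; the constants and function names are illustrative, not from OpenZL. Because limits are compiled into the decoder and never read from the frame, a hostile frame cannot raise them.

```python
# Sketch: trust boundaries live in the decoder binary, not in the data.
KNOWN_PRIMITIVES = frozenset({"delta", "reverse", "huffman"})
MAX_RECIPE_STEPS = 64  # decoder-owned cap on embedded graph size

def validate_recipe(recipe):
    """Reject a frame whose embedded graph exceeds decoder-owned limits."""
    if len(recipe) > MAX_RECIPE_STEPS:
        raise ValueError("recipe exceeds decoder step limit")
    unknown = [step for step in recipe if step not in KNOWN_PRIMITIVES]
    if unknown:
        raise ValueError(f"unknown primitives: {unknown}")
```

Validation runs before execution, so a frame that names a primitive the binary doesn't ship is rejected up front rather than partially decoded.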
