
PATTERN Cited by 1 source

Zero-copy protobuf decoding

Definition

Zero-copy protobuf decoding is the pattern of parsing Protocol Buffer messages by traversing the wire-format bytes in a single pass without allocating intermediate memory objects or copying data out of the network buffer. It combines the flexibility of runtime reflection (dynamic descriptors, no compile-time codegen required) with the performance of generated code (zero allocations, no object-graph construction).

The problem it solves

Standard protobuf decoders force a choice:

Approach | Pros | Cons
Code generation (codegen) | Fast, zero-overhead at runtime | Requires descriptors at compile time; cannot handle arbitrary user schemas at runtime
Runtime reflection | Fully dynamic, accepts any schema | Slow: builds an object graph in memory, with many small allocations

For services that receive arbitrary user-defined schemas at runtime (e.g., a managed ingestion service accepting any producer's data format), codegen is impossible and reflection is too slow at high throughput.

Mechanism (Zeroparser instantiation)

Databricks' Zeroparser bridges this gap:

  1. Single-pass parsing: traverses wire bytes exactly once.
  2. Zero memory allocations: no intermediate objects; parsed fields are references that point directly into the network-owned buffer.
  3. Rust lifetime system: a compile-time guarantee that the raw wire bytes remain under exclusive network ownership during parsing, so parsed references cannot outlive the buffer. This gives safety without runtime overhead.
  4. Dynamic descriptor support: schemas provided at runtime, yet performance matches or exceeds codegen.

Result: ~1 GB/s protobuf parsing per CPU core with complex schemas (NEOWISE benchmark: nested fields, repeated fields, mixed types).
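
The single-pass, borrow-only style described above can be illustrated with a minimal sketch. This is not Zeroparser's code or API: the names `Value`, `Field`, `Malformed`, and `for_each_field` are hypothetical, and the sketch only walks the top-level protobuf wire format (varint, fixed32, fixed64, and length-delimited fields), leaving out descriptor lookup and nested messages. The lifetime `'a` ties every length-delimited value to the input buffer, so no payload bytes are copied.

```rust
use std::convert::TryFrom;

/// A decoded value. `Bytes` borrows from the input buffer (lifetime `'a`).
#[derive(Debug, PartialEq)]
pub enum Value<'a> {
    Varint(u64),
    Fixed64(u64),
    Fixed32(u32),
    /// Length-delimited payload: string, bytes, nested message, or packed repeated.
    Bytes(&'a [u8]),
}

#[derive(Debug, PartialEq)]
pub struct Field<'a> {
    pub number: u32,
    pub value: Value<'a>,
}

/// Hypothetical error type: the input was truncated or used an unsupported wire type.
#[derive(Debug, PartialEq)]
pub struct Malformed;

/// Reads one base-128 varint and advances the cursor past it.
fn read_varint(buf: &mut &[u8]) -> Result<u64, Malformed> {
    let mut value = 0u64;
    for i in 0..10 {
        let byte = *buf.get(i).ok_or(Malformed)?;
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            *buf = &buf[i + 1..];
            return Ok(value);
        }
    }
    Err(Malformed) // more than 10 bytes is not a valid 64-bit varint
}

/// Splits `n` bytes off the front of the cursor without copying them.
fn take<'a>(buf: &mut &'a [u8], n: usize) -> Result<&'a [u8], Malformed> {
    if n > buf.len() {
        return Err(Malformed);
    }
    let (head, tail) = buf.split_at(n);
    *buf = tail;
    Ok(head)
}

/// Visits every top-level field exactly once, in wire order.
/// The parser itself performs no heap allocation.
pub fn for_each_field<'a>(
    mut buf: &'a [u8],
    mut visit: impl FnMut(Field<'a>),
) -> Result<(), Malformed> {
    while !buf.is_empty() {
        let tag = read_varint(&mut buf)?;
        let number = u32::try_from(tag >> 3).map_err(|_| Malformed)?;
        let value = match tag & 0x7 {
            0 => Value::Varint(read_varint(&mut buf)?),
            1 => {
                let mut raw = [0u8; 8]; // stack array, not a heap allocation
                raw.copy_from_slice(take(&mut buf, 8)?);
                Value::Fixed64(u64::from_le_bytes(raw))
            }
            2 => {
                let len = usize::try_from(read_varint(&mut buf)?).map_err(|_| Malformed)?;
                Value::Bytes(take(&mut buf, len)?)
            }
            5 => {
                let mut raw = [0u8; 4];
                raw.copy_from_slice(take(&mut buf, 4)?);
                Value::Fixed32(u32::from_le_bytes(raw))
            }
            _ => return Err(Malformed), // deprecated groups and unknown wire types
        };
        visit(Field { number, value });
    }
    Ok(())
}
```

A caller passes the received buffer and a closure. Because `Field<'a>` borrows from the buffer, the compiler rejects any attempt to keep a `Value::Bytes` slice after the buffer is dropped or mutated.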

Trade-offs

Advantage | Cost
Codegen-level throughput with runtime flexibility | Implementation complexity (requires a language with ownership semantics, such as Rust)
Zero allocation pressure / GC-free | Parsed data is only valid while the network buffer is live
Single-pass, cache-friendly | Cannot do random-access field lookup during the parse
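
The buffer-lifetime cost in the table can be made concrete with a small sketch. The types `FieldRef` and `OwnedField` and the function `retain_after_recycle` are hypothetical names for illustration, not part of Zeroparser. The sketch assumes a handler that wants to keep one field after the network buffer has been recycled, which forces an explicit copy into owned memory.

```rust
/// Borrowed view of one field payload; usable only while the buffer (`'a`) is live.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct FieldRef<'a> {
    pub number: u32,
    pub bytes: &'a [u8],
}

/// Owned copy that may outlive the network buffer, at the cost of one allocation.
#[derive(Debug, Clone, PartialEq)]
pub struct OwnedField {
    pub number: u32,
    pub bytes: Vec<u8>,
}

impl FieldRef<'_> {
    /// Copies the payload out of the buffer so it can be retained.
    pub fn to_owned_field(&self) -> OwnedField {
        OwnedField { number: self.number, bytes: self.bytes.to_vec() }
    }
}

/// Simulates keeping a field after the buffer it was parsed from is reused.
pub fn retain_after_recycle(number: u32, payload: &[u8]) -> OwnedField {
    // Stand-in for a network-owned receive buffer.
    let mut network_buf = payload.to_vec();
    let kept = {
        let view = FieldRef { number, bytes: &network_buf[..] };
        view.to_owned_field() // explicit copy: the only way to outlive the buffer
    };
    // Permitted only because no `FieldRef` still borrows `network_buf`.
    network_buf.clear();
    kept
}
```

If `view` itself were returned instead of the owned copy, the borrow checker would reject the `network_buf.clear()` call, which is the compile-time form of the "only valid while the buffer is live" constraint.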

Seen in
