
CONCEPT Cited by 1 source

Clarity over efficiency in protocol design

Definition

Design principle stating that in shared schema / protocol design, the primary objective is to leave no ambiguity for implementers — cross-team, cross-language, cross-generation consumers of the schema. Raw serialisation efficiency and even code brevity are secondary goals; the dominant cost a protocol pays is misinterpretation by some future implementer on a platform or team that wasn't in the room when the schema was written.

From the 2024-09-16 Lyft Media post:

"A well-designed protocol should define its messages in a way where it's not only explicit about which fields must be set. This prevents missetting any of the messages during implementation. In other words, good protocols leave no ambiguity for its implementers."
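
One way to read the quote: a message type can state which fields are required and reject instances that omit them, rather than letting a silent default stand in. The following is a minimal Python sketch under that reading. `RideUpdate`, its fields, and `missing_required` are hypothetical names chosen for illustration; none of them come from the Lyft post.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RideUpdate:
    """Hypothetical message; the field names are illustrative only."""
    ride_id: Optional[str] = None   # required
    status: Optional[str] = None    # required
    note: Optional[str] = None      # optional, may legitimately be absent

    # Plain class attribute (no annotation), so it is not a dataclass field.
    REQUIRED = ("ride_id", "status")


def missing_required(msg: RideUpdate) -> List[str]:
    """Return the names of required fields that were left unset."""
    return [name for name in RideUpdate.REQUIRED if getattr(msg, name) is None]


if __name__ == "__main__":
    incomplete = RideUpdate(ride_id="r1")
    print(missing_required(incomplete))
```

The point of the sketch is that "required" is written down next to the fields, so an implementer does not have to infer it from prose or from another team's behaviour.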

Why protocols are different from ordinary code

In day-to-day code, readability and efficiency trade off against each other and the engineer can choose the balance per context. Protocols — especially schemas consumed via codegen across many languages and services — have constraints ordinary code doesn't:

  • Breaking changes cost multiplicatively — every client binary that has compiled the old schema keeps running; a breaking change has to be shipped + adopted across all of them.
  • The author is not the only implementer. A message defined today will be read + written by teams you'll never meet on platforms (Swift, Kotlin, Go, JVM, Python, TS, Rust…) with different idioms and constraints.
  • The contract outlives the channel. Messages persisted in logs, queues, caches, CDC pipelines can be re-read years later under schemas their producer no longer recognises.
  • Every ambiguity is code somewhere. If the schema doesn't say whether payload_size is bytes or kilobytes, every consumer has to resolve the ambiguity — by reading docs, by asking around, or (most likely) by guessing wrong.
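
The payload_size case from the last bullet can be sketched in a few lines of Python. This is an illustration only: the two message classes and the two "team" readers are hypothetical, and the kilobyte convention (1 KB = 1024 bytes) is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmbiguousUpload:
    payload_size: int  # bytes? kilobytes? the schema does not say


@dataclass(frozen=True)
class ExplicitUpload:
    payload_size_bytes: int  # the unit is part of the field name


def team_a_reads(msg: AmbiguousUpload) -> int:
    """Hypothetical team A assumes the field is already in bytes."""
    return msg.payload_size


def team_b_reads(msg: AmbiguousUpload) -> int:
    """Hypothetical team B assumes kilobytes and converts to bytes."""
    return msg.payload_size * 1024


if __name__ == "__main__":
    msg = AmbiguousUpload(payload_size=512)
    # Same message, two incompatible interpretations.
    print(team_a_reads(msg), team_b_reads(msg))
```

With `ExplicitUpload` there is nothing left to interpret: the name carries the unit, at the cost of a longer identifier.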

The corollary: schemas should prefer more bytes + more verbosity when it removes an ambiguity. A 1-byte-larger message that everyone reads the same way is strictly better than a 1-byte-smaller message two teams interpret differently.
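
As a rough illustration of the one-byte trade, the sketch below uses a hand-rolled binary layout built with Python's `struct` module. It is not the protobuf wire format, and `SizeUnit`, `encode_compact`, `encode_explicit`, and `decode_explicit` are hypothetical names introduced for the example.

```python
import struct
from enum import IntEnum


class SizeUnit(IntEnum):
    BYTES = 1
    KILOBYTES = 2


def encode_compact(size: int) -> bytes:
    # 4-byte big-endian unsigned int; the unit lives only in documentation.
    return struct.pack(">I", size)


def encode_explicit(size: int, unit: SizeUnit) -> bytes:
    # The same 4-byte value, preceded by a 1-byte unit tag.
    return struct.pack(">BI", int(unit), size)


def decode_explicit(data: bytes) -> int:
    """Return the size in bytes, whichever unit the producer chose."""
    unit, size = struct.unpack(">BI", data)
    # SizeUnit(unit) raises ValueError on an unknown tag instead of guessing.
    if SizeUnit(unit) is SizeUnit.KILOBYTES:
        return size * 1024
    return size


if __name__ == "__main__":
    compact = encode_compact(512)
    explicit = encode_explicit(512, SizeUnit.KILOBYTES)
    print(len(explicit) - len(compact), decode_explicit(explicit))
```

The explicit encoding is one byte longer, and every reader that decodes it arrives at the same number of bytes.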

Clarity levers in protobuf

Concrete manifestations of the principle, each canonicalised as a separate wiki page:

Each one adds wire bytes, codegen size, or typing effort, and each one removes a specific ambiguity the reader would otherwise have to resolve by reading docs or guessing. The principle says the trade is a good one.

When efficiency actually matters more

The principle isn't absolute. Cases where efficiency wins:

  • Hot-path serialisation of fixed-shape internal data — if one team owns producer + consumer, deploys both atomically, and the schema is a performance boundary (e.g. RPC between two internal services on a 10 Gbps link), the extra bytes matter more than the reader-inference risk.
  • Embedded / IoT — where every byte on a constrained radio link costs power.
  • High-fan-out pub/sub where the message is replicated to many consumers and every redundant byte is multiplied.
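
The fan-out case is simple multiplication, which a short Python sketch makes concrete. The function name and every number in the usage example are made up for illustration and do not describe any real system.

```python
def redundant_bytes_per_second(extra_bytes_per_message: int,
                               messages_per_second: int,
                               consumers: int) -> int:
    """Extra traffic caused by per-message overhead replicated to every consumer."""
    return extra_bytes_per_message * messages_per_second * consumers


if __name__ == "__main__":
    # Hypothetical workload: 8 extra bytes, 50,000 messages/s, 200 consumers.
    print(redundant_bytes_per_second(8, 50_000, 200))
```

Under those assumed numbers, 8 extra bytes per message become 80,000,000 bytes per second of additional traffic, which is the kind of budget where trimming the schema can be worth the loss of explicitness.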

But these are the exceptions. The Lyft post's position is that, for shared schemas, the default assumption should be that the cost of ambiguity outweighs the cost of extra bytes.

Seen in
