CONCEPT Cited by 1 source
Clarity over efficiency in protocol design¶
Definition¶
Design principle stating that in shared schema / protocol design, the primary objective is to leave no ambiguity for implementers — cross-team, cross-language, cross-generation consumers of the schema. Raw serialisation efficiency and even code brevity are secondary goals; the dominant cost a protocol pays is misinterpretation by some future implementer on a platform or team that wasn't in the room when the schema was written.
From the 2024-09-16 Lyft Media post:
"A well-designed protocol should define its messages in a way where it's not only explicit about which fields must be set. This prevents missetting any of the messages during implementation. In other words, good protocols leave no ambiguity for its implementers."
Why protocols are different from ordinary code¶
In day-to-day code, readability and efficiency trade off against each other and the engineer can choose the balance per context. Protocols — especially schemas consumed via codegen across many languages and services — have constraints ordinary code doesn't:
- Breaking changes cost multiplicatively — every client binary that has compiled the old schema keeps running; a breaking change has to be shipped + adopted across all of them.
- The author is not the only implementer. A message defined today will be read + written by teams you'll never meet on platforms (Swift, Kotlin, Go, JVM, Python, TS, Rust…) with different idioms and constraints.
- The contract outlives the channel. Messages persisted in logs, queues, caches, CDC pipelines can be re-read years later under schemas their producer no longer recognises.
- Every ambiguity is code somewhere. If the schema doesn't say whether payload_size is bytes or kilobytes, every consumer has to resolve the ambiguity: by reading docs, by asking around, or (most likely) by guessing wrong.
The corollary: schemas should prefer more bytes + more verbosity when it removes an ambiguity. A 1-byte-larger message that everyone reads the same way is strictly better than a 1-byte-smaller message two teams interpret differently.
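As a minimal sketch of that trade, assume a hypothetical upload message (the package, message, and field layout below are illustrative and do not come from the Lyft post). The first version leaves the unit to each consumer's guess; the second states it in the schema itself. In protobuf's binary encoding, field names are not sent on the wire, so this particular fix costs only source verbosity.

```proto
syntax = "proto3";

package example.clarity.v1;  // hypothetical package name

// Ambiguous: bytes? kilobytes? Every consumer has to find out or guess.
message UploadChunkAmbiguous {
  uint64 payload_size = 1;
}

// Explicit: the unit is part of the contract and shows up in every
// generated accessor (payload_size_bytes, getPayloadSizeBytes, ...).
message UploadChunk {
  // Size of the payload in bytes (not kilobytes).
  uint64 payload_size_bytes = 1;
}
```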
Clarity levers in protobuf¶
Concrete manifestations of the principle, each canonicalised as a separate wiki page:
- Explicit discriminators. Use oneof (tagged union) rather than enum kind + optional fields (implicit correctness contract). See patterns/oneof-over-enum-plus-field.
- Explicit semantics in names. payload_size_bytes, not payload_size. See concepts/unit-suffix-field-naming.
- Explicit sentinels. Reserve 0 for UNKNOWN / UNSPECIFIED on every enum. See concepts/unknown-zero-enum-value.
- Explicit presence. optional label (proto3, protoc ≥ 3.15) or wrapper types to distinguish absent from default on scalars. See concepts/proto3-explicit-optional.
- Explicit constraints. Declare validation rules inline with protoc-gen-validate / protovalidate.
- Well-known types over primitives. google.protobuf.Timestamp over uint64; the unit and timezone (UTC) are fixed by the type instead of being left to convention.
Each one adds wire bytes, codegen size, or typing effort, and each one removes a specific ambiguity that the reader would otherwise have to resolve by reading docs or guessing. The principle says the trade is a good one.
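The levers can be sketched together in one schema. Everything named here (MediaAsset, Codec, the field names and numbers) is hypothetical and chosen only for illustration. The validation option assumes the protovalidate dependency (buf/validate/validate.proto) is available to the compiler; without it, drop that import and the bracketed option.

```proto
syntax = "proto3";

package example.media.v1;  // hypothetical package name

import "buf/validate/validate.proto";  // assumption: protovalidate is a dependency
import "google/protobuf/timestamp.proto";

// Explicit sentinel: 0 means "not set / unknown to this reader".
enum Codec {
  CODEC_UNSPECIFIED = 0;
  CODEC_H264 = 1;
  CODEC_AV1 = 2;
}

message ImageSource {
  string url = 1;
}

message VideoSource {
  string url = 1;
  Codec codec = 2;
}

message MediaAsset {
  // Explicit discriminator: at most one member of the oneof can be set,
  // so "kind says video but the image field is filled" cannot be expressed.
  oneof source {
    ImageSource image = 1;
    VideoSource video = 2;
  }

  // Explicit semantics in the name, explicit constraint inline.
  uint64 payload_size_bytes = 3 [(buf.validate.field).uint64.gt = 0];

  // Explicit presence: "no duration recorded" is distinguishable from 0.
  optional uint32 duration_seconds = 4;

  // Well-known type instead of a bare uint64 epoch value.
  google.protobuf.Timestamp created_at = 5;
}
```

Compared with a flat message of primitives, this version is longer and slightly larger on the wire (the nested Timestamp adds a tag and a length prefix, for example), which is the cost the principle accepts.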
When efficiency actually matters more¶
The principle isn't absolute. Cases where efficiency wins:
- Hot-path serialisation of fixed-shape internal data — if one team owns producer + consumer, deploys both atomically, and the schema is a performance boundary (e.g. RPC between two internal services on a 10 Gbps link), the extra bytes matter more than the reader-inference risk.
- Embedded / IoT — where every byte on a constrained radio link costs power.
- High-fan-out pub/sub where the message is replicated to many consumers and every redundant byte is multiplied.
But these are non-default cases. The Lyft post's position is that, for shared schemas, the cost of ambiguity should be assumed to outweigh the gains from efficiency.
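For contrast, a sketch of what an efficiency-first schema might look like in one of the exception cases. The message and its fields are hypothetical; the assumption is a single team that owns both producer and consumer and deploys them together. Compact scalar encodings replace the well-known type, and the comments and unit suffixes carry the semantics that the types no longer do.

```proto
syntax = "proto3";

package example.telemetry.internal.v1;  // hypothetical package name

// Internal hot-path frame; not intended as a shared contract.
message SensorFrame {
  // Seconds since the Unix epoch, UTC. fixed32 always takes 4 bytes,
  // while present-day epoch values take 5 bytes as a varint.
  // Trade-off: the value range ends in the year 2106.
  fixed32 sample_time_s = 1;

  // Temperature in hundredths of a degree Celsius. sint32 uses zigzag
  // encoding, so small negative readings stay short on the wire.
  sint32 temperature_centi_c = 2;
}
```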
Seen in¶
- sources/2024-09-16-lyft-protocol-buffer-design-principles-and-practices — canonical statement on the wiki. Lyft Media's post names "clarity" as the first of two principles for protobuf protocol design (with extensibility as the second). The rest of the post is a catalogue of concrete clarity levers — oneof, well-known types, optional label, validation rules, cross-entity constants — each with a before/after.
Related¶
- concepts/extensibility-protocol-design — sibling principle
- concepts/backward-compatibility — the property clarity defends
- concepts/contract-first-design — design method this principle frames
- systems/protobuf — the schema system the principle is grounded in
- concepts/unit-suffix-field-naming / concepts/unknown-zero-enum-value / concepts/proto3-explicit-optional — concrete manifestations
- patterns/oneof-over-enum-plus-field / patterns/protobuf-validation-rules — patterns this principle justifies