SYSTEM Cited by 1 source
Protocol Buffers (protobuf)¶
Protocol Buffers (protobuf.dev) is Google's
open-source, language-neutral, schema-driven binary serialization
format and the default IDL for gRPC. A .proto file
declares messages, enums, and service RPCs; the protoc compiler
generates typed code for C++ / Java / Python / Go / C# / Swift /
Kotlin / JavaScript / etc. Messages serialise to a compact
tag-length-value binary format, and the schema is designed for
forwards-and-backwards-compatible evolution — unknown fields are
preserved on deserialise/re-serialise, field numbers (not names) are
load-bearing on the wire, and the default-value semantics are chosen
to tolerate added and removed fields.
Key language features (proto3)¶
message— record type with numbered fields. Field number, not name, is what appears on the wire.enum— named integer constants.0must be present and is the default value for enum-typed fields; canonical convention is to reserve0asUNKNOWN(see concepts/unknown-zero-enum-value).oneof— tagged union; exactly one of the contained fields may be set at a time. The wire encoding only includes the set branch. See patterns/oneof-over-enum-plus-field.optional(reintroduced in proto3 3.15, 2021) — on singular scalars, addsHasField()presence semantics so callers can distinguish "absent" from "equals type default". Before 3.15, the workaround was thegoogle.protobuf.*Valuewrapper types (StringValue,UInt32Value, …). See concepts/proto3-explicit-optional.map<K, V>— key/value collection; equivalent to arepeatedmessage of{K, V}pairs on the wire.- Well-known types —
google.protobuf.Timestamp,Duration,Any, the*Valuewrappers; standardised imports every codegen supports.
Why protobuf (vs JSON, vs Cap'n Proto, vs Avro)¶
From the 2024-09-16 Lyft Media post:
- Efficiency — denser on wire than JSON, faster encode/decode.
- Forward/backward compatibility — tag-based encoding tolerates schema drift by design.
- Multi-language codegen — Python / Swift / Kotlin / Java / Go / C++ / TS all first-class.
- Rich tooling — Lyft specifically calls out "rich internal tooling at Lyft, and widespread use in both mobile-to-server and server-to-server domains."
- Validation extensions — declarative validation via
protoc-gen-validate / protovalidate
over the same
.protosurface.
Contrast:
- JSON — human-readable, ubiquitous, but larger wire size, weaker typing, and no first-class schema evolution story.
- Cap'n Proto — zero-copy memory representation ("the in-memory layout is the wire"); protobuf encodes/decodes which costs CPU but gives flexibility. Cap'n Proto drops its schema language in the JS-native Cap'n Web successor.
- Avro — schema stored with the data or in a registry, names on the wire, more dynamic; protobuf uses static codegen and field numbers.
Compatibility rules (must-know)¶
- Never reuse a field number. Reserve numbers / names of removed
fields (
reserved 3, 4; reserved "old_name";) so future edits can't accidentally reassign them. - Changing field type is usually breaking even when it looks
compatible (e.g.
int32→int64works on some languages, breaks on others). - Enums are "open" by default in proto3 — a consumer may receive
unknown numeric values; use
(validate.rules).enum.defined_onlyto close the set at the validation layer. - Adding a new field to a
oneofor moving a field in/out of aoneofis wire-compatible but can change which-branch-is-set semantics in subtle ways; avoid. requiredwas dropped in proto3 explicitly because relaxing it in a proto2 schema was effectively impossible (see concepts/proto3-explicit-optional).
Design practices (Lyft Media, 2024-09-16)¶
Two principles + five practices from sources/2024-09-16-lyft-protocol-buffer-design-principles-and-practices:
- Clarity — concepts/clarity-over-efficiency-in-protocol-design
- Extensibility — concepts/extensibility-protocol-design
- Reserve 0 as
UNKNOWN— concepts/unknown-zero-enum-value - Name fields with units — concepts/unit-suffix-field-naming
- Prefer
oneofover enum + optional field — patterns/oneof-over-enum-plus-field - Explicit
optional/ wrapper types for primitives — concepts/proto3-explicit-optional - Declarative validation in the
.proto— patterns/protobuf-validation-rules via systems/protoc-gen-validate - Cross-entity constants via custom
EnumValueOptionsextensions — patterns/protobuf-cross-entity-constants
Seen in¶
- sources/2024-09-16-lyft-protocol-buffer-design-principles-and-practices — canonical protobuf design-practices post on the wiki. Lyft Media's distillation of team conventions into two principles (clarity, extensibility) + five practices; final consolidated example message demonstrates all of them simultaneously. First Lyft source on this wiki.
Related¶
- systems/grpc — the RPC framework that canonicalised protobuf as its IDL
- systems/protoc-gen-validate — validation plugin (PGV / protovalidate)
- systems/capnproto — alternative zero-copy schema+format
- concepts/schema-evolution — general framing; protobuf's wire format is designed around it
- concepts/backward-compatibility — the overarching discipline
- concepts/contract-first-design —
.protois the contract; codegen produces the implementations - concepts/schema-driven-interface-generation — pattern protobuf instantiates