SYSTEM Cited by 1 source

Protocol Buffers (protobuf)¶

Protocol Buffers (protobuf.dev) is Google's open-source, language-neutral, schema-driven binary serialization format and the default IDL for gRPC. A .proto file declares messages, enums, and service RPCs; the protoc compiler generates typed code for C++ / Java / Python / Go / C# / Swift / Kotlin / JavaScript / etc. Messages serialise to a compact tag-length-value binary format, and the schema is designed for forwards-and-backwards-compatible evolution — unknown fields are preserved on deserialise/re-serialise, field numbers (not names) are load-bearing on the wire, and the default-value semantics are chosen to tolerate added and removed fields.

Key language features (proto3)¶

message — record type with numbered fields. Field number, not name, is what appears on the wire.
enum — named integer constants. 0 must be present and is the default value for enum-typed fields; canonical convention is to reserve 0 as UNKNOWN (see concepts/unknown-zero-enum-value).
oneof — tagged union; exactly one of the contained fields may be set at a time. The wire encoding only includes the set branch. See patterns/oneof-over-enum-plus-field.
optional (reintroduced in proto3 3.15, 2021) — on singular scalars, adds HasField() presence semantics so callers can distinguish "absent" from "equals type default". Before 3.15, the workaround was the google.protobuf.*Value wrapper types (StringValue, UInt32Value, …). See concepts/proto3-explicit-optional.
map<K, V> — key/value collection; equivalent to a repeated message of {K, V} pairs on the wire.
Well-known types — google.protobuf.Timestamp, Duration, Any, the *Value wrappers; standardised imports every codegen supports.

Why protobuf (vs JSON, vs Cap'n Proto, vs Avro)¶

From the 2024-09-16 Lyft Media post:

Efficiency — denser on wire than JSON, faster encode/decode.
Forward/backward compatibility — tag-based encoding tolerates schema drift by design.
Multi-language codegen — Python / Swift / Kotlin / Java / Go / C++ / TS all first-class.
Rich tooling — Lyft specifically calls out "rich internal tooling at Lyft, and widespread use in both mobile-to-server and server-to-server domains."
Validation extensions — declarative validation via protoc-gen-validate / protovalidate over the same .proto surface.

Contrast:

JSON — human-readable, ubiquitous, but larger wire size, weaker typing, and no first-class schema evolution story.
Cap'n Proto — zero-copy memory representation ("the in-memory layout is the wire"); protobuf encodes/decodes which costs CPU but gives flexibility. Cap'n Proto drops its schema language in the JS-native Cap'n Web successor.
Avro — schema stored with the data or in a registry, names on the wire, more dynamic; protobuf uses static codegen and field numbers.

Compatibility rules (must-know)¶

Never reuse a field number. Reserve numbers / names of removed fields (reserved 3, 4; reserved "old_name";) so future edits can't accidentally reassign them.
Changing field type is usually breaking even when it looks compatible (e.g. int32 → int64 works on some languages, breaks on others).
Enums are "open" by default in proto3 — a consumer may receive unknown numeric values; use (validate.rules).enum.defined_only to close the set at the validation layer.
Adding a new field to a oneof or moving a field in/out of a oneof is wire-compatible but can change which-branch-is-set semantics in subtle ways; avoid.
required was dropped in proto3 explicitly because relaxing it in a proto2 schema was effectively impossible (see concepts/proto3-explicit-optional).

Design practices (Lyft Media, 2024-09-16)¶

Two principles + five practices from sources/2024-09-16-lyft-protocol-buffer-design-principles-and-practices:

Clarity — concepts/clarity-over-efficiency-in-protocol-design
Extensibility — concepts/extensibility-protocol-design
Reserve 0 as UNKNOWN — concepts/unknown-zero-enum-value
Name fields with units — concepts/unit-suffix-field-naming
Prefer oneof over enum + optional field — patterns/oneof-over-enum-plus-field
Explicit optional / wrapper types for primitives — concepts/proto3-explicit-optional
Declarative validation in the .proto — patterns/protobuf-validation-rules via systems/protoc-gen-validate
Cross-entity constants via custom EnumValueOptions extensions — patterns/protobuf-cross-entity-constants

Seen in¶

sources/2024-09-16-lyft-protocol-buffer-design-principles-and-practices — canonical protobuf design-practices post on the wiki. Lyft Media's distillation of team conventions into two principles (clarity, extensibility) + five practices; final consolidated example message demonstrates all of them simultaneously. First Lyft source on this wiki.

systems/grpc — the RPC framework that canonicalised protobuf as its IDL
systems/protoc-gen-validate — validation plugin (PGV / protovalidate)
systems/capnproto — alternative zero-copy schema+format
concepts/schema-evolution — general framing; protobuf's wire format is designed around it
concepts/backward-compatibility — the overarching discipline
concepts/contract-first-design — .proto is the contract; codegen produces the implementations
concepts/schema-driven-interface-generation — pattern protobuf instantiates