PATTERN Cited by 1 source
Oneof over enum-plus-field¶
Summary¶
When a protobuf message has a variant nature
— different kinds carry different payloads — model it as a oneof
tagged union rather than as a discriminator enum plus a grab-bag
of per-kind optional fields. The oneof makes the kind and the
payload the same field, eliminates the implicit "kind == C means
only payload_size is valid" contract, and drops wire size because
unset branches don't serialise.
Problem¶
The naive schema for a variant-nature message:
// ⚠️ Anti-pattern
message Event {
  enum Kind {
    EVENT_KIND_A = 0;
    EVENT_KIND_B = 1;
    EVENT_KIND_C = 2;
  }
  uint64 id = 1;
  uint64 timestamp = 2;
  Kind kind = 3;
  uint32 payload_size = 4; // specific to EVENT_KIND_C
}
Multiple problems compound with every new kind:
- Implicit correctness contract. If kind == EVENT_KIND_C, is payload_size required to be set? If kind == EVENT_KIND_A, is it required to be absent? The schema doesn't say. Every consumer and producer has to maintain this correspondence in code.
- Conditional branches on every access. Readers can't access payload_size without first checking kind, and must keep the check in sync with every new kind.
- Cross-kind field pollution. Kind-A-only fields, kind-B-only fields, and kind-C-only fields all sit at the same level; the schema can't express which belong together.
- Every new kind compounds the problem. Adding EVENT_KIND_D means adding its per-kind fields to the top level and updating the cross-field validation logic everywhere it exists.
- Wire overhead. Nothing stops a Kind-A message from carrying payload_size: if a producer sets it (for example, an explicit-presence field set to zero), the tag and value are serialised even though no consumer should read them.
As the 2024-09-16 Lyft post frames it:
"It's nice to work with a protocol that's structured in a way where it explains itself; both as perceived immediately and as proven by iterating on it long term."
The anti-pattern violates that at the schema layer and pushes the burden into caller code.
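To make the implicit contract concrete, here is a minimal sketch of the cross-field check that producers and consumers end up writing by hand. It uses plain Python stand-ins rather than generated protobuf classes: LegacyEvent, Kind and check_legacy_event are hypothetical names, and the rule "payload_size present for kind C, absent otherwise" is an assumed reading of the contract, since the schema itself does not state it.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class Kind(IntEnum):
    # Mirrors the Kind enum in the anti-pattern schema.
    EVENT_KIND_A = 0
    EVENT_KIND_B = 1
    EVENT_KIND_C = 2


@dataclass
class LegacyEvent:
    # Hypothetical stand-in for the generated Event class.
    id: int
    timestamp: int
    kind: Kind
    payload_size: Optional[int] = None  # only meaningful for EVENT_KIND_C


def check_legacy_event(event: LegacyEvent) -> List[str]:
    """Return the contract violations that the schema cannot express."""
    problems = []
    if event.kind == Kind.EVENT_KIND_C and event.payload_size is None:
        problems.append("EVENT_KIND_C requires payload_size")
    if event.kind != Kind.EVENT_KIND_C and event.payload_size is not None:
        problems.append(f"{event.kind.name} must not set payload_size")
    return problems
```

Every new kind adds more branches to a function like this, and nothing ties the function to the schema it is supposed to guard.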
Solution: oneof¶
Model the variant as a oneof union with one sub-message per kind:
// Imports needed for Timestamp and the protoc-gen-validate option.
import "google/protobuf/timestamp.proto";
import "validate/validate.proto";

message Event {
  uint64 id = 1;
  google.protobuf.Timestamp timestamp_utc = 2;
  oneof data_kind {
    option (validate.required) = true; // enforce one branch set
    EventDataA data_a = 3;
    EventDataB data_b = 4;
    EventDataC data_c = 5;
  }
}
message EventDataA {} // empty when no per-kind fields
message EventDataB {}
message EventDataC {
  optional uint32 payload_size_bytes = 1;
}
Properties this buys:
- The discriminator and the payload are the same field. No way to desync them; no way to set a B-payload for an A-kind.
- Each kind's fields live together in their own message. Adding a new kind-specific field only needs to touch one message.
- Generated code exposes WhichOneof('data_kind') or equivalent, making dispatch explicit and exhaustive-checkable in strongly-typed languages.
- Only the set branch serialises. data_a messages don't carry a data_c.payload_size_bytes wire tag.
- Adding EventDataD is a one-line additive change and doesn't touch existing consumers' code until they opt in to handle it.
Example consumer code:
kind = event_pb.WhichOneof('data_kind')
if kind == 'data_a':
    handle_a(event_pb.data_a)
elif kind == 'data_b':
    handle_b(event_pb.data_b)
elif kind == 'data_c':
    handle_c(event_pb.data_c.payload_size_bytes
             if event_pb.data_c.HasField('payload_size_bytes')
             else None)
else:
    handle_unknown(kind)  # new kind added after this consumer
                          # was built
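The same dispatch can also be written as a lookup table, which keeps the set of handled branches in one place. This is only a sketch: dispatch_event and HANDLERS are hypothetical names, the handlers are placeholders, and FakeEvent is a stand-in that exposes just the WhichOneof method generated Python messages provide.

```python
from typing import Any, Callable, Dict, Optional


class FakeEvent:
    """Hypothetical stand-in for a generated Event message."""

    def __init__(self, branch: Optional[str] = None, payload: Any = None):
        self._branch = branch
        if branch is not None:
            setattr(self, branch, payload)

    def WhichOneof(self, oneof_name: str) -> Optional[str]:
        # Generated messages return the name of the set field, or None.
        return self._branch


# One placeholder handler per oneof branch this consumer understands.
HANDLERS: Dict[str, Callable[[Any], str]] = {
    "data_a": lambda data: "handled A",
    "data_b": lambda data: "handled B",
    "data_c": lambda data: f"handled C ({data})",
}


def dispatch_event(event: Any) -> str:
    branch = event.WhichOneof("data_kind")
    handler = HANDLERS.get(branch)
    if handler is None:
        # No handler registered: no branch is set, or this build
        # does not recognise the branch.
        return f"unhandled branch: {branch}"
    return handler(getattr(event, branch))
```

A table like this makes it easy to compare the registered keys against the oneof's field names in a unit test, so a missing handler is caught before deploy rather than at runtime.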
Enforce one-branch-required at the validation layer¶
By default a oneof does NOT require any of its branches to be
set — WhichOneof can return None. The 2024-09-16 Lyft post
flags this as a counterintuitive default and prescribes the fix via
protoc-gen-validate / protovalidate:
oneof data_kind {
  option (validate.required) = true;
  EventDataA data_a = 3;
  EventDataB data_b = 4;
  EventDataC data_c = 5;
}
With option (validate.required) = true; the generated validator
rejects a message where none of data_a / data_b / data_c is
set. See patterns/protobuf-validation-rules — validation still
has to be explicitly invoked.
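Where no validation plugin is wired in, the same rule can be checked by hand at the service boundary. A minimal sketch, assuming only that the message exposes WhichOneof as generated Python classes do; require_oneof_set and the StubMessage stand-in are hypothetical and are not part of protoc-gen-validate or protovalidate.

```python
from typing import Any, Optional


def require_oneof_set(message: Any, oneof_name: str) -> str:
    """Return the name of the set branch, or raise if none is set."""
    branch = message.WhichOneof(oneof_name)
    if branch is None:
        raise ValueError(f"oneof '{oneof_name}' has no branch set")
    return branch


class StubMessage:
    """Hypothetical stand-in for a generated message with one oneof."""

    def __init__(self, branch: Optional[str]):
        self._branch = branch

    def WhichOneof(self, oneof_name: str) -> Optional[str]:
        return self._branch


# A message with data_b set passes; an empty one is rejected.
set_branch = require_oneof_set(StubMessage("data_b"), "data_kind")
try:
    require_oneof_set(StubMessage(None), "data_kind")
    rejected = False
except ValueError:
    rejected = True
```

Like the generated validator, this check only helps if it is actually called on every inbound message.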
Shared-field refactor¶
If two or three kinds share a common subset of fields, duplicating the fields across their per-kind messages is worth the minor repetition cost:
message EventDataA {
  string actor_id = 1; // A and B both have actor_id
}
message EventDataB {
  string actor_id = 1;
  string target_id = 2; // B-specific
}
Alternative: promote the shared field to the outer Event message
when it's genuinely present for every kind. The trap to avoid is
promoting a field to the outer level when most but not all kinds
have it — that reintroduces the optional-field ambiguity the oneof
was supposed to eliminate.
Extensibility note¶
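Moving a field into or out of a oneof does not change its wire
encoding, but it can change which branch consumers see as set:
when several members of one oneof arrive in the same message, the
parser keeps only the last one, so values that older producers set
together can be dropped silently. The 2024-09-16 Lyft post calls
this out as a common protobuf pitfall: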
"there's a few common pitfalls in protobuf, which often revolve around changing the type of a field and rearranging oneof groupings."
Design the oneof membership up-front and avoid moving fields in
and out after deploy.
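One way to see the hazard: protobuf parsers keep only the last member of a oneof that appears on the wire. The sketch below imitates that rule with a deliberately tiny decoder that handles varint fields only; it is not the real protobuf runtime. parse_varint_fields and the field numbers (1 and 2, imagined as two integer fields that a later schema version groups into one oneof) are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one base-128 varint at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_varint_fields(buf: bytes, oneof_members: FrozenSet[int]) -> Dict[int, int]:
    """Parse varint-typed fields, keeping only the last oneof member seen."""
    fields: Dict[int, int] = {}
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x7
        if wire_type != 0:
            raise ValueError("this sketch only understands varint fields")
        value, pos = read_varint(buf, pos)
        if field_number in oneof_members:
            # Setting one member clears every other member of the oneof.
            for other in oneof_members - {field_number}:
                fields.pop(other, None)
        fields[field_number] = value
    return fields


# An old producer set both fields: field 1 = 7, then field 2 = 9.
wire = bytes([0x08, 0x07, 0x10, 0x09])

before = parse_varint_fields(wire, frozenset())       # fields independent
after = parse_varint_fields(wire, frozenset({1, 2}))  # fields share a oneof
```

With the fields independent, both values survive the parse; once they share a oneof, the value in field 1 is discarded even though the bytes on the wire are identical.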
When NOT to use oneof¶
- No branch-specific fields exist: if every kind has the same shape and only the discriminator differs, a simple enum (with _UNKNOWN = 0) is fine.
- Orthogonal rather than mutually exclusive state: if two variants can legitimately coexist (e.g. status + reason), they're separate fields, not a oneof.
- More than 10-20 branches: at that cardinality, a oneof gets awkward to maintain; consider a dedicated container or a polymorphic message ecosystem.
Seen in¶
- sources/2024-09-16-lyft-protocol-buffer-design-principles-and-practices
— canonical motivation. Lyft Media's post uses the
discriminator-enum anti-pattern as the opening example and
evolves it into a oneof across three code-block iterations; flags option (validate.required) = true as the missing-default fix; the final consolidated example combines the pattern with unit-suffixed field names, UNKNOWN enum sentinels, explicit optional, and inline validation.
Related¶
- systems/protobuf — the schema language
- systems/protoc-gen-validate — the validation plugin that enforces one-branch-required
- concepts/clarity-over-efficiency-in-protocol-design — the principle this pattern instantiates
- concepts/extensibility-protocol-design — the sibling principle
- concepts/design-away-invalid-states — related idea at the type-system level
- concepts/unknown-zero-enum-value — complementary fix for enum-only discriminators that stay as enums
- concepts/proto3-explicit-optional — complementary fix for presence on primitive fields
- patterns/protobuf-validation-rules — declarative validation the pattern composes with