protoc-gen-validate / protovalidate

protoc-gen-validate (github.com/bufbuild/protoc-gen-validate) — PGV for short — is a protobuf plugin that generates per-language validation code from declarative rules attached to message fields. protovalidate (github.com/bufbuild/protovalidate) is its modern successor, maintained by Buf, that moves rule evaluation to a shared runtime (CEL-based) so the rule definitions don't need codegen per language. As of the 2024-09-16 Lyft post: PGV is flagged as stable-but-succeeded, protovalidate is the recommended starting point for new work.

What the rules look like

Validation constraints are attached as protobuf options inside the .proto:

import "google/protobuf/timestamp.proto";
import "validate/validate.proto";

message Event {
    string id = 1 [(validate.rules).string = {min_len: 1}];
    google.protobuf.Timestamp timestamp_utc = 2
        [(validate.rules).timestamp = {required: true}];
    oneof data_kind {
        option (validate.required) = true;
        EventDataA data_a = 3;
        EventDataB data_b = 4;
        EventDataC data_c = 5;
    }
}

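Validation is reached through an explicit entry point: in Go, for example, each generated message type gets a Validate() method, while the Python package offers validator.validate(msg). Rule families the Lyft post calls out as most useful: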

  • oneof validation: option (validate.required) = true; inside the oneof forces exactly one branch to be set (by default a oneof may be left unset).
  • message validation: (validate.rules).message.required = true for sub-message presence.
  • enum validation: (validate.rules).enum = { defined_only: true, not_in: [0] } closes the open-enum behaviour proto3 introduced and excludes the conventional UNKNOWN = 0 value (see concepts/unknown-zero-enum-value).
  • string validation: min_len, max_len, uuid, email, ip/ipv4/ipv6, uri, pattern (regex).
  • repeated validation: min_items, max_items, items (validates each element against its type's rules), unique: true for set-semantics.
  • map validation: min_pairs, keys, values, no_sparse: true to require non-null values for message-valued maps.
  • Wrapper types: the same rule family applies to google.protobuf.StringValue etc. as to the wrapped primitive.
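
To make the rule semantics concrete, here is a minimal plain-Python sketch of what three of these families check. It is an illustration only, not PGV's generated code: the Status enum and the check_* helper names are hypothetical, and only a few options are modelled.

```python
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical enum with the conventional zero value."""
    UNKNOWN = 0
    ACTIVE = 1
    DISABLED = 2


def check_string(value, min_len=0, max_len=None):
    """string rules: length is counted in characters, not bytes."""
    errors = []
    if len(value) < min_len:
        errors.append(f"length {len(value)} is below min_len {min_len}")
    if max_len is not None and len(value) > max_len:
        errors.append(f"length {len(value)} is above max_len {max_len}")
    return errors


def check_enum(number, enum_cls, defined_only=False, not_in=()):
    """enum rules: reject undeclared numbers and explicitly excluded ones."""
    errors = []
    declared = {member.value for member in enum_cls}
    if defined_only and number not in declared:
        errors.append(f"{number} is not a defined {enum_cls.__name__} value")
    if number in not_in:
        errors.append(f"{number} is in the not_in list")
    return errors


def check_repeated(items, unique=False):
    """repeated rules: unique requires that no element appears twice."""
    errors = []
    if unique and len(set(items)) != len(items):
        errors.append("items are not unique")
    return errors
```

With defined_only: true and not_in: [0] together, both an undeclared number such as 7 and the UNKNOWN = 0 placeholder are rejected, which is the combination the enum bullet describes.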

Critical caveat — validation is not automatic

Generated validators must be invoked explicitly. Parsing a wire message does not trigger validation. The Lyft post flags this in bold:

"if a message is formed in violation of the stated rules, nothing will fail until its validator is invoked!"

Implication for trust boundaries: every handler that accepts a protobuf from an untrusted source (mobile client → backend, external API → internal service) must call Validate() or it has no enforcement at all. Forgetting it is a silent safety-net gap, not a compile error.

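from protoc_gen_validate.validator import ValidationFailed, validate
from your_pb.event_pb2 import Event as EventPB

event_pb = EventPB(...)  # build or parse the message as usual
try:
    # Rules are only checked here; constructing or parsing never checks them.
    validate(event_pb)
except ValidationFailed as ex:
    # Chain the original exception so the violation details stay in the traceback.
    raise ValueError(f'Protobuf validation error: {ex}') from ex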
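
One way to make the call hard to forget is to wrap every handler at the trust boundary. The sketch below assumes a validate callable that raises on violation (such as the PGV function above). The names validated, fake_validate, and handle_event are hypothetical, and a dict stands in for a real protobuf message so the example runs without generated code.

```python
import functools


def validated(validate):
    """Build a decorator that runs validate(request) before the handler body."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(request, *args, **kwargs):
            # Raises on violation, so the handler never sees an invalid request.
            validate(request)
            return handler(request, *args, **kwargs)
        return wrapper
    return decorator


def fake_validate(request):
    """Stand-in validator for this demo only; mimics a min_len: 1 rule on id."""
    if not request.get("id"):
        raise ValueError("id: value length must be at least 1")


@validated(fake_validate)
def handle_event(request):
    return f"accepted {request['id']}"
```

In a real service the decorator would receive the actual validator instead of fake_validate. Registering handlers only through such a wrapper turns a forgotten call into something a reviewer can spot by its absence.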

Why the .proto is the right home for validation rules

Ordinary runtime validation lives in application code and is reimplemented per language, so the rules drift across clients and services. Declaring them on the schema has three advantages:

  • Single source of truth: the server and every client language generate the same rules from the same .proto.
  • Rule definitions are themselves expressed in protobuf (see validate.proto), so tooling, linting, and IDE support come for free.
  • Reviewable at schema-change time: when a CR adds a branch to a oneof, any change to its validate.required setting shows up in the same diff.

PGV → protovalidate migration

Per the Lyft post's footnote:

"Since recently, PGV has reached a stable state and has been succeeded by protovalidate. While the general idea remains the same, consider using the modernized solution when getting started with validation."

protovalidate differs from PGV along two axes the Lyft post gestures at without detailing: (1) it evaluates rules with CEL (Common Expression Language) in a runtime library for each supported language, instead of generating validation code per language; (2) Buf's own option namespace (buf.validate.field) replaces PGV's validate.rules naming.
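
The first axis can be pictured with a toy contrast in plain Python. This is not CEL and not protovalidate's API: EVENT_RULES, evaluate, and validate_event_generated are hypothetical, and dicts stand in for messages. It only shows the shape of the difference between a per-message function and one generic evaluator that interprets rules as data.

```python
def validate_event_generated(event):
    """Codegen style: one function per message type, rules baked into the code."""
    if len(event["id"]) < 1:
        raise ValueError("id: length is below min_len 1")


# Runtime style: rules are plain data that travel with the schema.
EVENT_RULES = {"id": {"min_len": 1}}


def evaluate(message, rules):
    """One generic evaluator that can serve any message type given its rules."""
    violations = []
    for field, constraints in rules.items():
        value = message.get(field, "")
        min_len = constraints.get("min_len")
        if min_len is not None and len(value) < min_len:
            violations.append(
                f"{field}: length {len(value)} is below min_len {min_len}"
            )
    return violations
```

Adding a rule for another message type in the runtime style means adding data, not another generated function.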
