
PATTERN Cited by 1 source

Protobuf cross-entity constants via custom options

Summary

Attach string (or other primitive) constants to enum values via custom protobuf option extensions on google.protobuf.EnumValueOptions, so that a mobile client and a backend service share the literal value ("#tag1", a feature-flag key, a schema-version string) through the single .proto file rather than through parallel hard-coded constants in two codebases.

Problem

A mobile app and a backend service both need to agree on a literal string tied to an enumerated concept:

enum EventTag {
    EVENT_TAG_UNKNOWN = 0;
    EVENT_TAG_1       = 1;   // represents "#tag1"
    EVENT_TAG_2       = 2;   // represents "#tag2"
}

The enum value (integer) flows over the wire cleanly, but somewhere — when displaying, logging, publishing to a third-party SDK, formatting an analytics event — one of the teams needs the actual string "#tag1".

The usual solutions each have drawbacks:

  • Parallel constants in each language. Mobile hardcodes "#tag1"; backend hardcodes "#tag1". If they drift, bugs surface at integration time.
  • Lookup table per language. Each codebase maintains its own Map<EventTag, String>. Still two sources of truth.
  • String-valued field instead of an enum. Send "#tag1" on the wire. This wastes bytes on every message, opens the field to arbitrary typos, and gives up the closed set of values that the enum was meant to express.

Solution

Declare a custom option on google.protobuf.EnumValueOptions and attach the literal to each enum value:

import "google/protobuf/descriptor.proto";

extend google.protobuf.EnumValueOptions {
    // Use a distant number to avoid accidental collisions.
    // For a small project, picking an arbitrary large prime number
    // should be safe enough.
    string const_value = 11117;
}

enum EventTag {
    EVENT_TAG_UNKNOWN = 0 [(const_value) = ""];
    EVENT_TAG_1       = 1 [(const_value) = "#tag1"];
    EVENT_TAG_2       = 2 [(const_value) = "#tag2"];
}

The const_value annotation is stored in each enum value's descriptor, so generated code in any language whose protobuf runtime exposes descriptor options can read it. The .proto file is now the single source of truth for both the enum-to-integer mapping (wire level) and the enum-to-string mapping (display / SDK / analytics level).

Reading the constant at runtime

Python example (from the Lyft post):

from your_pb import event_pb2

tag_value = (
    event_pb2.EventTag.DESCRIPTOR
    .values_by_name[event_pb2.EventTag.Name(event_pb2.EVENT_TAG_1)]
    .GetOptions()
    .Extensions[event_pb2.const_value]
)
# tag_value == "#tag1"

Other languages expose equivalent descriptor APIs (Go: proto.GetExtension(desc.Options(), event_pb.E_ConstValue); Java: EventTag.EVENT_TAG_1.getValueDescriptor().getOptions().getExtension(EventPb.constValue)).

The descriptor API is verbose in every language, so it is worth wrapping in a small helper:

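def const_value_for(enum_type, number):
    # Python enum values are plain ints, so the enum wrapper
    # (e.g. event_pb2.EventTag) must be passed in alongside the value.
    return (enum_type.DESCRIPTOR
            .values_by_number[number]
            .GetOptions()
            .Extensions[event_pb2.const_value])

# const_value_for(event_pb2.EventTag, event_pb2.EVENT_TAG_1) == "#tag1"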
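
When the constant is read on a hot path, or when a string has to be mapped back to its enum number, the descriptor walk can be done once up front. The following is a minimal sketch, not something taken from the Lyft post: build_const_tables is a hypothetical helper name, and it assumes enum_type is a generated enum wrapper (such as event_pb2.EventTag) and extension is the generated option handle (such as event_pb2.const_value).

```python
def build_const_tables(enum_type, extension):
    """Return (number -> constant, constant -> number) for one enum.

    Hypothetical helper: `enum_type` is a generated enum wrapper and
    `extension` is the custom option handle from the generated module.
    """
    by_number = {}
    by_constant = {}
    for value in enum_type.DESCRIPTOR.values:
        constant = value.GetOptions().Extensions[extension]
        by_number[value.number] = constant
        if not constant:
            # Unset or empty constant: nothing sensible to map back from.
            continue
        if constant in by_constant:
            # Two enum values sharing one literal would make the
            # reverse lookup ambiguous, so fail loudly at startup.
            raise ValueError(
                f"duplicate constant {constant!r} on {value.name}")
        by_constant[constant] = value.number
    return by_number, by_constant
```

Calling it once at import time, for example build_const_tables(event_pb2.EventTag, event_pb2.const_value), turns later lookups into plain dictionary reads.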

Extension field-number choice

The field number for the custom option is global across all extensions of google.protobuf.EnumValueOptions in a given build environment; two extensions picking the same number will collide. Guidance from the 2024-09-16 Lyft post:

"Use a distant number to avoid accidental collisions. For a small project, picking an arbitrary large prime number should be safe enough. For larger projects, tooling can be built to manage field numbers with safety guarantees."

Lyft's example uses 11117. The protobuf documentation sets aside 50000–99999 for in-house use within an individual organization, and Google maintains a global extension registry for options intended for public use; larger organisations typically maintain their own internal registry.
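
The "tooling" the Lyft post mentions is not described there. As one possible shape for it, the sketch below scans .proto sources and reports any field number claimed by more than one option on the same extendee. It is a rough, regex-based illustration rather than a real protobuf parser; find_collisions and the file names in the usage note are hypothetical.

```python
import re
from collections import defaultdict

# Matches `extend some.Message { ... }` and captures the extendee and body.
EXTEND_BLOCK = re.compile(r"extend\s+([\w.]+)\s*\{(.*?)\}", re.DOTALL)
# Matches `[optional|repeated] type name = number` followed by `;` or `[`.
FIELD = re.compile(
    r"(?:optional\s+|repeated\s+)?[\w.]+\s+(\w+)\s*=\s*(\d+)\s*[;\[]")


def find_collisions(sources):
    """Map (extendee, number) -> [(file, option), ...] for reused numbers.

    `sources` maps a file name to the text of that .proto file.
    """
    claims = defaultdict(list)
    for filename, text in sources.items():
        # Drop line comments so commented-out fields are not counted.
        text = re.sub(r"//[^\n]*", "", text)
        for extendee, body in EXTEND_BLOCK.findall(text):
            for name, number in FIELD.findall(body):
                claims[(extendee, int(number))].append((filename, name))
    return {key: users for key, users in claims.items() if len(users) > 1}
```

Run over two files that both declare an EnumValueOptions extension numbered 11117, the function returns a single entry keyed by ("google.protobuf.EnumValueOptions", 11117) listing both declarations; a pre-merge check could fail the build whenever the result is non-empty.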

When to use this pattern

Good fits:

  • Analytics tag literals — the user-visible or third-party-SDK string that corresponds to an internal enum.
  • Feature-flag keys — the string identifier for a flag whose enum-valued handle is passed around in code.
  • Cross-service schema version strings — a logical version that both ends need to agree on literally.
  • Error codes with both a numeric (protocol) representation and a stable string (log-facing) representation.

Poor fits:

  • Values that change. The .proto is the commit-time contract; changing a constant requires a coordinated deploy of every consumer that reads it. Lyft's post flags this explicitly:

"It's recommended to exercise caution when using this technique. It is most suitable for cases where the constant values are never expected to change, or where you have complete control over deployment of entities that will be consuming the protocol."

  • Fully external consumers. A third-party SDK that decodes the wire enum 2 and maps to its own string representation won't see or use this custom option.
  • High-cardinality mappings. A 500-entry enum with per-entry constants is a configuration file masquerading as a schema; shipping the map as data and looking it up at runtime is cleaner.
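
Because a changed constant is the main hazard listed above, one way to guard against it is a check that compares the constants in two revisions of the .proto file. The sketch below is a hypothetical illustration, not something described in the Lyft post: it assumes the option is named const_value as in the earlier example and uses a regular expression rather than a real parser.

```python
import re

# Captures the enum value name and its (const_value) string literal.
VALUE_WITH_CONST = re.compile(
    r'(\w+)\s*=\s*-?\d+\s*\[[^\]]*\(const_value\)\s*=\s*"([^"]*)"')


def extract_constants(proto_text):
    """Return {enum value name: const_value literal} found in the text."""
    return dict(VALUE_WITH_CONST.findall(proto_text))


def changed_constants(old_text, new_text):
    """Return {name: (old, new)} for values whose constant was edited.

    Newly added or removed enum values are ignored; only literals that
    differ between the two revisions are reported.
    """
    old = extract_constants(old_text)
    new = extract_constants(new_text)
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }
```

A non-empty result signals that every consumer of the affected constant needs a coordinated deploy before the change ships.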

Multiple options per enum value

Custom options compose. A single enum value can carry a display string, an analytics tag, and a numeric category ID in one declaration:

extend google.protobuf.EnumValueOptions {
    string display_name  = 11117;
    string analytics_tag = 11118;
    int32  category_id   = 11119;
}

enum EventTag {
    // proto3 requires the first enum value to be zero.
    EVENT_TAG_UNKNOWN = 0;
    EVENT_TAG_1 = 1 [
        (display_name)  = "First event",
        (analytics_tag) = "#tag1",
        (category_id)   = 100
    ];
}

Each custom option needs its own distinct field number.
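
On the read side, the three options can be gathered into one record so callers do not repeat the descriptor walk per option. This is a minimal sketch under stated assumptions: EventTagInfo and info_for are hypothetical names, and pb2 stands for the generated module, which is assumed to expose display_name, analytics_tag and category_id as extension handles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventTagInfo:
    display_name: str
    analytics_tag: str
    category_id: int


def info_for(enum_type, number, pb2):
    """Collect every custom option attached to one enum value.

    `enum_type` is the generated enum wrapper, `number` the enum value,
    and `pb2` the generated module holding the extension handles.
    """
    options = enum_type.DESCRIPTOR.values_by_number[number].GetOptions()
    return EventTagInfo(
        display_name=options.Extensions[pb2.display_name],
        analytics_tag=options.Extensions[pb2.analytics_tag],
        category_id=options.Extensions[pb2.category_id],
    )
```

With the declaration above, info_for(event_pb2.EventTag, event_pb2.EVENT_TAG_1, event_pb2) would be expected to carry "First event", "#tag1" and 100.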

Seen in

  • sources/2024-09-16-lyft-protocol-buffer-design-principles-and-practices: canonical pattern statement. Lyft Media's post presents this as the last of its five protobuf practices and includes the specific field-number guidance, the Python read-path code, and the caution note about changeable vs stable values. Lyft uses it in production for ad-event tag literals shared between iOS / Android / Python backend.