CONCEPT Cited by 1 source
Entity URI namespace¶
Definition¶
An entity URI namespace is a catalog-design primitive that assigns every entity a globally unique URI of the form:

    <scheme>://<entity-type>/<source-system>/<source-id>

The URI encodes:
- Routing — which source system to call for hydration.
- Identity — a globally-unique handle that doesn't collide across source systems.
- Type — the entity category, for filtering and indexing.
This is the precondition for cross-system graph traversal: foreign keys can point to URIs across source systems without ambiguity, and a single dispatcher can fan out hydration calls correctly.
The Netflix MDS instance¶
"The normalization process standardizes field names and formats. For example, platform-specific IDs become global AIP URIs, owner_emails becomes owners with resolved user URIs, and labels become tags. Foreign keys like pipeline_run_id are transformed into entity references." — sources/2026-05-04-netflix-democratizing-machine-learning-building-the-model-lifecycle-graph
Examples from the post (worked-example identifiers, illustrative):
    aip://model/registry/ranking-model-v5-20XX0101
    aip://pipeline-run/orchestrator/train-weekly-ranking-20XX0101
    aip://user/identity/alice
Decomposition:
| URI part | Meaning |
|---|---|
| `aip` | Scheme: "AI Platform" namespace owned by Netflix MDS |
| `model` / `pipeline-run` / `user` | Entity type: drives indexing and UI dispatch |
| `registry` / `orchestrator` / `identity` | Source system: drives hydration callback routing |
| `ranking-model-v5-20XX0101` etc. | Source-system-local identifier |
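The decomposition above can be sketched as a small parser. This is a minimal illustration, not the Netflix implementation: the `EntityUri` class and `parse_entity_uri` function are hypothetical names, and the choice to allow `/` inside the local identifier is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityUri:
    scheme: str         # namespace owner, e.g. "aip"
    entity_type: str    # e.g. "model", "pipeline-run", "user"
    source_system: str  # e.g. "registry", "orchestrator", "identity"
    local_id: str       # identifier that is only unique within the source system

    def __str__(self) -> str:
        return f"{self.scheme}://{self.entity_type}/{self.source_system}/{self.local_id}"


def parse_entity_uri(uri: str) -> EntityUri:
    """Split '<scheme>://<entity-type>/<source-system>/<source-id>' into its parts."""
    scheme, sep, rest = uri.partition("://")
    if not sep or not scheme:
        raise ValueError(f"missing scheme: {uri!r}")
    # Split at most twice so a local ID containing '/' stays intact (assumption).
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected <type>/<source>/<id>: {uri!r}")
    return EntityUri(scheme, *parts)


uri = parse_entity_uri("aip://model/registry/ranking-model-v5-20XX0101")
print(uri.entity_type, uri.source_system, uri.local_id)
```

Because the dataclass is frozen and renders back to the same string, a parsed URI can be used interchangeably with its string form as a dictionary key or foreign key.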
Why a URI namespace beats raw IDs¶
A naive design would store source-system IDs directly. Problems:
- Collisions. A model registry might issue ID `123` for a model; an experimentation platform might issue `123` for a test. A foreign key `target_id: 123` is ambiguous.
- No type information. "What is entity 123?" requires a separate catalog lookup or a schema-encoded type column.
- No routing information. When a foreign key points to another entity, the consumer needs out-of-band knowledge of which source system to call to hydrate it.
A URI bundles all three into a single string:
    aip://model/registry/123
    ^^^   ^^^^^ ^^^^^^^^ ^^^
    |     |     |        |
    |     |     |        +-- source-system-local ID (no global uniqueness needed)
    |     |     +----------- routing key (which API to call)
    |     +----------------- type (for filter / dispatch)
    +----------------------- namespace owner
A foreign key written as a URI is self-describing — downstream consumers can resolve it without additional context.
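One way to picture the routing role is a dispatcher that picks a hydration callback from the source-system segment. The sketch below is an assumption-laden illustration: the `fetch_from_*` functions, the `HYDRATORS` table, the `experiments` source-system label, and the `test` entity type are all hypothetical stand-ins for real source-system clients.

```python
from typing import Callable, Dict


# Hypothetical source-system clients; each only understands its own local IDs.
def fetch_from_registry(local_id: str) -> dict:
    return {"source": "registry", "id": local_id, "kind": "model"}


def fetch_from_experiments(local_id: str) -> dict:
    return {"source": "experiments", "id": local_id, "kind": "test"}


# Routing table keyed by the source-system segment of the URI.
HYDRATORS: Dict[str, Callable[[str], dict]] = {
    "registry": fetch_from_registry,
    "experiments": fetch_from_experiments,
}


def hydrate(uri: str) -> dict:
    """Route a URI to the source system named in it and return the record."""
    _scheme, _, rest = uri.partition("://")
    entity_type, source_system, local_id = rest.split("/", 2)
    try:
        fetch = HYDRATORS[source_system]
    except KeyError:
        raise LookupError(f"no hydrator registered for {source_system!r}") from None
    record = fetch(local_id)
    record["uri"] = uri
    record["type"] = entity_type
    return record


# The same local ID "123" resolves to two different entities without ambiguity.
model = hydrate("aip://model/registry/123")
test = hydrate("aip://test/experiments/123")
print(model["kind"], test["kind"])
```

The caller never needs to know which API backs which entity; adding a source system means adding one entry to the routing table.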
Why this enables cross-system graph traversal¶
Without normalization to URIs:
"Without normalization, downstream consumers would need to understand every source system's schema." — sources/2026-05-04-netflix-democratizing-machine-learning-building-the-model-lifecycle-graph
Each consumer would have to know:
- That `pipeline_run_id` from Model Registry refers to the orchestrator's namespace.
- That `owner_emails` from Model Registry are emails but `owner_id` from Pipeline Orchestrator are numeric LDAP IDs.
- That A/B test cells are referenced by `(test_id, cell_number)` in some sources but by a flat `cell_id` in others.
Normalizing to URIs makes all of this uniform. A relationship
edge (source_uri, edge_type, target_uri) is a complete fact
without further context.
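As a rough sketch of what such normalization might look like for one Model Registry record, the function below turns source-specific fields into URIs and emits `(source_uri, edge_type, target_uri)` triples. Several details are assumptions made for illustration: the raw field name `id`, the edge-type names `owned_by` and `produced_by`, and the shortcut of deriving a user ID from the email's local part (a real system would presumably resolve users through an identity service).

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (source_uri, edge_type, target_uri)


def normalize_model_record(raw: dict) -> Tuple[Dict, List[Edge]]:
    """Normalize a hypothetical raw Model Registry record into an entity plus edges."""
    model_uri = f"aip://model/registry/{raw['id']}"

    # owner_emails -> owners with user URIs (email local part used as a stand-in ID).
    owners = [
        f"aip://user/identity/{email.split('@')[0]}"
        for email in raw.get("owner_emails", [])
    ]

    # labels -> tags; the platform-specific ID is replaced by the global URI.
    entity = {"uri": model_uri, "owners": owners, "tags": list(raw.get("labels", []))}

    edges: List[Edge] = [(model_uri, "owned_by", owner) for owner in owners]

    # Foreign key -> entity reference in the orchestrator's namespace.
    run_id = raw.get("pipeline_run_id")
    if run_id:
        run_uri = f"aip://pipeline-run/orchestrator/{run_id}"
        edges.append((model_uri, "produced_by", run_uri))

    return entity, edges


raw = {
    "id": "ranking-model-v5-20XX0101",
    "owner_emails": ["alice@example.com"],
    "labels": ["ranking"],
    "pipeline_run_id": "train-weekly-ranking-20XX0101",
}
entity, edges = normalize_model_record(raw)
for edge in edges:
    print(edge)
```

Each emitted triple can be stored or traversed on its own, since both endpoints already carry their type and source system.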
Generalizes to non-MDS catalogs¶
The pattern recurs across catalog / metadata / lineage systems:
- AWS resource ARNs: `arn:aws:s3:::my-bucket/object-key`. Same shape: scheme, partition, service, region, account, resource (S3 leaves region and account empty).
- Kubernetes resource references: `apiVersion + kind + namespace + name`. Encoded across multiple fields but semantically equivalent.
- Datadog / OpenTelemetry resource attributes: `service.name`, `cloud.provider`, etc. Flatter than a URI but same information.
- Spotify Backstage entity refs: `<kind>:<namespace>/<name>`. Near-identical shape.
The URI form (vs. multi-field structured ID) has one extra property: it's a string, so it can be used directly as a key in any KV store, a token in a search index, or a foreign key column in any DB. Multi-field structured IDs require encoding to a string at every interop point.
Design considerations¶
When designing such a namespace:
- Stable scheme. Don't rename your scheme later — every reference becomes wrong.
- Stable source-system labels. Same constraint, scoped to one level deeper.
- Source-system-local IDs need not be globally unique. The URI scopes them.
- Don't encode mutable state. Don't put the model version number or environment tag in the URI; those should be attributes, not identity.
- Reserve a generic `unknown` source-system bucket for entities that pre-exist a proper source-system integration.
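Some of these considerations can be enforced at the point where URIs are minted. The following is a minimal sketch under stated assumptions: `build_entity_uri`, `KNOWN_SOURCE_SYSTEMS`, and the validation rules are hypothetical, and the set of source-system labels simply reuses the ones from the worked examples.

```python
from typing import Optional

SCHEME = "aip"  # fixed once; renaming it later would invalidate every stored reference
KNOWN_SOURCE_SYSTEMS = {"registry", "orchestrator", "identity"}  # stable labels
UNKNOWN_SOURCE = "unknown"  # bucket for entities without a source-system integration


def build_entity_uri(
    entity_type: str, local_id: str, source_system: Optional[str] = None
) -> str:
    """Mint a URI, falling back to the 'unknown' bucket when no source is given."""
    if source_system is None:
        source_system = UNKNOWN_SOURCE
    elif source_system not in KNOWN_SOURCE_SYSTEMS:
        # Reject unregistered labels so typos do not create accidental namespaces.
        raise ValueError(f"unregistered source system: {source_system!r}")
    if not entity_type or "/" in entity_type:
        raise ValueError(f"invalid entity type: {entity_type!r}")
    if not local_id:
        raise ValueError("local_id must be non-empty")
    return f"{SCHEME}://{entity_type}/{source_system}/{local_id}"


# Mutable state such as environment lives beside the identity, not inside it.
entity = {
    "uri": build_entity_uri("model", "123", "registry"),
    "attributes": {"environment": "staging"},
}
legacy_uri = build_entity_uri("model", "legacy-7")
print(entity["uri"], legacy_uri)
```

Changing the `environment` attribute leaves the URI untouched, so existing references to the entity remain valid.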
Seen in¶
- sources/2026-05-04-netflix-democratizing-machine-learning-building-the-model-lifecycle-graph: Netflix MDS uses `aip://<entity-type>/<source-system>/<source-id>` URIs as the global identifier across the model lifecycle graph.