CONCEPT Cited by 1 source

Governed tag

What it is

A governed tag is a piece of classification metadata attached to a data-catalog object (catalog, schema, table, column) where the tag itself is a managed primitive: defined once in an account-level or organization-level vocabulary, with explicit permissions on who can create the tag, who can apply it, and who can use it in policies.

Contrast with ad-hoc tagging, where anyone with object-write permission can apply any string as a tag. Governed tags are versioned vocabulary entries, so applying a tag that is not in the vocabulary, or applying one without the required permission, fails with an error instead of silently creating a new free-form string.

What makes it "governed"

Three properties distinguish a governed tag from an ad-hoc tag:

  1. Centralized vocabulary — the set of available tags is itself managed, not user-supplied at apply time. New tags require a CREATE permission held by a designated role; the universe of tags is therefore a controlled list.
  2. Inheritance from parent to child — tagging a parent object (e.g., a catalog) automatically tags every descendant (schemas, tables, columns), unless overridden. This makes bulk tagging a constant-cost operation rather than O(n_columns).
  3. Permission separation — the right to create new tag vocabulary entries (MANAGE/CREATE on the taxonomy) is distinct from the right to apply tags to data (APPLY on the data object), which is again distinct from the right to own the data (OWNER on the table). Three roles, three permissions, one primitive — see concepts/separation-of-duties-data-governance.
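The first and third properties can be sketched as a toy in-memory registry. This is an illustration only, not Unity Catalog's API: the `TagRegistry` class, its method names, the principals, and the object path are all hypothetical, and APPLY is modelled as a single global grant rather than a per-object one.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class TagError(Exception):
    """Raised when a tag operation violates the vocabulary or its permissions."""


@dataclass
class TagRegistry:
    # key -> allowed values: the controlled vocabulary.
    vocabulary: dict[str, set[str]] = field(default_factory=dict)
    # Principals allowed to create vocabulary entries (hypothetical model).
    creators: set[str] = field(default_factory=set)
    # Principals allowed to apply tags (simplified: not scoped per object).
    appliers: set[str] = field(default_factory=set)
    # object path -> {key: value} for tags applied directly to that object.
    applied: dict[str, dict[str, str]] = field(default_factory=dict)

    def create_tag(self, principal: str, key: str, values: set[str]) -> None:
        if principal not in self.creators:
            raise TagError(f"{principal} lacks CREATE on the taxonomy")
        self.vocabulary[key] = set(values)

    def apply_tag(self, principal: str, obj: str, key: str, value: str) -> None:
        if principal not in self.appliers:
            raise TagError(f"{principal} lacks APPLY")
        if value not in self.vocabulary.get(key, set()):
            raise TagError(f"{key}:{value} is not in the governed vocabulary")
        self.applied.setdefault(obj, {})[key] = value


registry = TagRegistry(creators={"account_admin"}, appliers={"steward"})
registry.create_tag("account_admin", "sensitivity", {"public", "confidential"})
registry.apply_tag("steward", "main.sales.orders", "sensitivity", "confidential")

try:
    # An out-of-vocabulary value is rejected instead of becoming a new tag.
    registry.apply_tag("steward", "main.sales.orders", "sensitivity", "secret-tier-7")
except TagError as err:
    print(f"rejected: {err}")
```

The point of the sketch is that the vocabulary check and the permission check happen at apply time, so a misspelled or invented tag never reaches the catalog.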

Why it matters

Governed tags are the attribute substrate for attribute-based access control (ABAC) in data-warehouse / data-lake-house settings. ABAC policies evaluate against tag values; if the tag vocabulary is not governed, the policies have no stable references to evaluate against and policy correctness becomes a string-matching exercise.

The same property makes governed tags a load-bearing input to automated data classification: a classifier produces tag values, so the vocabulary it emits must be stable and shared across automated and human taggers. If stewards apply pii_ssn while the classifier emits pii.ssn, a policy written against one spelling never matches columns tagged with the other, and downstream ABAC evaluation breaks silently.
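A minimal sketch of that failure mode, assuming policies match tag values by exact string comparison. The column names, the `GOVERNED_VALUES` set, and both helper functions are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

# Hypothetical governed vocabulary for a single tag key.
GOVERNED_VALUES = {"pii_ssn", "pii_email"}


def columns_matching(column_tags: dict[str, str], policy_value: str) -> list[str]:
    """Columns that a policy keyed on `policy_value` matches (exact string match)."""
    return sorted(col for col, tag in column_tags.items() if tag == policy_value)


def ungoverned_tags(column_tags: dict[str, str]) -> dict[str, str]:
    """Columns whose tag value is not in the governed vocabulary."""
    return {col: tag for col, tag in column_tags.items() if tag not in GOVERNED_VALUES}


column_tags = {
    "crm.customers.ssn": "pii_ssn",  # applied by a steward
    "crm.leads.ssn": "pii.ssn",      # emitted by a classifier with a different spelling
}

# The policy on pii_ssn matches only the steward-tagged column.
masked = columns_matching(column_tags, "pii_ssn")

# Validating against the vocabulary surfaces the drift the policy cannot see.
drift = ungoverned_tags(column_tags)
```

Here `crm.leads.ssn` is left out of `masked` without any error being raised. It only shows up when the tags are validated against the shared vocabulary, which is the check a governed tag system enforces at apply time.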

Distinguishing from sibling primitives

  • Ad-hoc tag — same shape (key-value metadata) but no centralized vocabulary or permissions on creation. The form most pre-2020 warehouse systems shipped.
  • data classification tagging — the broader umbrella concept; governed tags are one realisation. Figma's FigTag is an application-schema-level realisation; Unity Catalog Governed Tags is a catalog-substrate realisation.
  • data annotation (Meta Policy Zones) — same primitive, different consumer (information-flow control rather than ABAC + warehouse governance).
  • concepts/condition-tag-iam (AWS IAM tags) — sibling at the cloud-provider IAM layer; tags as ABAC conditions on cloud resources rather than warehouse columns.

The governed-tag framing slightly emphasises the vocabulary governance aspect over the classification semantics aspect: a tag is a vocabulary entry first, a sensitivity label second.

Inheritance + bulk tagging

The inheritance property is what makes governed tags scale to metastores covering thousands of catalogs. Without inheritance, every new column needs an explicit tag-application step or it is unprotected by ABAC policies. With inheritance, a catalog tagged sensitivity:confidential propagates that tag to every contained column by default — and ABAC policies referencing the tag start matching every contained column automatically.

The cost: override semantics become load-bearing. A column that needs to be sensitivity:public inside an otherwise sensitivity:confidential catalog must override the inherited tag, and the language for expressing the override (priority? explicit delete? hierarchical match?) becomes part of the user-facing model.
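One possible override rule is "nearest ancestor wins". The sketch below assumes that rule purely for illustration; the source leaves the actual override semantics open. The dotted object paths and the `effective_tag` function are likewise hypothetical.

```python
from __future__ import annotations

from typing import Optional


def effective_tag(path: str, key: str, applied: dict[str, dict[str, str]]) -> Optional[str]:
    """Return the value of `key` set on the object itself or on its nearest ancestor."""
    parts = path.split(".")
    # Walk upward: column, then table, then schema, then catalog.
    for depth in range(len(parts), 0, -1):
        ancestor = ".".join(parts[:depth])
        value = applied.get(ancestor, {}).get(key)
        if value is not None:
            return value
    return None


# Tags applied directly: one on the catalog, one override on a single column.
applied = {
    "finance": {"sensitivity": "confidential"},
    "finance.reports.summary.region": {"sensitivity": "public"},
}

inherited = effective_tag("finance.ledger.entries.amount", "sensitivity", applied)
overridden = effective_tag("finance.reports.summary.region", "sensitivity", applied)
untagged = effective_tag("marketing.web.visits.url", "sensitivity", applied)
```

Under this rule a new column added anywhere in the finance catalog resolves to sensitivity:confidential with no extra tagging step, while the explicitly tagged column resolves to its own value. A different rule, such as most-restrictive-wins, would change the second result, which is why the choice has to be part of the user-facing model.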

Permission separation in practice

Three permission axes for one primitive:

| Permission | Held by | Scope | Example violation |
| --- | --- | --- | --- |
| MANAGE / CREATE on taxonomy | Account admins | What tags exist | A workspace admin tries to invent a secret-tier-7 tag without permission |
| APPLY on data object | Stewards / data producers | Which tables get which tags | A data producer applies sensitivity:public to a column they should be tagging sensitivity:confidential |
| OWNER on data | Data producer / domain team | Who owns the table itself | A different team modifies the table schema |

Splitting these permissions across specialised roles is what enables separation of duties: no single person can define the tag vocabulary, apply tags, and own the data.
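As a rough illustration, an audit for this property could look for principals that hold all three permissions at once. The permission names, principals, and the `holds_all_duties` function below are hypothetical stand-ins for the three axes, not identifiers from any real system.

```python
from __future__ import annotations

# Hypothetical permission names mirroring the three axes.
SEPARATED_DUTIES = frozenset({"MANAGE_TAXONOMY", "APPLY_TAG", "OWNER"})


def holds_all_duties(grants: dict[str, set[str]]) -> list[str]:
    """Principals that hold every separated duty at once."""
    return sorted(p for p, perms in grants.items() if SEPARATED_DUTIES <= perms)


grants = {
    "account_admin": {"MANAGE_TAXONOMY"},
    "steward": {"APPLY_TAG"},
    "sales_team": {"OWNER", "APPLY_TAG"},
    "superuser": {"MANAGE_TAXONOMY", "APPLY_TAG", "OWNER"},
}

# Only the principal holding all three permissions is flagged.
violations = holds_all_duties(grants)
```

A stricter policy could flag any principal holding two of the three; the threshold is a governance decision rather than something the primitive dictates.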

Seen in

  • sources/2026-05-13-databricks-abac-row-filtering-and-column-masking-policies-governed-tags — Unity Catalog GA of governed tags as an account-level vocabulary with parent-to-child inheritance, full SQL/API/UI/Terraform lifecycle, and workspace-vs-account-admin CREATE/MANAGE split. "Governed tags are the attribute foundation that ABAC policies build on: an account-level vocabulary of keys and values that standardizes how data is described across an account, with permissions that control who can apply which tags to which objects."