
PATTERN

Field-level sensitivity tagging

Shape

Attach a sensitivity category to every database column in a central schema. Propagate the tags into the application server (queryable at runtime) and into the data warehouse (queryable at analytics time). Build enforcement / detection / audit systems that consume the tags, so adding a new sensitive field is a tagging decision that automatically enrolls the field into all downstream systems — no per-system allowlist maintenance.

The leverage: one annotation → many enforcement consumers. The tags stay stable; enforcement code evolves independently.
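A minimal sketch of what the central schema could look like. This is an illustration, not Figma's implementation: the category names (other than banned_from_clients, which the source names), the `table.column` keying, and the lookup helpers are all assumptions.

```python
# Hypothetical central sensitivity schema: one annotation per column,
# queryable by any downstream consumer (names are illustrative).
from enum import Enum

class Category(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    PII = "pii"
    BANNED_FROM_CLIENTS = "banned_from_clients"

# Single source of truth: "table.column" -> category.
SENSITIVITY_TAGS = {
    "users.display_name": Category.PUBLIC,
    "users.email": Category.PII,
    "users.password_digest": Category.BANNED_FROM_CLIENTS,
    "invoices.card_last4": Category.BANNED_FROM_CLIENTS,
}

def category_of(table: str, column: str) -> Category:
    # Default-deny: an untagged column is treated as restricted,
    # so a new column cannot silently opt out of enforcement.
    return SENSITIVITY_TAGS.get(f"{table}.{column}", Category.BANNED_FROM_CLIENTS)

def banned_from_clients(table: str, column: str) -> bool:
    return category_of(table, column) is Category.BANNED_FROM_CLIENTS
```

Every consumer (middleware, log redactor, warehouse ACL) calls the same lookup, which is where the one-annotation-to-many-consumers leverage comes from.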

Reference instance — FigTag

From systems/figtag (Source: sources/2026-04-21-figma-visibility-at-scale-sensitive-data-exposure):

  • Every DB column is annotated with a category describing sensitivity + intended usage.
  • Annotations stored in a central schema (single source of truth, not scattered across service configs).
  • Annotations propagate to the data warehouse → makes it easy to determine a column's sensitivity at query time.
  • A specific category, banned_from_clients, flags fields that must not be returned in API responses under normal circumstances (security identifiers, billing, other PII).

Consumers of the tag include the Sensitive Data Analyzer: an ActiveRecord callback records banned_from_clients values loaded during sampled requests into request-local storage, and an after filter compares the serialized JSON response against the recorded values to find leaks.

Properties

  1. Central, not per-service. One schema. One vocabulary.
  2. Column-level, not table-level. Real tables mix sensitive and public fields.
  3. Runtime-queryable so middleware can hook it.
  4. Warehouse-queryable so analytics + audit can aggregate exposure.
  5. Consumed by many systems — response sampling, log redaction, access control, export filters, test suite guards.

Integration pattern: ORM callback → request-local storage

The Figma-style integration with a response-sampling detector:

  1. ORM fires callback on every record load with a tagged column.
  2. If the request is being sampled, callback copies the sensitive value into a request-local dictionary keyed by column.
  3. After the handler produces the response body, the after-filter reads the dictionary and compares against the serialized JSON.
  4. Matches = findings: a tagged value reached the client-bound response.
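The four steps can be sketched end-to-end in plain Python. This is a toy stand-in for the ActiveRecord callback and after filter: the tag set, the sampling flag, and the record shape are all assumptions for illustration.

```python
import json

# Hypothetical tag schema: columns marked banned_from_clients.
BANNED = {"users.password_digest", "invoices.card_last4"}

class RequestContext:
    """Request-local storage for sensitive values seen during record loads."""
    def __init__(self, sampled: bool):
        self.sampled = sampled
        self.seen = {}  # "table.column" -> value

def on_record_load(ctx: RequestContext, table: str, row: dict) -> None:
    # Steps 1-2: the ORM callback fires on every record load; if this
    # request is sampled, copy tagged values into request-local storage.
    if not ctx.sampled:
        return
    for column, value in row.items():
        if f"{table}.{column}" in BANNED and isinstance(value, str):
            ctx.seen[f"{table}.{column}"] = value

def after_filter(ctx: RequestContext, response_body: str) -> list:
    # Steps 3-4: compare the serialized response against recorded values;
    # any match means a tagged value reached the response body.
    return [col for col, val in ctx.seen.items() if val in response_body]

# Toy request: the handler loads a row and leaks one tagged value.
ctx = RequestContext(sampled=True)
on_record_load(ctx, "users", {"id": 7, "email": "a@b.c", "password_digest": "x9!hash"})
body = json.dumps({"id": 7, "digest": "x9!hash"})
findings = after_filter(ctx, body)  # -> ["users.password_digest"]
```

Substring matching is deliberately crude here; a production detector would also handle encoded or transformed values.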

The same substrate supports other enforcement stances:

  • Log redaction: hook the logger, consult the tag, scrub.
  • Warehouse ACL: restrict reads on tagged columns to specific roles.
  • Export guards: filter tagged columns from tenant data exports.
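The log-redaction stance, for example, can be hooked in as a logging filter that consults the same tag substrate. A hedged sketch: in practice the tagged values would come from the central schema plus ORM hooks, not a hardcoded set.

```python
import logging

# Values the tag schema marked as sensitive for this request
# (hardcoded here for illustration only).
TAGGED_VALUES = {"x9!hash", "4242"}

class RedactionFilter(logging.Filter):
    """Scrub tagged sensitive values from log messages before emission."""
    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage()  # formats msg % args
        for value in TAGGED_VALUES:
            msg = msg.replace(value, "[REDACTED]")
        record.msg, record.args = msg, None
        return True  # keep the (now scrubbed) record

logger = logging.getLogger("app")
handler = logging.StreamHandler()
handler.addFilter(RedactionFilter())
logger.addHandler(handler)
logger.warning("login attempt with digest %s, card %s", "x9!hash", "4242")
# emits: login attempt with digest [REDACTED], card [REDACTED]
```

The warehouse ACL and export-guard stances would read the same tags but act at grant time and export time respectively.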

Anti-patterns

  • Default-allow untagged columns. A new column lands without a tag → detection has no hook → leaks silently. Enforce default-deny (untagged = restricted, with a migration lint).
  • Multi-axis semantics packed into one tag. "pii_high" conflates kind + tier; prefer orthogonal axes (category + tier).
  • Tagging only the DB. Derived fields (computed in the application, not stored) need tagging too, or they slip through.
  • Review debt. Tags outlive their original intended usage; periodic re-tagging review is needed.
  • Propagation lag without reconciliation. If the warehouse tag propagation is eventually consistent, audits can miss new sensitive columns for the propagation window; reconcile against the central schema.
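The default-deny guard from the first anti-pattern is enforceable as a migration lint run in CI. A sketch under assumptions: the live-schema shape is invented, and real code would reflect it from `information_schema` or the ORM.

```python
# CI lint: fail the build if any live column lacks a sensitivity tag,
# so a new column cannot land without a tagging decision.
def untagged_columns(live_schema: dict, tags: set) -> list:
    """Return every table.column present in the schema but absent from tags."""
    return sorted(
        f"{table}.{column}"
        for table, columns in live_schema.items()
        for column in columns
        if f"{table}.{column}" not in tags
    )

live_schema = {"users": ["id", "email", "ssn"], "docs": ["id", "title"]}
tags = {"users.id", "users.email", "docs.id", "docs.title"}

missing = untagged_columns(live_schema, tags)  # -> ["users.ssn"]
# A non-empty result fails the build and forces a tagging decision.
```

The same comparison, run against the warehouse's copy of the tags, doubles as the reconciliation check for propagation lag.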

When to build this

  • Many downstream systems need to ask "is this field sensitive?"
  • New sensitive fields are added frequently (schema evolves faster than the security team can track manually).
  • Compliance requires auditable enumeration of sensitive data locations (SOC 2, GDPR, HIPAA).
  • The same data category needs different treatment in different systems (visible in admin UI, redacted in logs, excluded from API responses) — one tag, many policies.
