CONCEPT Cited by 1 source
Data annotation (IFC labels)¶
Definition¶
In information flow control (IFC)
systems such as Meta's Policy Zones, a
data annotation is a metadata label (e.g. BANANA_DATA) attached to a
data asset that encodes the privacy or policy category of
that data. Each annotation is associated with a set of data flow
rules that govern where the data may flow, which purposes it may be
used for, and which transformations are safe.
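To make the definition concrete, here is a minimal sketch of a label paired with a flow rule. It is not Meta's API: `Annotation`, `FlowRule`, `flow_allowed`, and the purpose name `banana_recommendations` are hypothetical names chosen for illustration, and only the purpose dimension of a flow rule is modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A label naming the policy category of a data asset."""
    name: str


@dataclass(frozen=True)
class FlowRule:
    """Hypothetical rule: the purposes that data with this annotation may serve."""
    annotation: Annotation
    allowed_purposes: frozenset


def flow_allowed(rule: FlowRule, annotation: Annotation, purpose: str) -> bool:
    """True if the rule covers this annotation and permits the purpose."""
    return rule.annotation == annotation and purpose in rule.allowed_purposes


# Illustrative values only; the purpose name is an assumption.
BANANA_DATA = Annotation("BANANA_DATA")
banana_rule = FlowRule(BANANA_DATA, frozenset({"banana_recommendations"}))
```

With this sketch, `flow_allowed(banana_rule, BANANA_DATA, "banana_recommendations")` is true, while any purpose outside the rule's set is rejected.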
Granularity levels (from the 2024-08-31 Meta PAI post)¶
Batch-processing systems (Presto, Spark):
- Table-level
- Column-level
- Row-level
- Potentially cell-level
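One way to picture the batch granularity levels above is as a lookup that combines labels from each level. The sketch below assumes that the effective labels of a cell are the union of its table-, column-, and row-level labels; that combination rule, the `TableAnnotations` class, and the column names are assumptions for illustration, not something the post specifies.

```python
from dataclasses import dataclass, field


@dataclass
class TableAnnotations:
    """Hypothetical holder for labels at table, column, and row granularity."""
    table: set = field(default_factory=set)
    columns: dict = field(default_factory=dict)  # column name -> set of labels
    rows: dict = field(default_factory=dict)     # row key -> set of labels

    def labels_for(self, row_key, column):
        """Effective labels for one cell: union of all levels (assumed rule)."""
        return (
            self.table
            | self.columns.get(column, set())
            | self.rows.get(row_key, set())
        )


# Only one column carries the label; other columns of the same row do not.
ann = TableAnnotations(columns={"favorite_fruit": {"BANANA_DATA"}})
```

Here `ann.labels_for(1, "favorite_fruit")` yields `{"BANANA_DATA"}`, while `ann.labels_for(1, "id")` yields an empty set, which is the practical difference between column-level and table-level labelling.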
Function-based systems (HHVM):
- Request parameters
- Database entries (variables loading database rows)
- Event log entries
- Return values of functions
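For function-based systems, the annotation points listed above can be sketched as values that carry their labels through a call. This is a generic dynamic label-propagation sketch in Python, not how HHVM or Policy Zones implements it; `Labeled` and `propagate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Labeled:
    """A value (request parameter, loaded row, return value) plus its labels."""
    value: object
    labels: frozenset = frozenset()


def propagate(fn, *args):
    """Call fn on the raw values; the result carries the union of input labels."""
    labels = frozenset().union(*(a.labels for a in args))
    return Labeled(fn(*(a.value for a in args)), labels)


# A labelled request parameter combined with an unlabelled database value.
param = Labeled("u1", frozenset({"BANANA_DATA"}))
row = Labeled(3, frozenset())
out = propagate(lambda user, count: f"{user}:{count}", param, row)
```

The return value `out` inherits `BANANA_DATA` from the request parameter even though the function itself never mentions the label.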
Why annotations, not per-asset ACLs¶
Meta's 2024-08-31 framing: annotations combined with IFC propagation give logical data separation at low compute cost on infrastructure where code and storage are shared. ACLs, the point-checking alternative, require physically separating data into distinct assets per purpose, which is unworkable when the same underlying table feeds many product features.
Separating annotations from requirements¶
The canonical lesson from the post's "Lessons learned from adoption at scale" section: the initial monolithic annotation API encoded the full flow-rule specification on every annotation. As multiple requirements composed on the same data, it "became increasingly complex, resulting in data annotation conflicts that were difficult to resolve". Meta's fix was to simplify data annotations so that data is decoupled from requirements, and to keep the data flow rules for different requirements separate.
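The decoupling described above can be sketched as two separate tables: one that only says what the data is, and one set of flow rules per requirement. All names here (`ASSET_ANNOTATIONS`, `REQUIREMENT_RULES`, `example_second_requirement`, the sink names) are hypothetical, and the structure is an assumption about what such a separation could look like, not the post's actual API.

```python
# Annotations state only what the data is, never what may be done with it.
ASSET_ANNOTATIONS = {"fruit_prefs_table": {"BANANA_DATA"}}

# Each requirement owns its own flow rules: label -> sinks it permits.
REQUIREMENT_RULES = {
    "purpose_limitation": {"BANANA_DATA": {"banana_recommendations"}},
    "example_second_requirement": {
        "BANANA_DATA": {"banana_recommendations", "short_term_cache"},
    },
}


def violated_requirements(asset, sink):
    """Check each requirement independently; return the ones the flow violates."""
    labels = ASSET_ANNOTATIONS.get(asset, set())
    violated = []
    for requirement, rules in REQUIREMENT_RULES.items():
        for label in labels:
            if label in rules and sink not in rules[label]:
                violated.append(requirement)
                break  # one violating label is enough for this requirement
    return violated
```

Because each requirement is evaluated on its own, adding a new requirement means adding a new rule set rather than rewriting every existing annotation, which is the conflict the monolithic design ran into.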
Discovery + maintenance¶
- Initial identification: manual code inspection combined with Meta's ML-based classifier (cited in the post as "our scalable ML-based classifier to automatically identify data assets").
- Ongoing verification: PZM "verifiers to check the accuracy of asset annotations and control configurations."
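As a rough illustration of what an annotation-accuracy check could look like, the sketch below compares declared labels against labels suggested by some classifier and reports the differences. The post does not describe how PZM verifiers work internally; the function name, the input shapes, and the idea of diffing against classifier output are all assumptions.

```python
def find_annotation_mismatches(declared, predicted):
    """Return assets whose declared labels differ from the predicted labels.

    Both inputs map asset name -> set of label names (hypothetical shape).
    """
    mismatches = {}
    for asset in sorted(set(declared) | set(predicted)):
        d = declared.get(asset, set())
        p = predicted.get(asset, set())
        if d != p:
            # "missing": predicted but not declared; "unexpected": the reverse.
            mismatches[asset] = {"missing": p - d, "unexpected": d - p}
    return mismatches


declared = {"t1": {"BANANA_DATA"}, "t2": set()}
predicted = {"t1": {"BANANA_DATA"}, "t2": {"BANANA_DATA"}}
report = find_annotation_mismatches(declared, predicted)
```

In this example only `t2` is reported, with `BANANA_DATA` listed as a missing annotation for a human or automated follow-up.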
Seen in¶
- sources/2024-08-31-meta-enforces-purpose-limitation-via-privacy-aware-infrastructure — canonical surfacing on this wiki.
Related¶
- concepts/information-flow-control — the primitive annotations make operational.
- concepts/purpose-limitation — the requirement class annotations encode at Meta.
- concepts/policy-lattice — Denning's lattice model; annotations are lattice elements.
- concepts/data-classification-tagging — adjacent Figma framing; field-level sensitivity tagging is the classification-axis sibling of IFC-axis annotation.
- concepts/data-flow-violation — the event triggered when an annotated flow reaches a non-annotated sink.
- patterns/separate-annotation-from-requirement — the simplification lesson.
- systems/meta-policy-zones / systems/meta-policy-zone-manager — canonical industrial system + tooling.
- companies/meta