CONCEPT Cited by 1 source

Cross-engine ABAC

Cross-engine ABAC is the architectural shape in which a single set of attribute-based access control policies — defined once in a catalog — is evaluated and enforced uniformly across multiple compute engines, including engines that are not the catalog vendor's first-party compute.

The structurally hard problem this resolves: in an open lakehouse, the same table can be accessed by Spark, Trino, Flink, Snowflake, DuckDB, pandas, etc. If governance lives in each engine independently, an organisation must implement and trust the same row-filter / column-mask logic in every engine — and any engine that ships without that logic (or with a buggy version) is a hole in the policy fence. Cross-engine ABAC moves policy evaluation out of the engine and into the catalog, so the engine receives data that has already been filtered by policy.

Mechanism (canonical Unity Catalog instance)

"With cross-engine attribute-based access controls (ABAC) now in Beta, Unity Catalog extends attribute-based access control to Iceberg clients using the Iceberg REST Catalog Scan APIs.

How it works: Administrators define policies once in UC, including column masks, row filters, and tag-based policies. When an external Iceberg engine requests access, UC evaluates the applicable policies during server-side scan planning. UC then returns a filtered scan plan so the engine only reads authorized data when processing the query."

(Source: sources/2026-05-28-databricks-advancing-apache-iceberg-on-databricks-iceberg-v3-ga-open-sharing-and-unified-governance)

The key wire-protocol move is the Iceberg REST Catalog Scan Planning API (added in Iceberg 1.11). Engines that implement this client send a plan-scan request to the catalog; the catalog evaluates ABAC policies during planning; the response is a filtered scan plan — file list + column projections + residual filters — that already reflects the policy. The engine reads only the data that scan plan permits.
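The request/response flow described above can be sketched as a toy, in-memory planner. This is a minimal illustration under stated assumptions, not the Iceberg REST API or Unity Catalog code: `DataFile`, `Principal`, `ScanPlan`, `plan_scan`, the `region` / `role` / `clearances` attributes, the `pii` tag, and the example paths are all hypothetical. The tag policy here simply drops tagged columns from the projection, which is a simplification of a real column mask.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class DataFile:
    path: str
    region: str  # partition value shared by every row in the file (assumed layout)


@dataclass
class Principal:
    name: str
    attributes: dict


@dataclass
class ScanPlan:
    files: list[str]                 # data files the engine may read
    columns: list[str]               # columns the engine may project
    residual_filter: Optional[str]   # predicate the engine still applies per row


# Hypothetical table state held by the catalog.
TABLE_FILES = [
    DataFile("s3://example-bucket/orders/region=eu/a.parquet", "eu"),
    DataFile("s3://example-bucket/orders/region=us/b.parquet", "us"),
]
COLUMN_TAGS = {"order_id": set(), "amount": set(), "email": {"pii"}}


def plan_scan(principal: Principal, requested_columns: list[str]) -> ScanPlan:
    """Evaluate policies during planning and return an already-filtered plan."""
    if principal.attributes.get("role") == "admin":
        # Row filter does not apply to admins in this sketch.
        files = list(TABLE_FILES)
        residual = None
    else:
        # Row filter: non-admins see only their own region.
        # The sketch assumes every non-admin carries a "region" attribute.
        region = principal.attributes["region"]
        files = [f for f in TABLE_FILES if f.region == region]
        residual = f"region = '{region}'"

    # Tag-based policy: columns tagged "pii" are dropped unless the principal is cleared.
    cleared = "pii" in principal.attributes.get("clearances", ())
    columns = [
        c for c in requested_columns
        if cleared or "pii" not in COLUMN_TAGS.get(c, set())
    ]
    return ScanPlan([f.path for f in files], columns, residual)


analyst_plan = plan_scan(Principal("ana", {"region": "eu"}), ["order_id", "email"])
admin_plan = plan_scan(
    Principal("root", {"role": "admin", "clearances": ["pii"]}),
    ["order_id", "email"],
)
```

The engine never sees the policy definitions in this shape; it only sees the resulting file list, projection, and residual filter.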

Architectural shape

Multiple engines, one policy plane:

   ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐
   │  Spark  │    │  Trino  │    │  Flink  │    │ DuckDB  │
   └────┬────┘    └────┬────┘    └────┬────┘    └────┬────┘
        │              │              │              │
        │   plan-scan(table, predicate, columns, …)  │
        └──────────────┬──────────────┬──────────────┘
                       │              │
                       ▼              ▼
              ┌─────────────────────────────┐
              │  Iceberg REST catalog       │
              │  (Unity Catalog)            │
              │                             │
              │  ABAC engine evaluates:     │
              │   - tag-based policies      │
              │   - row filters             │
              │   - column masks            │
              │  against principal +        │
              │  governed tags +            │
              │  query attributes           │
              │                             │
              │  Returns filtered scan plan │
              └─────────────────────────────┘
              data files in object storage
              (engine reads only authorised)

Properties

  1. Policy is defined once. Administrators don't author per-engine variants of the same row filter / column mask.
  2. Policy is evaluated server-side. The catalog is the policy decision point (PDP). Engines are policy enforcement points (PEPs) only insofar as they honour the filtered scan plan they receive.
  3. The Iceberg REST scan-planning API is the boundary. Any engine that implements the Iceberg 1.11 scan-planning client gets ABAC without engine-side policy code; engines on older client versions get no cross-engine ABAC, even when reading the same tables.
  4. Engine choice is preserved. Customers don't have to channel all queries through a single first-party engine to get governance — "Customers can use the best engine for each workload while maintaining one governance model across the lakehouse."
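Property 3 amounts to a version floor on the client. A minimal sketch of such a check follows, assuming the client's Iceberg library version is available as a dotted string; the function names and the example engine/version pairs are hypothetical, and nothing here reflects how a real catalog negotiates capabilities.

```python
import re

# From the text: the scan-planning client is available from Iceberg 1.11.
SCAN_PLANNING_FLOOR = (1, 11)


def major_minor(version: str) -> tuple:
    """Extract (major, minor) as integers from a string such as '1.11.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_scan_planning(client_version: str) -> bool:
    """True if the client is at or above the scan-planning floor."""
    return major_minor(client_version) >= SCAN_PLANNING_FLOOR


# Hypothetical engine -> bundled Iceberg client version pairs, for illustration only.
example_clients = {
    "engine-a": "1.11.0",
    "engine-b": "1.9.2",
    "engine-c": "1.10.1",
}
eligible = {name: supports_scan_planning(v) for name, v in example_clients.items()}
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.9" sorts after "1.11", which would wrongly place an older client above the floor.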

What this replaces

The pre-existing pattern was either (a) routing all queries through the catalog vendor's compute (governance holds because there is only one engine), or (b) engine-specific policy implementations (each engine implements row filters / column masks against the catalog's policy schema, with all the consistency and drift problems that implies). Cross-engine ABAC removes that trade-off: engine choice is preserved without giving up governance.

Caveats

  • Iceberg 1.11 floor. Engines on older Iceberg versions don't get ABAC; for the canonical UC instance the compatibility floor is fixed at Iceberg 1.11.
  • Trust model not fully described. The announcing source describes the catalog returning a "filtered scan plan" but doesn't fully clarify whether column masks (which may need per-row UDF evaluation against the data files) are entirely server-side or rely on the engine to honour residual mask instructions during read.
  • Latency cost on the query path. Server-side scan planning adds a catalog-side compute step before each query; numbers undisclosed.
  • First canonical instance is Unity Catalog (Beta). Other Iceberg REST catalog implementations may not support cross-engine ABAC even though they speak the same wire protocol; the policy-evaluation engine is catalog-vendor-specific.
  • Doesn't extend to non-Iceberg formats. Cross-engine ABAC as specified rides on the Iceberg REST scan-planning API. Engines querying Delta or Hudi tables through different protocols would need their own equivalents.
  • Composition with Iceberg v3 features undisclosed. How ABAC row filters interact with deletion-vector-marked rows, row-tracking, or VARIANT-typed columns is not addressed in the announcing source.
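The trust-model caveat can be made concrete with a small sketch. It assumes, purely for illustration, that a plan carries a per-column mask instruction that the engine is expected to apply at read time; the source does not say Unity Catalog works this way. The `masks` field, `MASK_FUNCS`, the in-memory `STORAGE`, and both reader functions are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ScanPlan:
    files: list
    columns: list
    # Hypothetical: column name -> name of a mask the engine must apply.
    masks: dict = field(default_factory=dict)


# Toy stand-in for object storage: file path -> rows.
STORAGE = {
    "f1": [
        {"order_id": 1, "email": "a@example.com"},
        {"order_id": 2, "email": "b@example.com"},
    ],
}

MASK_FUNCS = {"redact": lambda value: "***"}


def read_honouring_plan(plan: ScanPlan, storage: dict) -> list:
    """A conformant engine: reads only planned files/columns and applies masks."""
    rows = []
    for path in plan.files:
        for row in storage[path]:
            out = {}
            for col in plan.columns:
                value = row[col]
                if col in plan.masks:
                    value = MASK_FUNCS[plan.masks[col]](value)
                out[col] = value
            rows.append(out)
    return rows


def read_ignoring_masks(plan: ScanPlan, storage: dict) -> list:
    """A non-conformant engine: same files and columns, mask instructions skipped."""
    return [
        {col: row[col] for col in plan.columns}
        for path in plan.files
        for row in storage[path]
    ]


plan = ScanPlan(files=["f1"], columns=["order_id", "email"], masks={"email": "redact"})
masked_rows = read_honouring_plan(plan, STORAGE)
leaked_rows = read_ignoring_masks(plan, STORAGE)
```

If masking were delegated to the engine in this way, the policy would hold only for engines that honour the instruction. File pruning and column projection do not have this property to the same degree, because an engine cannot read files or columns it was never told about unless it bypasses the plan entirely.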

Seen in
