Iceberg REST Catalog Scan Planning API
The Iceberg REST Catalog Scan Planning API is an extension to the Iceberg REST Catalog protocol that lets the catalog plan a scan server-side and return a filtered scan plan to the requesting engine, instead of having the engine plan the scan client-side from the table's metadata. The announcing source dates the scan-planning client to the Iceberg 1.11 release.
The API is the wire-level mechanism that makes cross-engine attribute-based access control (ABAC) possible. Governance policies (row filters, column masks, tag-based policies) are evaluated by the catalog server during scan planning, and the engine receives a scan plan that already reflects those policies. The engine therefore reads only authorised data, even when it is not the catalog vendor's first-party compute.
Architectural shape
Iceberg client engine                       Iceberg REST catalog server
(Spark / Flink / Trino /                    (e.g., Unity Catalog)
 DuckDB / Snowflake / etc.)
  │                                           │
  │ ── plan-scan(table, filter, snapshot) ─►  │
  │                                           │
  │                                           │  evaluate ABAC policies
  │                                           │  against principal +
  │                                           │  table tags + filter
  │                                           │
  │                                           │  apply row filter
  │                                           │  + column mask UDFs
  │                                           │
  │                                           │  produce filtered scan
  │                                           │  plan (file list,
  │                                           │  residual filters,
  │                                           │  column projections)
  │                                           │
  │ ◄── filtered scan plan ─────────────────  │
  │                                           │
  │  read data files                          │
  │  apply residual filters                   │
  │  produce result set                       │
  │                                           │
The structurally important property is that policy evaluation lives on the catalog side, not the engine side. The engine is a consumer of an already-filtered scan plan. This makes the catalog the single point at which governance policies are evaluated for every engine that speaks the API, rather than each engine implementing its own policy-enforcement code path.
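The catalog-side flow in the diagram can be sketched as a toy in-memory planner. This is an illustration only: `ToyCatalog`, `Policy`, `ScanPlan`, and their fields are hypothetical names, not the Iceberg REST or Unity Catalog API, and the sketch simplifies a column mask to hiding the column and a row filter to partition-level pruning.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class DataFile:
    path: str
    region: str  # partition value; the toy row filter works per partition


@dataclass(frozen=True)
class Policy:
    allowed_regions: FrozenSet[str]  # stand-in for a row filter
    hidden_columns: FrozenSet[str]   # simplification: hide rather than mask


@dataclass(frozen=True)
class ScanPlan:
    files: List[str]
    columns: List[str]
    residual_filter: Optional[str]  # left for the engine to apply while reading


class ToyCatalog:
    """Hypothetical catalog. Not the Iceberg or Unity Catalog API."""

    def __init__(self, files: List[DataFile], policies: Dict[str, Policy]) -> None:
        self._files = files
        self._policies = policies

    def plan_scan(self, principal: str, columns: List[str],
                  region: Optional[str] = None,
                  residual: Optional[str] = None) -> ScanPlan:
        # Policy lookup happens here, on the catalog side, for every engine.
        policy = self._policies.get(principal)
        if policy is None:
            raise PermissionError(f"no policy grants access to {principal!r}")
        # Keep only files the principal may read and the query asked for.
        files = [
            f.path for f in self._files
            if f.region in policy.allowed_regions
            and (region is None or f.region == region)
        ]
        # Drop columns the policy hides from this principal.
        visible = [c for c in columns if c not in policy.hidden_columns]
        return ScanPlan(files=files, columns=visible, residual_filter=residual)


catalog = ToyCatalog(
    files=[DataFile("data/eu/0.parquet", "eu"), DataFile("data/us/0.parquet", "us")],
    policies={"analyst_eu": Policy(frozenset({"eu"}), frozenset({"ssn"}))},
)
plan = catalog.plan_scan("analyst_eu", ["id", "ssn", "amount"], residual="amount > 100")
```

In this sketch the engine never sees the US file or the `ssn` column, because neither appears in the plan it receives; the only work left to the engine is reading the listed files and applying the residual filter.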
Iceberg 1.11 floor for compatibility
The compatibility floor for ABAC-aware scan planning is Iceberg 1.11. From the announcing source:
"Any engine, such as Apache Spark or DuckDB, which implements the Iceberg REST catalog scan planning client (added in the Iceberg 1.11 release) can access data with ABAC enforced."
Engines on older Iceberg versions either don't implement the scan-planning client at all (and continue to plan scans client-side from metadata, bypassing server-side policy evaluation) or fall back to a non-ABAC-aware scan path. The Iceberg-1.11 floor effectively defines the cross-engine governance perimeter: engines below the floor cannot benefit from cross-engine ABAC even if they read the same tables.
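A minimal sketch of the floor check follows. The function names are hypothetical, and the only fact taken from the source is the 1.11 floor itself. Being at or above the floor is a necessary condition in this sketch, not a guarantee that a given engine actually implements the scan-planning client.

```python
from typing import Tuple

# Floor quoted in the announcing source for the scan-planning client.
SCAN_PLANNING_FLOOR: Tuple[int, int] = (1, 11)


def parse_version(version: str) -> Tuple[int, int]:
    """Parse 'major.minor[.patch...]' into (major, minor)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def meets_scan_planning_floor(iceberg_version: str) -> bool:
    # Compare as integer tuples: as strings, "1.9" would sort above "1.11".
    return parse_version(iceberg_version) >= SCAN_PLANNING_FLOOR
```

Comparing integer tuples rather than raw strings matters here because a lexical comparison would wrongly place 1.9 above 1.11.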
Why server-side scan planning matters architecturally
Before scan-planning APIs, the standard pattern for an Iceberg client was: (1) fetch the catalog's pointer to the current snapshot, (2) read the snapshot metadata files (manifest list, manifests) directly from object storage, (3) plan the scan locally (selecting data files, projecting columns, pruning partitions), entirely client-side. The catalog's role was metadata indirection only.
This pattern has three structural problems for governance:
- Policy enforcement requires a client-side library. Each engine has to implement row-filter / column-mask logic against the catalog's policy schema; an engine that does not implement it bypasses governance entirely.
- No central audit point for scan decisions. The catalog cannot observe which files / columns / rows an engine actually plans to read.
- Trust boundary at the engine. The catalog has to trust the engine to honour policies; a misbehaving or compromised client can read everything.
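The client-side pattern and its governance gap can be sketched with in-memory stand-ins for the catalog pointer and object storage. All names and paths below are hypothetical. Note that the planning function takes no principal at all, so nothing in this path can enforce or audit a policy.

```python
from typing import Dict, List

# Toy stand-ins: the catalog holds only a pointer; metadata lives in "storage".
CATALOG_POINTER: Dict[str, str] = {"db.orders": "snap-2"}
MANIFEST_LISTS: Dict[str, List[str]] = {"snap-2": ["manifest-a", "manifest-b"]}
MANIFESTS: Dict[str, List[dict]] = {
    "manifest-a": [{"path": "data/eu/0.parquet", "region": "eu"}],
    "manifest-b": [
        {"path": "data/us/0.parquet", "region": "us"},
        {"path": "data/us/1.parquet", "region": "us"},
    ],
}


def plan_scan_client_side(table: str, region: str) -> List[str]:
    snapshot = CATALOG_POINTER[table]        # (1) ask the catalog for the pointer
    manifests = MANIFEST_LISTS[snapshot]     # (2) read metadata from storage
    entries = [e for m in manifests for e in MANIFESTS[m]]
    # (3) prune locally; the catalog never learns which files were chosen
    return [e["path"] for e in entries if e["region"] == region]
```

A client that skips step (3) still holds every file path from step (2), which is the trust-boundary problem in the list above.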
Server-side scan planning resolves all three. The catalog knows what each engine intends to read, applies policies centrally, and returns only authorised files / columns / rows in the plan. Trust shifts from the engine to the catalog.
One question remains open: does the catalog still trust the engine to honour the residual parts of the plan (e.g., column-mask UDFs that must run at the engine on the data files), or are all masks applied catalog-side before the file list is returned? The announcing source does not specify.
Beyond ABAC: the broader scan-planning role
While the announcing source emphasises ABAC as the canonical use case for the scan-planning API on Unity Catalog, server-side scan planning has broader implications:
- Catalog-mediated optimisation. The catalog knows table-level statistics that each engine would otherwise have to recompute. A central planner can produce a better-optimised scan than per-engine planning against the same metadata.
- Audit substrate. Every scan-plan request becomes an auditable event with principal, table, predicate, projected columns. This is the substrate for query-level audit logs without engine cooperation.
- Policy caching. The catalog can cache policy evaluations per (principal, table, query-fingerprint) to amortise the policy-evaluation cost across repeat queries.
These broader implications are not addressed in the announcing source — only ABAC enforcement is named — but the API's architectural shape supports them.
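The audit and caching ideas above can be sketched together. This is a speculative illustration, since the source describes neither: `AuditedPlanner`, the fingerprint scheme, and the placeholder policy decision are all assumptions, and cache invalidation on policy change is deliberately left out.

```python
import hashlib
from typing import Dict, List, Tuple


class AuditedPlanner:
    """Hypothetical planner that logs every request and caches decisions."""

    def __init__(self) -> None:
        self.audit_log: List[dict] = []
        self.evaluations = 0  # counts real policy evaluations (cache misses)
        self._cache: Dict[Tuple[str, str, str], str] = {}

    @staticmethod
    def fingerprint(predicate: str, columns: List[str]) -> str:
        # Sort columns so projection order does not change the fingerprint.
        canonical = predicate + "|" + ",".join(sorted(columns))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

    def _evaluate_policy(self, principal: str, table: str) -> str:
        self.evaluations += 1
        return f"allow:{principal}:{table}"  # placeholder decision

    def plan(self, principal: str, table: str,
             predicate: str, columns: List[str]) -> str:
        # Every request is recorded, whether or not the cache is hit.
        self.audit_log.append({
            "principal": principal, "table": table,
            "predicate": predicate, "columns": list(columns),
        })
        key = (principal, table, self.fingerprint(predicate, columns))
        if key not in self._cache:
            self._cache[key] = self._evaluate_policy(principal, table)
        return self._cache[key]


planner = AuditedPlanner()
planner.plan("alice", "db.orders", "amount > 100", ["id", "amount"])
planner.plan("alice", "db.orders", "amount > 100", ["amount", "id"])  # cache hit
planner.plan("bob", "db.orders", "amount > 100", ["id", "amount"])    # new principal
```

Keeping the audit write outside the cache check is the design point: caching amortises policy evaluation, but it must not suppress the audit record for a repeated query.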
Caveats
- Wire-protocol details not in the announcing source. The exact request / response schema for plan-scan is not described in the announcing post; readers must consult the Iceberg 1.11 spec / docs for the protocol.
- Latency cost undisclosed. Server-side scan planning adds a catalog-side compute step on the query path; the announcing source provides no latency / throughput numbers for plan-scan.
- Not all ABAC policy types may be enforceable at scan-plan time. Row filters and simple column masks are clearly compatible with file-list / projection-pruning output; column masks that require row-level UDF evaluation (e.g., custom hash / format-preserving-encryption masks) may need to be applied at engine-read time. The post does not clarify which masks are server-side vs client-side.
- First production canonical instance is Unity Catalog. Other Iceberg REST catalog implementations (Apache Polaris, Snowflake Open Catalog, AWS Glue) may or may not support scan-planning + policy evaluation in the same way. The announcing source positions UC as the only catalog that delivers cross-engine ABAC via this API.
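The third caveat, on masks that need row-level evaluation, can be made concrete with a small sketch. The hash mask below must touch every value, so it cannot be expressed as a shorter file list or a narrower projection; something has to run it while rows are being read. `hash_mask` and `read_with_masks` are hypothetical helpers, and the sketch takes no position on whether the catalog or the engine runs them in practice.

```python
import hashlib
from typing import Callable, Dict, Iterable, Iterator


def hash_mask(value: str) -> str:
    """Toy mask: replace a value with a truncated SHA-256 digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def read_with_masks(rows: Iterable[dict],
                    masks: Dict[str, Callable[[str], str]]) -> Iterator[dict]:
    # The mask runs once per row per masked column, at read time.
    for row in rows:
        yield {
            col: masks[col](val) if col in masks else val
            for col, val in row.items()
        }


rows = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": "b@example.com"}]
masked = list(read_with_masks(rows, {"email": hash_mask}))
```

Every row keeps its `id` untouched while `email` is replaced by a digest, which shows why this kind of mask is a per-value computation rather than a planning-time pruning decision.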
Seen in
- sources/2026-05-28-databricks-advancing-apache-iceberg-on-databricks-iceberg-v3-ga-open-sharing-and-unified-governance — Canonical wiki disclosure. The Iceberg-1.11 scan-planning client named as the compatibility floor for cross-engine ABAC; the catalog-evaluates-policies-during-server-side-scan-planning shape canonicalised verbatim.
Source
- Original announcement: https://www.databricks.com/blog/unity-catalog-and-next-era-apache-icebergtm
- Iceberg spec / REST catalog docs: https://iceberg.apache.org/
Related
- systems/apache-iceberg — table format the API serves.
- systems/unity-catalog — first canonical catalog implementation of policy-aware scan planning.
- systems/unity-catalog-abac — UC's ABAC engine; cross-engine extension uses this API.
- systems/uc-credential-vending — sibling open API at the auth boundary.
- concepts/cross-engine-abac — the governance shape this API enables.
- concepts/attribute-based-access-control — ABAC concept; cross-engine extension is a wire-protocol move.
- patterns/scan-planning-as-policy-enforcement-point — the architectural pattern: the catalog's scan-planning request is the policy-evaluation chokepoint.