
CONCEPT Cited by 1 source

Default-closed table allowlist

A default-closed table allowlist is a data-platform governance posture in which tables are inaccessible for querying until they have been reviewed, inverting the "open by default, restrict by exception" posture that most data warehouses ship with. New tables enter the platform in a pending state; an automated PII / data-classification scanner classifies columns; a human reviewer approves, overrides, or denies; only then is the table queryable.

The pattern is canonicalised in this wiki by Town Lake — Cloudflare's data platform — as documented in the 2026-05-28 launch post:

"Town Lake takes the opposite approach. Tables are inaccessible for querying until they have been reviewed. When a new database is connected to Trino or a new table is created, Skimmer scans it, classifies its columns, and registers it in the central allowlist as pending. Until a reviewer approves the table, and the specific columns within it, users can't query it."
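The lifecycle described in the quote can be sketched as a small state machine. This is a minimal illustration, not Town Lake's implementation: the `Allowlist` class, its method names, and the in-memory storage are all hypothetical, and only the pending-until-approved behaviour comes from the source.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ReviewState(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DENIED = "denied"


@dataclass
class TableEntry:
    # Per-column review state; every newly seen column starts as PENDING.
    columns: Dict[str, ReviewState] = field(default_factory=dict)


class Allowlist:
    """Hypothetical in-memory allowlist; anything unknown is treated as closed."""

    def __init__(self) -> None:
        self._tables: Dict[str, TableEntry] = {}

    def register(self, table: str, columns: List[str]) -> None:
        # New tables and new columns enter as PENDING.
        # Decisions already made for existing columns are left untouched.
        entry = self._tables.setdefault(table, TableEntry())
        for col in columns:
            entry.columns.setdefault(col, ReviewState.PENDING)

    def review(self, table: str, column: str, approve: bool) -> None:
        # A reviewer's decision moves a column out of PENDING.
        state = ReviewState.APPROVED if approve else ReviewState.DENIED
        self._tables[table].columns[column] = state

    def can_query(self, table: str, column: str) -> bool:
        # Default-closed: only an explicit APPROVED makes a column queryable.
        entry = self._tables.get(table)
        if entry is None:
            return False
        return entry.columns.get(column) is ReviewState.APPROVED
```

The key design choice is that `can_query` has a single path to `True`. An unregistered table, a pending column, and a denied column all fall through to the same closed answer.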

The traditional posture vs the inversion

| Posture | Default state | Cost |
| --- | --- | --- |
| Open by default | New tables queryable by anyone with general data-warehouse access | Sensitive data leaks before anyone notices; audits are reactive; reviewer toil happens after exposure |
| Default-closed allowlist | New tables inaccessible until reviewed | Reviewer toil happens before any access; classifier needs to be fast + good; self-serve workflow is mandatory or it becomes operationally hostile |

The Town Lake post is explicit that the trade is real but manageable: "This sounds painful, and it would be, except for two things. First, it's automated... Second, the workflow is self-serve."

Two affordances make default-closed tractable

1. Automated classification with high recall

A PII / data-classification scanner like Skimmer needs to "catch obvious PII (emails, IPs, names, phone numbers) and the long tail of non-obvious sensitive data (API tokens that match certain prefixes, opaque IDs that can be traced back to users)."

The Cloudflare implementation uses a two-pass classifier — fast per-column first pass, then an agentic second pass with full table context that can query the data directly to verify. This is the recall-vs-cost trade the platform makes: per-column classification is cheap; agentic verification is expensive but only runs for flagged columns.
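The control flow of a two-pass classifier can be sketched as follows. This is an illustration under stated assumptions, not Skimmer's code: the regex rules, the `classify_table` function, and the `verify` callback (a stand-in for the agentic second pass) are all hypothetical. The only property taken from the source is that the expensive pass runs solely on columns the cheap pass flagged.

```python
import re
from typing import Callable, Dict, List, Optional

# Hypothetical first-pass rules: cheap regexes applied to sampled values.
FIRST_PASS_RULES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "ipv4": re.compile(r"^(\d{1,3}\.){3}\d{1,3}$"),
}

Table = Dict[str, List[str]]  # column name -> sampled values
# Stand-in for the second pass: receives the column, the proposed label,
# and the full table as context, and returns True to confirm the label.
Verifier = Callable[[str, str, Table], bool]


def first_pass(samples: List[str]) -> Optional[str]:
    """Return the first label whose pattern matches any sampled value."""
    for label, pattern in FIRST_PASS_RULES.items():
        if any(pattern.match(value) for value in samples):
            return label
    return None


def classify_table(table: Table, verify: Verifier) -> Dict[str, Optional[str]]:
    """Cheap scan on every column; the verifier runs only on flagged ones."""
    result: Dict[str, Optional[str]] = {}
    for column, samples in table.items():
        label = first_pass(samples)
        if label is not None and not verify(column, label, table):
            label = None  # the second pass rejected the first-pass flag
        result[column] = label
    return result
```

In this sketch a column such as `build_version` holding `1.2.3.4` would be flagged as `ipv4` by the first pass, and a verifier with table context could then reject it. Columns that the first pass never flags cost nothing beyond the regex scan.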

2. Self-serve permission requests

"If you query a table you don't have access to, the error message is not 'permission denied.' It's 'this table needs review, click here to request one.' Skipper, the AI agent, will even suggest the right RBAC group to request and link you straight to it."

Without this, default-closed becomes operationally hostile: users hit denials and file tickets. The error-message-as-self-serve-request shape converts denials into a workflow, not a wall. Canonicalised at patterns/error-message-as-self-serve-permission-request.
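A denial that carries its own next step might be built like this. The sketch is hypothetical throughout: the `example.com` URLs, the `denial_message` function, and its parameters are placeholders for illustration, and the group suggestion is passed in rather than computed by an agent.

```python
from typing import Optional
from urllib.parse import quote

# Hypothetical endpoints; a real platform would link to its own review tooling.
REVIEW_BASE_URL = "https://governance.example.com/review"
ACCESS_BASE_URL = "https://governance.example.com/access"


def denial_message(table: str, reviewed: bool,
                   suggested_group: Optional[str] = None) -> str:
    """Build an error message that tells the user what to do next."""
    if not reviewed:
        # The table itself is still pending: point the user at a review request.
        url = f"{REVIEW_BASE_URL}?table={quote(table, safe='')}"
        return f"Table {table} needs review. Request one here: {url}"
    if suggested_group:
        # The table is reviewed but the user lacks the right group membership.
        url = f"{ACCESS_BASE_URL}?group={quote(suggested_group, safe='')}"
        return (f"You do not have access to {table}. "
                f"Request membership of {suggested_group}: {url}")
    return f"You do not have access to {table}. Contact the table owner."
```

The point of the shape is that every branch ends in an action the user can take. A bare "permission denied" string never leaves this function.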

Schema discovery is decoupled from data access

A subtle but load-bearing affordance: users can see what tables exist even before the tables are reviewed.

"We separate schema discovery from data access. Users can see what tables exist, but unreviewed columns are hidden from DESCRIBE and SHOW COLUMNS and from SELECT *. That subtle distinction matters: it means a new unreviewed column doesn't break existing dashboards built on the rest of an approved table."

Canonicalised at concepts/schema-discovery-vs-data-access-separation. Without this property, every schema migration would break dashboards until review caught up — which would force teams to either delay schema migrations (slow) or skip the review (defeats the governance posture).
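The dashboard-stability property can be illustrated with a toy catalog. This is a sketch under simplifying assumptions: the catalog is a plain dict mapping each column to an approved flag, the function names are hypothetical, and identifier quoting is omitted for brevity.

```python
from typing import Dict, List

# Hypothetical catalog shape: table name -> {column name: approved?}
Catalog = Dict[str, Dict[str, bool]]


def list_tables(catalog: Catalog) -> List[str]:
    # Discovery: every table is listed, reviewed or not.
    return sorted(catalog)


def visible_columns(catalog: Catalog, table: str) -> List[str]:
    # What a DESCRIBE-style listing would show: approved columns only.
    return [col for col, approved in catalog[table].items() if approved]


def expand_select_star(catalog: Catalog, table: str) -> str:
    # SELECT * expands to approved columns, so unreviewed ones never appear.
    cols = visible_columns(catalog, table)
    if not cols:
        raise PermissionError(f"table {table} has no reviewed columns")
    return f"SELECT {', '.join(cols)} FROM {table}"
```

With a catalog where `orders` has approved columns `id` and `total`, adding an unreviewed `customer_email` column leaves the output of `expand_select_star` unchanged, which is why a dashboard built on the approved columns keeps working while the new column waits for review.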

