SYSTEM Cited by 1 source

Lakebase MCP

Lakebase MCP is an open-source Model Context Protocol server exposing 46 tools to MCP-capable AI agents (Claude / Copilot / GPT) for operating on Lakebase Postgres databases. It is deployed as a Databricks App and governed by the same Unity Catalog grants and audit trail as the underlying Lakebase databases. Repository: github.com/suryasai87/lakebase-mcp (authored by the Thoughtworks team behind the Backstage-with-Lakebase series).

Stub page. The first wiki source presents Lakebase MCP as what the DBA does on top of the platform in the Lakebase governance story: the operator-intent-driven counterpart to LakebaseOps's automated platform-side operations. "The DBA stops opening pgAdmin and starts describing intent." (Source: sources/2026-05-15-databricks-backstage-with-lakebase-part-2)

Two design choices that keep agent-DB access safe

Verbatim from the 2026-05-15 source: "Two design choices keep this safe. First, dual-layer governance: a SQL-statement guard and a per-tool access guard, with four pre-built profiles (read_only, analyst, developer, admin) that map onto the same UC access patterns shown above. A coding assistant runs as read_only and physically cannot drop a table. Second, every query is attributable — the server tags every statement with the originating tool."

Layer 1: SQL-statement guard

A statement-level guard inspects every SQL statement before it reaches Lakebase, rejecting (or rewriting) statements that violate the active profile's allowed-statement-class policy. The source does not disclose how statements are classified (e.g. AST-based vs regex-based vs SQL-grammar-based parsing), but the contract is explicit: "physically cannot drop a table" under read_only. Canonical half of patterns/dual-layer-governance-sql-and-tool-guards.
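Because the classification mechanism is undisclosed, the following is only a minimal sketch of what a statement-class guard could look like, assuming a leading-keyword classifier that fails closed on anything it does not recognise. The statement classes, the `ALLOWED` policy table, and the names `leading_keyword` and `check_statement` are hypothetical and are not taken from the Lakebase MCP repository; a production guard would more plausibly use a real SQL parser.

```python
import re

# Hypothetical mapping from a statement's leading keyword to a statement class.
STATEMENT_CLASS = {
    "SELECT": "read", "SHOW": "read", "EXPLAIN": "read",
    "INSERT": "dml", "UPDATE": "dml", "DELETE": "dml",
    "CREATE": "ddl", "ALTER": "ddl", "DROP": "ddl", "TRUNCATE": "ddl",
    "GRANT": "admin", "REVOKE": "admin",
}

# Hypothetical per-profile policy (assumption: not the project's real policy).
ALLOWED = {
    "read_only": {"read"},
    "analyst": {"read"},
    "developer": {"read", "dml", "ddl"},
    "admin": {"read", "dml", "ddl", "admin"},
}


class StatementRejected(Exception):
    """Raised when a statement's class is not allowed for the profile."""


def leading_keyword(sql: str) -> str:
    """Return the first keyword, skipping leading whitespace and comments."""
    text = sql.lstrip()
    while text.startswith("--") or text.startswith("/*"):
        if text.startswith("--"):
            _, _, text = text.partition("\n")
        else:
            _, _, text = text.partition("*/")
        text = text.lstrip()
    match = re.match(r"[A-Za-z]+", text)
    return match.group(0).upper() if match else ""


def check_statement(profile: str, sql: str) -> None:
    """Reject the input unless every ';'-separated fragment is allowed.

    Splitting on ';' is deliberately naive: a semicolon inside a string
    literal yields an unrecognised fragment, which is rejected (fail closed).
    """
    allowed = ALLOWED[profile]
    for fragment in sql.split(";"):
        if not fragment.strip():
            continue
        cls = STATEMENT_CLASS.get(leading_keyword(fragment), "unknown")
        if cls not in allowed:
            raise StatementRejected(f"{profile} may not run a {cls} statement")
```

Under this sketch, `check_statement("read_only", "SELECT 1; DROP TABLE t")` raises `StatementRejected`, while `check_statement("admin", "DROP TABLE t")` returns normally. Keyword matching over-rejects (for example, any `WITH` query is treated as unknown), which is the safe direction for a guard but one reason a grammar-aware parser would be preferable.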

Layer 2: Per-tool access guard

A tool-level guard inspects which of the 46 MCP tools the calling agent is allowed to invoke under the active profile. The "physically cannot" property is layered: the SQL-statement guard catches what the per-tool guard's coverage misses, and vice versa. Canonical other half of patterns/dual-layer-governance-sql-and-tool-guards.
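A per-tool guard of this kind can be sketched as an allow-list lookup performed before a tool handler is dispatched. This is an illustration under stated assumptions, not the server's implementation: the source does not enumerate the 46 tools, so every tool name below, the `PROFILE_TOOLS` table, and the `authorize_tool` and `dispatch` helpers are hypothetical.

```python
from typing import Any, Callable, Dict

# Hypothetical tool names; the source does not list the real 46 tools.
_READ = frozenset({"list_tables", "describe_table", "run_query"})
_ANALYST = _READ | {"explain_query"}
_DEVELOPER = _ANALYST | {"create_branch", "apply_migration"}
_ADMIN = _DEVELOPER | {"drop_table", "manage_roles"}

# Hypothetical profile-to-tools allow-list.
PROFILE_TOOLS = {
    "read_only": _READ,
    "analyst": _ANALYST,
    "developer": _DEVELOPER,
    "admin": _ADMIN,
}


class ToolDenied(PermissionError):
    """Raised when the active profile may not invoke the requested tool."""


def authorize_tool(profile: str, tool: str) -> None:
    """Fail closed: unknown profiles and unlisted tools are both denied."""
    allowed = PROFILE_TOOLS.get(profile)
    if allowed is None or tool not in allowed:
        raise ToolDenied(f"profile {profile!r} may not call tool {tool!r}")


def dispatch(profile: str, tool: str,
             handlers: Dict[str, Callable[..., Any]], **kwargs: Any) -> Any:
    """Check the allow-list first, then invoke the tool's handler."""
    authorize_tool(profile, tool)
    return handlers[tool](**kwargs)
```

In this sketch a `read_only` caller asking for `drop_table` is refused before any SQL is generated, so the SQL-statement guard only has to catch what slips through tools the profile is allowed to call, such as a free-form query tool.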

Four pre-built access profiles

| Profile | Use | Example |
|---|---|---|
| read_only | Coding assistants | Cannot drop a table |
| analyst | Read + analyse, no DDL or DML | Reporting / debugging |
| developer | Schema changes on dev branches | Branch authoring |
| admin | Full surface | DBA-on-top-of-the-platform |

Each profile "maps onto the same UC access patterns shown [earlier in the post]" — i.e. the four profiles aren't a separate permission system; they're a four-tier preset over the same UC GRANT model that governs the Lakebase database.

Per-statement tool-tag attribution

Verbatim: "every query is attributable — the server tags every statement with the originating tool." The mechanism (where the tag lives in the SQL stream — SET application_name? a structured comment? a UC-level tag separate from the SQL?) is not disclosed, but the composition claim is: "Combined with the branch-level cost attribution shown earlier, you can answer 'which agent on which branch generated the 4 AM CPU spike?' in one SQL query." The structural property: agent identity propagates to every SQL statement as a server-side tag, joining the same system.access.audit + UC system billing substrate that humans + branches already populate. Canonical instance of patterns/tool-tagged-query-attribution.
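Since the source leaves the tag's location open, the sketch below picks just one of the candidate mechanisms, a structured leading SQL comment, purely to illustrate how a per-statement tag could be attached and later recovered. The comment format and the names `tag_statement` and `extract_tag` are assumptions made for illustration and are not confirmed by the source.

```python
import re
from typing import Dict, Optional

# Restrict tag values so a value can never close the comment early.
_SAFE_VALUE = re.compile(r"[A-Za-z0-9_]+")
# Hypothetical tag format: /* mcp_tool=<tool> profile=<profile> */ <sql>
_TAG = re.compile(r"^/\* mcp_tool=([A-Za-z0-9_]+) profile=([A-Za-z0-9_]+) \*/ ")


def tag_statement(sql: str, tool: str, profile: str) -> str:
    """Prefix a statement with a structured comment naming the tool."""
    for value in (tool, profile):
        if not _SAFE_VALUE.fullmatch(value):
            raise ValueError(f"unsafe tag value: {value!r}")
    return f"/* mcp_tool={tool} profile={profile} */ {sql}"


def extract_tag(tagged_sql: str) -> Optional[Dict[str, str]]:
    """Recover the tag from a statement, or None if it carries no tag."""
    match = _TAG.match(tagged_sql)
    if match is None:
        return None
    return {"tool": match.group(1), "profile": match.group(2)}
```

For example, `tag_statement("SELECT 1", "run_query", "read_only")` produces `/* mcp_tool=run_query profile=read_only */ SELECT 1`, and `extract_tag` on that string returns the tool and profile. Whether a comment-based tag would survive connection pooling or appear in the audit tables is exactly the kind of detail the source leaves open.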

What it composes with

  • LakebaseOps: the platform-side leg of the role-shift pair. "LakebaseOps runs for the team. Lakebase MCP runs with the team. Both inherit the governance posture."
  • M2M OAuth: the agent-side authentication primitive that Lakebase requires (canonicalised in the Part 1 source); the MCP server's tool-tag attribution layers on top of the OAuth-identity-tagged audit and billing records.
  • Branch-propagated masking: an MCP-driven agent never sees unmasked production data because the masking propagates at branch creation time, independent of the agent's access profile.

What's not disclosed

  • Wire-protocol-level mechanism for SQL-statement guard (AST parser? regex match? Postgres-grammar wrapper?).
  • Where the tool tag lives in the SQL stream (SET application_name / structured comment / out-of-band UC tag?).
  • How tool-tag attribution survives connection pooling / batch statements / multi-statement transactions.
  • The full enumeration of 46 tools (DDL? DML? branch lifecycle? query-plan inspection?).
  • Per-profile latency overhead vs unguarded direct-Postgres access.
  • Behaviour under profile-policy update mid-session.
  • How the SQL-statement guard interacts with the per-tool access guard at the layered-defence boundary (shadow-mode? fail-closed? policy precedence?).

Seen in

  • sources/2026-05-15-databricks-backstage-with-lakebase-part-2: First canonical wiki disclosure of Lakebase MCP. Open-source 46-tool MCP server for AI-agent access to Lakebase Postgres, deployed as a Databricks App, with dual-layer governance (SQL-statement guard + per-tool access guard) and four pre-built profiles (read_only / analyst / developer / admin) mapping onto the same UC GRANT model. "A coding assistant runs as read_only and physically cannot drop a table." Per-statement tool-tag attribution makes "which agent on which branch generated the 4 AM CPU spike?" a one-SQL query against system.access.audit + system billing tables. Authored by Thoughtworks (suryasai87). Canonical instances of patterns/dual-layer-governance-sql-and-tool-guards + patterns/tool-tagged-query-attribution.