PATTERN Cited by 1 source

Tool-tagged query attribution¶

Pattern¶

When an MCP server (or equivalent agent-facing broker) brokers AI-agent access to a database, tag every SQL statement at the broker layer with the identity of the originating tool before forwarding it to the database. The tag joins the platform's audit substrate and billing substrate alongside the OAuth user identity and branch identity, making "which agent on which branch generated the 4 AM CPU spike?" a one-SQL query rather than an opaque blob of traffic from a single MCP service principal.

Canonical instance¶

Lakebase MCP in Backstage with Lakebase, Part 2: Governance (Thoughtworks, 2026-05-15).

Verbatim from the source: "every query is attributable — the server tags every statement with the originating tool. … Combined with the branch-level cost attribution shown earlier, you can answer 'which agent on which branch generated the 4 AM CPU spike?' in one SQL query."

The pattern composes onto the same (project_id, branch_id, endpoint_id) identity space that concepts/branch-level-cost-attribution and concepts/operational-analytical-governance-unification use. After tagging, the audit + billing substrate carries:

(user_oauth_identity, project_id, branch_id, endpoint_id, tool_tag, sql_text)

Each row is queryable directly against system.access.audit + Unity Catalog system billing tables.

Why this matters for AI-agent DB access¶

Without tool-tag attribution, an MCP server is an identity opacity surface: every statement reaches the database under the server's own service principal, so the audit log shows traffic from the MCP server but not which agent ran which tool. This is the AI-agent-DB-access version of the long-standing "shared service account" anti-pattern.

The tagging step decouples server identity from agent + tool identity: the database accepts the connection as the MCP server's service principal (necessary for connection pooling and shared infra), but the audit substrate sees the originating tool through the per-statement tag. The structural property: agent identity propagates as data on the statement, even though the connection identity is shared.

What it composes with¶

patterns/dual-layer-governance-sql-and-tool-guards: the dual-layer pattern's enforcement leg. After the SQL guard
per-tool guard have decided to let a statement through, this pattern is the observability leg that records what got through. Without tagging, the dual-layer pattern enforces but doesn't observe.
concepts/branch-level-cost-attribution: the cost-side composition. Once the tool tag joins (project_id, branch_id, endpoint_id) in the audit and billing substrates, "which agent on which branch generated the 4 AM CPU spike?" becomes a single SQL query. Without tagging, agent cost is attributable to the MCP service principal — useful but coarse.
[[patterns/foreign-catalog-federation-for-operational-db- governance]]: the broader substrate this pattern's tags land in. The foreign-catalog federation makes the operational DB's audit log live in UC; tool-tag attribution makes the audit log carry agent-resolution.

When it fits¶

An MCP server or equivalent broker layer mediates AI-agent access to a database (or any other auditable resource).
The audit/billing substrate carries free-form attribution fields the broker can stamp (e.g., a tag column, a structured comment, an application_name-equivalent that survives to the audit log).
Operations teams need agent-resolution for cost or audit queries — typically when multiple agents share an MCP profile, or when AI-driven traffic is a non-trivial share of overall database load.

When it doesn't fit¶

Direct agent-to-database access without a broker. There's no tagging point if the MCP layer isn't there; the database has to extract agent identity from the connection itself, which typically isn't possible.
The audit substrate doesn't preserve per-statement metadata through to the queryable log. Some Postgres deployments truncate application metadata at logging; the pattern's value drops if the tag is lost between broker and audit table.
Single-agent deployments. If the MCP server brokers exactly one agent's traffic, the tool tag adds operational overhead with no attribution payoff — the agent identity is already implied by the broker identity.

Anti-patterns¶

Tag at the connection level only. Setting application_name once per connection means all statements in a pooled connection inherit the same tag, so multi-tool agents appear as a single tool to the audit log. The pattern requires per-statement tagging.
Tag in a place audit doesn't see. Adding a tag to a client-side log without ensuring it reaches the server-side audit substrate gives the appearance of attribution while the audit substrate is still opaque. The tag has to land in system.access.audit (or equivalent) to satisfy the "one SQL query" claim.
Trust the agent to self-tag. The tagging point must be the broker, not the agent — a malicious or misconfigured agent spoofs another tool's tag to evade attribution. The MCP server is the trusted attribution boundary precisely because it sees which tool was actually invoked.

Caveats¶

The 2026-05-15 source asserts the per-statement tagging capability but does not disclose: where the tag lives in the SQL stream (SET application_name? structured SQL comment? out-of-band metadata to the audit ingest path?), how it interacts with connection pooling, whether it survives the SQL-statement-guard rewriting layer, or how multi-statement transactions carry the tag across statements.
The composability claim with branch-level cost attribution ("which agent on which branch generated the 4 AM CPU spike") is mechanically plausible if the tag reaches both system.access.audit and the system billing tables, but the exact join schema is not disclosed.

Seen in¶

sources/2026-05-15-databricks-backstage-with-lakebase-part-2 — First canonical wiki disclosure of the tool-tagged-query- attribution pattern. Lakebase MCP tags every SQL statement with the originating tool before forwarding it to Lakebase Postgres. Composes with branch-level cost attribution + UC's unified audit substrate so "which agent on which branch generated the 4 AM CPU spike?" becomes a one-SQL query against system.access.audit joined to UC system billing tables.