PINTEREST Tier 2

Pinterest — Building an MCP Ecosystem at Pinterest

Tan Wang (Software Engineer, Agent Foundations) publishes Pinterest's first-party retrospective on one year of Model Context Protocol adoption — from experimentation to a production ecosystem with 66,000 invocations/month across 844 MAUs, saving an estimated 7,000 engineer-hours/month. Pinterest's opinionated choices are the substance: hosted-not-local MCP servers, multiple small domain-specific servers (not one monolithic server), a central MCP registry as source of truth for production-approval + discovery, layered JWT + SPIFFE auth with a @authorize_tool decorator for per-tool policy, business-group-based access gating on sensitive servers (Presto / Ads), and elicitation-gated human-in-the-loop for mutating actions. The post also quietly names Pinterest's paved-path deployment pipeline — teams define tools; the platform handles deployment + scaling.

Summary

Pinterest's MCP deployment is a governance-first retrospective disguised as an architecture post. Five architectural choices dominate, each explicitly justified:

  1. Hosted over local: "Although MCP supports local servers (running on your laptop or personal cloud development box, communicating over stdio), we explicitly optimized for internal cloud-hosted MCP servers, where our internal routing and security logic can best be applied." Local remains possible for experimentation; the paved path is cloud-deployed. Sibling to Fly.io's local-MCP-risk framing but arriving from the enterprise-ops side — centralised logging, auth, governance demand central execution.

  2. Multiple small servers over one monolith: "We debated a single monolithic MCP server vs. multiple domain-specific servers. We chose the latter: multiple MCP servers (e.g., Presto, Spark, Airflow) each own a small, coherent set of tools. This lets us apply different access controls per server and avoid crowding the model's context." Per-server access control + context-window hygiene are named jointly as reasons — the same tool-surface-minimization reasoning that drives Cloudflare's Code Mode and Dropbox's unified-retrieval-tool, applied at the server-decomposition axis.

  3. Unified deployment pipeline as the paved path: "a common piece of feedback we received early on was that spinning up a new MCP server required too much work: deployment pipelines, service configuration, and operational setup before writing any business logic. To address this, we created a unified deployment pipeline that handles infrastructure for all MCP servers: teams define their tools and the platform handles deployment and scaling of their service." Canonical new patterns/unified-mcp-deployment-pipeline pattern — an internal platform-engineering investment that collapses the MCP-server authoring loop to business-logic-only.

  4. Central MCP registry as source of truth: "The MCP registry is the source of truth for which MCP servers are approved and how to connect to them." Two surfaces: a Web UI for humans (discovery + owning team + support channels + security posture + live status + visible tools) and an API for AI clients (validate servers + "Is this user allowed to use server X?" pre-flight authorization). "This is also the backbone for governance: only servers registered here count as approved for use in production." Canonical new concepts/mcp-registry concept and the substrate for Pinterest's MCP ecosystem pattern.

  5. Layered JWT + SPIFFE auth: "almost every MCP call is governed by two layers of auth: end-user JWTs and mesh identities." The post explicitly contrasts this with the MCP OAuth spec's per-server consent-screen flow: "users already authenticate against our internal auth stack when they open a surface like the AI chat interface, so we piggyback on that existing session. There is no additional login prompt or consent dialog when a user invokes an MCP tool." Envoy validates JWTs, maps them to X-Forwarded-User / X-Forwarded-Groups / related headers, enforces coarse policies; inside the server, a @authorize_tool(policy='…') decorator enforces fine-grained per-tool rules. Service-only flows fall back to SPIFFE-based mesh identity for low-risk read-only cases. Canonical new patterns/layered-jwt-plus-mesh-auth pattern.

Three sub-architectural substrates complete the picture: (a) integrations into the LLM web chat interface ("used by the majority of Pinterest employees daily"), AI bots on the internal chat platform (with per-channel tool restrictions — "Spark MCP tools are only available in Airflow support channels"), and IDE plugins, each handling OAuth transparently via the registry API; (b) business-group-based access gating on sensitive servers — "even though the Presto MCP server is technically reachable from broad surfaces like our LLM web chat interface, only a specific set of approved business groups (for example, Ads, Finance, or specific infra teams) can establish a session and run the higher-privilege tools" — canonical new concepts/business-group-authorization-gating concept; (c) human-in-the-loop + MCP elicitation for mutating actions — "agents propose actions using MCP tools, and humans approve or reject (optionally in batches) before execution. We also use elicitation to confirm dangerous actions." Same elicitation gate Cloudflare's Agent Lee operationalises via Durable Objects, here formalised in agent-guidance + the MCP protocol primitive.

Observability + success metric: "All MCP servers at Pinterest use a set of library functions that provide logging for inputs/outputs, invocation counts, exception tracing, and other telemetry for impact analysis out of the box." Ecosystem-level metrics: number of MCP servers, number of tools, invocation count, estimated time-savings per invocation (owner-supplied metadata). Roll-up: time saved as north-star. "As of January 2025, MCP servers have ramped up to 66,000 invocations per month across 844 monthly active users. Using these estimates, MCP tools are saving on the order of 7,000 hours per month."

Security-review process: "Every MCP server that is not a one-off experiment must be tied to an owning team, appear in the internal MCP registry, and go through review, yielding Security, Legal/Privacy, and (where applicable) GenAI review tickets that must be approved before production use." The review output determines per-server access policies — which user groups can reach the server — enforced in Envoy.

Key takeaways

  1. The paved path is "cloud-hosted MCP server in a company-standard deployment pipeline" — not stdio-on-laptop. "Although MCP supports local servers (running on your laptop or personal cloud development box, communicating over stdio), we explicitly optimized for internal cloud-hosted MCP servers, where our internal routing and security logic can best be applied. Local MCP servers are still possible for experimentation, but the paved path is 'write a server, deploy it to our cloud compute environment, list it in the registry.'" Canonical wiki statement of the enterprise-ops inversion of Fly.io's local-MCP-risk framing: local is for experimentation, cloud-hosted is for production, and the delta is central execution → central auth + logging + policy surface. First wiki datapoint on concepts/hosted-vs-local-mcp-server as a deliberate architectural choice axis. (Source: sources/2026-03-19-pinterest-building-an-mcp-ecosystem-at-pinterest)

  2. Many small domain-specific MCP servers beat one monolithic server — per-server access control + context-window hygiene are the named reasons. "We chose the latter: multiple MCP servers (e.g., Presto, Spark, Airflow) each own a small, coherent set of tools. This lets us apply different access controls per server and avoid crowding the model's context." Canonical wiki instance applying patterns/tool-surface-minimization at the server-decomposition axis — one tool-list per domain, not one mega-list spanning domains. Pairs with the Pinterest-internal registry's pre-flight authorization API to enforce per-user visibility: an agent sees only the MCP servers its user is authorised to use. (Source: sources/2026-03-19-pinterest-building-an-mcp-ecosystem-at-pinterest)

  3. A unified MCP deployment pipeline collapses the authoring loop to business-logic-only. "To address this, we created a unified deployment pipeline that handles infrastructure for all MCP servers: teams define their tools and the platform handles deployment and scaling of their service. This lets domain experts focus on their business logic rather than figuring out deployment mechanics." Canonical new patterns/unified-mcp-deployment-pipeline — the specific form of platform-engineering investment for the MCP-server authoring surface, sibling to Cloudflare's Code Mode code-generation pipeline and Meta's unified capacity-efficiency-platform MCP tooling layer. The ergonomic pain point named explicitly: "spinning up a new MCP server required too much work: deployment pipelines, service configuration, and operational setup before writing any business logic." (Source: sources/2026-03-19-pinterest-building-an-mcp-ecosystem-at-pinterest)
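
The authoring surface this implies can be sketched in a few lines. Everything below is illustrative, not Pinterest's actual API — the post discloses only the division of labour (teams write tools; the platform deploys and scales them):

```python
# Stand-in for the paved-path platform: it only collects tool definitions;
# deployment, scaling, and telemetry wiring happen outside team code.
class Platform:
    def __init__(self):
        self.tools: dict = {}

    def tool(self, name: str, description: str):
        def register(fn):
            self.tools[name] = {"description": description, "handler": fn}
            return fn
        return register

platform = Platform()

# A team's entire contribution: business logic only.
@platform.tool("airflow_dag_status", "Return the latest run state of a DAG")
def airflow_dag_status(dag_id: str) -> str:
    return f"{dag_id}: success"   # real logic would query the Airflow API

print(sorted(platform.tools))     # → ['airflow_dag_status']
```

The design point is that the team file contains no deployment, routing, or observability code at all; the platform discovers registered tools and does the rest.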

  4. The MCP registry is the source of truth — "only servers registered here count as approved for use in production." "The MCP registry is the source of truth for which MCP servers are approved and how to connect to them. It serves two audiences. The web UI lets humans discover servers, the owning team, corresponding support channels, and security posture. The Web UI also shows the MCP server's live status and visible tools. The API lets AI clients (e.g., our internal AI chat platform, AI agents on our internal communications platform, IDE integrations) discover and validate servers, and lets internal services ask 'Is this user allowed to use server X?' before letting an agent call into it." Canonical new concepts/mcp-registry concept with two axes: (a) a discovery + metadata surface for humans, (b) a pre-flight authorization + validation surface for AI clients. Sibling to MCP Server Card (pre-connect per-server static JSON) and API Catalog (RFC 9727) at the organisation-internal-catalog altitude: the registry is the production approval gate — if you're not in the registry, you're not approved. (Source: sources/2026-03-19-pinterest-building-an-mcp-ecosystem-at-pinterest)
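
The registry's two jobs for AI clients — validation and pre-flight authorization — reduce to a small lookup. A minimal sketch, assuming a registry keyed by server name with review-derived group allowlists (all names and the schema are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class RegistryEntry:
    name: str
    endpoint: str
    approved: bool                  # registered + reviewed = approved for production
    allowed_groups: set = field(default_factory=set)  # empty = any authenticated user

class McpRegistry:
    def __init__(self):
        self._entries: dict = {}

    def register(self, entry: RegistryEntry):
        self._entries[entry.name] = entry

    def can_use(self, server: str, user_groups: set) -> bool:
        """Pre-flight check: 'Is this user allowed to use server X?'"""
        entry = self._entries.get(server)
        if entry is None or not entry.approved:
            return False            # not in the registry => not approved
        return not entry.allowed_groups or bool(entry.allowed_groups & user_groups)

registry = McpRegistry()
registry.register(RegistryEntry("presto", "https://mcp.internal/presto", True,
                                {"ads-eng", "finance"}))
registry.register(RegistryEntry("knowledge", "https://mcp.internal/knowledge", True))

print(registry.can_use("presto", {"ads-eng"}))        # → True
print(registry.can_use("presto", {"growth-eng"}))     # → False
print(registry.can_use("unregistered", {"ads-eng"}))  # → False
```

The last case is the governance backbone in miniature: an unregistered server fails the pre-flight check regardless of who asks.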

  5. Two-layer auth — end-user JWTs at Envoy + SPIFFE mesh identity — replaces the MCP OAuth per-server consent flow. "At runtime, almost every MCP call is governed by two layers of auth: end-user JWTs and mesh identities." End-user flow: OAuth to Pinterest's internal auth stack → JWT → client connects to registry + target MCP server with JWT → Envoy validates JWT, maps to X-Forwarded-User + X-Forwarded-Groups + related headers, enforces coarse policies ("AI chat webapp in prod may talk to the Presto MCP server, but not to experimental MCP servers in dev namespaces") → fine-grained @authorize_tool(policy='…') decorator enforcement inside the server. Service-only flow (low-risk / no end user): SPIFFE mesh identity alone, authorization on calling service's mesh identity. Contrast explicitly drawn with the MCP OAuth spec: "users already authenticate against our internal auth stack when they open a surface like the AI chat interface, so we piggyback on that existing session. There is no additional login prompt or consent dialog when a user invokes an MCP tool. Envoy and our policy decorators handle authorization transparently in the background, giving us fine-grained control over who can call which tools without surfacing the complexity of per-server authorization flows to the end user." Canonical new patterns/layered-jwt-plus-mesh-auth — the enterprise-integrated-SSO pattern applied to MCP, structurally different from protocol-native per-server OAuth. (Source: sources/2026-03-19-pinterest-building-an-mcp-ecosystem-at-pinterest)
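
On the server side of this flow, the MCP process never sees raw credentials; it trusts the identity headers Envoy injects after JWT validation. A sketch of that header-parsing step — the header names follow the post, the UserContext shape is an assumption:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UserContext:
    user: str
    groups: frozenset

def context_from_headers(headers: dict) -> UserContext:
    """Build per-request identity from the headers Envoy injects after JWT validation."""
    user = headers.get("x-forwarded-user", "")
    if not user:
        # no header means the request did not come through the mesh ingress
        raise PermissionError("missing X-Forwarded-User")
    raw = headers.get("x-forwarded-groups", "")
    groups = frozenset(g.strip() for g in raw.split(",") if g.strip())
    return UserContext(user=user, groups=groups)

ctx = context_from_headers({"x-forwarded-user": "twang",
                            "x-forwarded-groups": "ads-eng, data-infra"})
print(ctx.user, sorted(ctx.groups))   # → twang ['ads-eng', 'data-infra']
```

This only works because Envoy sits in front of every hosted server — one reason the hosted-over-local choice and the layered-auth choice reinforce each other.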

  6. Per-tool authorization via a @authorize_tool(policy='…') decorator. "Inside the server, tools use a lightweight @authorize_tool(policy='…') decorator to enforce finer-grained rules (for example, only Ads-eng groups can call a get_revenue_metrics, even if the server itself is reachable from other orgs)." Canonical new patterns/per-tool-authorization-decorator pattern — the mesh-level coarse policy (Envoy JWT + header-mapped groups) covers server-reachability; the in-process decorator covers individual-tool authorization. Two-altitude authorization: server (who can connect) + tool (who can call which operation once connected). Sibling to Lambda authorizer at the API-Gateway tier and patterns/jwt-tenant-claim-extraction at the tenant-isolation altitude, but specialised to MCP tool methods. (Source: sources/2026-03-19-pinterest-building-an-mcp-ecosystem-at-pinterest)
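
The post shows only the decorator's call site, not its internals. One plausible shape, assuming policies are named predicates over the caller's group memberships (policy and group names invented here):

```python
import functools

# Named policies as predicates over the caller's groups (illustrative).
POLICIES = {
    "ads_eng_only": lambda groups: "ads-eng" in groups,
}

def authorize_tool(policy: str):
    """Per-tool gate layered over Envoy's coarse server-reachability check."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(caller_groups: set, *args, **kwargs):
            if not POLICIES[policy](caller_groups):
                raise PermissionError(f"policy {policy!r} denied")
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@authorize_tool(policy="ads_eng_only")
def get_revenue_metrics(quarter: str) -> str:
    # the server itself may be reachable from other orgs; this tool is not
    return f"revenue metrics for {quarter}"

print(get_revenue_metrics({"ads-eng"}, "2025-Q1"))   # → revenue metrics for 2025-Q1
```
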

  7. Business-group-based access gating narrows the effective user population on sensitive servers. "Since some MCP servers can execute queries against sensitive internal data systems (like the Presto MCP server), we implemented business-group-based access gating. Rather than granting access to all authenticated Pinterest employees and contractors, some servers will: (1) Extract business group membership from the user's JWT token (2) Validate that the user belongs to an authorized group before accepting the connection (the list of approved groups is set during the initial review stage) (3) Selectively enable capabilities only for users whose roles require data access." "At Pinterest, this means that even though the Presto MCP server is technically reachable from broad surfaces like our LLM web chat interface, only a specific set of approved business groups (for example, Ads, Finance, or specific infra teams) can establish a session and run the higher-privilege tools. Turning on a powerful, data-heavy MCP server in a popular surface therefore doesn't silently expand who can see sensitive data." Canonical new concepts/business-group-authorization-gating concept — solves the "widely-reachable surface × data-sensitive server" blast-radius problem without moving the server off the popular surface. (Source: sources/2026-03-19-pinterest-building-an-mcp-ecosystem-at-pinterest)
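
The three-step gate can be sketched directly from the post's enumeration. JWT claims appear here as an already-verified dict; claim names, group names, and tool names are illustrative:

```python
# The approved-group list is set during the initial review stage (illustrative names).
APPROVED_GROUPS = {"ads", "finance", "data-infra"}

def establish_session(jwt_claims: dict) -> dict:
    # (1) extract business-group membership from the user's (already-verified) JWT
    groups = set(jwt_claims.get("business_groups", []))
    # (2) validate membership before accepting the connection
    if not groups & APPROVED_GROUPS:
        raise PermissionError("not in an approved business group for this server")
    # (3) selectively enable capabilities only for roles that require data access
    tools = ["describe_table"]            # lower-privilege tool
    if groups & {"ads", "finance"}:
        tools.append("run_query")         # higher-privilege tool
    return {"user": jwt_claims["sub"], "tools": tools}

print(establish_session({"sub": "twang", "business_groups": ["finance"]}))
```

Note the gate runs at session establishment, not per tool call — the widely-reachable surface stays reachable, but sensitive capabilities never materialise for unapproved users.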

  8. Pre-connect authorization for tool discovery on some servers gives user-level attribution "for every invocation". "Some servers require a valid JWT even for tool discovery. That gives us user-level attribution for every invocation and a clean way to reason about 'who did what' when we look at logs." Non-obvious architectural move: the default MCP list_tools call is often unauthenticated; Pinterest requires authentication even there for servers with attribution requirements, so there is no "anonymous server discovery" phase for sensitive tools. Complements the MCP Server Card draft standard — that standard enables agent-side pre-connect discovery without an auth round-trip; Pinterest goes the opposite direction for sensitive servers, requiring auth even for discovery. Both are valid — the choice depends on whether you prioritise autonomous-agent ergonomics or user-level audit. (Source: sources/2026-03-19-pinterest-building-an-mcp-ecosystem-at-pinterest)

  9. Human-in-the-loop for mutating / expensive actions — agents propose, humans approve (optionally in batches), elicitation confirms "dangerous actions". "Because MCP servers enable automated actions, the blast radius is larger than if a human manually wielded these tools. Our agent guidance therefore mandates human-in-the-loop before any sensitive or expensive action: agents propose actions using MCP tools, and humans approve or reject (optionally in batches) before execution. We also use elicitation to confirm dangerous actions. In practice, this looks like our AI agents asking for confirmation before applying a change to e.g. overwrite data in a table." Same structural concept as Cloudflare Agent Lee's Durable-Object elicitation gate (2026-04-15), different substrate: Pinterest implements HITL via agent guidance + the MCP elicitation primitive rather than via a proxy Durable Object. Names batch approval as a legitimate HITL-cost-reduction mechanism — humans approve N proposed actions at once rather than N serial gate-crossings. (Source: sources/2026-03-19-pinterest-building-an-mcp-ecosystem-at-pinterest)
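
The propose/approve/batch flow reduces to a small queue. This is a behavioural sketch, not the MCP elicitation primitive itself (which is a protocol-level request/response); the queue and action shapes are assumptions:

```python
class ApprovalQueue:
    """Agents propose; humans approve or reject, optionally in batches."""
    def __init__(self):
        self.pending: list = []
        self.executed: list = []

    def propose(self, action: str, mutating: bool):
        if mutating:
            self.pending.append(action)    # held for human approval
        else:
            self.executed.append(action)   # read-only actions pass through

    def approve_batch(self, approved: bool):
        # one human decision covers N proposed actions, not N serial gates
        if approved:
            self.executed.extend(self.pending)
        self.pending.clear()

q = ApprovalQueue()
q.propose("SELECT count(*) FROM pins", mutating=False)
q.propose("INSERT OVERWRITE TABLE pins ...", mutating=True)  # elicitation confirms this one
q.approve_batch(approved=True)
print(len(q.executed), len(q.pending))   # → 2 0
```
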

  10. Observability is built-in from day one; the ecosystem-level north-star is "time saved". "All MCP servers at Pinterest use a set of library functions that provide logging for inputs/outputs, invocation counts, exception tracing, and other telemetry for impact analysis out of the box." Ecosystem-level metrics: server count, tool count, invocation count, owner-supplied estimated-minutes-saved-per-invocation metadata. Roll-up: time saved as the primary success metric, with 66,000 invocations/month × owner-supplied time-saved estimates ≈ 7,000 hours/month saved as of January 2025. Canonical wiki instance of owner-metadata-driven value attribution for a platform where each server owner quantifies their own tool's value — ops-heavy alternative to after-the-fact invocation-value estimation. (Source: sources/2026-03-19-pinterest-building-an-mcp-ecosystem-at-pinterest)
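
The roll-up arithmetic is simple enough to show. The wrapper API is an assumption; the ~6.4 minutes figure is back-derived here (it is whatever average owner-estimate reproduces the reported order of magnitude from 66,000 invocations):

```python
import functools

METRICS = {"invocations": 0, "exceptions": 0}

def instrumented(fn):
    """Out-of-the-box telemetry: invocation counts and exception tracing."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        METRICS["invocations"] += 1
        try:
            return fn(*args, **kwargs)
        except Exception:
            METRICS["exceptions"] += 1
            raise
    return wrapper

def hours_saved(invocations: int, minutes_per_invocation: float) -> float:
    """Owner-attested, directional estimate x invocation count."""
    return invocations * minutes_per_invocation / 60

@instrumented
def sample_tool() -> str:
    return "ok"

sample_tool()
print(METRICS["invocations"])            # → 1
print(round(hours_saved(66_000, 6.4)))   # → 7040
```
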

  11. The MCP Security Standard is a named internal document that every non-experimental MCP server must satisfy. "We defined a dedicated MCP Security Standard. Every MCP server that is not a one-off experiment must be tied to an owning team, appear in the internal MCP registry, and go through review, yielding Security, Legal/Privacy, and (where applicable) GenAI review tickets that must be approved before production use. This set of reviews determines the security policies that are put in place around the MCP server, such as which user groups to limit access of the server to." The policy artefact — not just the mesh-level enforcement — is what allowlists which user groups can reach which server; policy is authored during review and deployed to Envoy. Canonical wiki instance of security-review-as-per-server-policy-generation, applicable beyond MCP to any platform-level service-deployment review pipeline. (Source: sources/2026-03-19-pinterest-building-an-mcp-ecosystem-at-pinterest)

  12. Three named high-leverage servers seeded the ecosystem: Presto, Spark, Knowledge. Presto MCP is "consistently our highest-traffic MCP server. Presto tools let agents (including AI-enabled IDEs) pull Presto-backed data on demand so agents can bring data directly into their workflows instead of context-switching into dashboards." Spark MCP "underpins our AI Spark debugging experience, used to diagnose Spark job failures, summarize logs, and help record structured root-cause analyses, turning noisy operational threads into reusable knowledge." Knowledge MCP is "a general-purpose knowledge endpoint (used by our internal AI bot for company knowledge and Q&A and other agents to answer documentation and debugging questions across internal sources)." Architectural lesson: the ecosystem started with a small set of high-leverage servers that solved real pain points, then other teams built on top — a canonical seed-then-proliferate shape for platform adoption. (Source: sources/2026-03-19-pinterest-building-an-mcp-ecosystem-at-pinterest)

Architectural numbers

| Datum | Value | Scope |
| --- | --- | --- |
| MCP invocations per month | 66,000 | All MCP servers, as of Jan 2025 |
| Monthly active users | 844 | Employees invoking MCP tools |
| Estimated time saved per month | ~7,000 engineer-hours | Owner-supplied minutes-saved × invocations |
| Named seed MCP servers | Presto, Spark, Knowledge | Highest-leverage initial set |
| Named domain MCP servers (examples) | Presto, Spark, Airflow | Small, coherent tool sets per server |
| Auth layers per MCP call | 2 | End-user JWT + mesh identity (SPIFFE) |
| Auth layer 1 (transport) | JWT validated by Envoy | Coarse-grained: who can reach server |
| Auth layer 2 (fine-grained) | @authorize_tool(policy='…') decorator | Per-tool policy, in-process |
| Service-only flow | SPIFFE mesh identity | Low-risk read-only scenarios |
| Registry surfaces | Web UI (humans) + API (AI clients) | Discovery + pre-flight authorization |
| Review gates before production | Security, Legal/Privacy, GenAI (where applicable) | Per-server mandatory approval |
| HITL approval shape | Per-action or per-batch | Agent guidance mandates for sensitive/expensive actions |
| Elicitation use | Confirm dangerous actions (e.g. table overwrite) | MCP protocol primitive |

Systems introduced

  • systems/pinterest-mcp-registry — Pinterest's central MCP registry, the source of truth for which MCP servers are approved for production. Web UI for human discovery (owning team + support channels + security posture + live status + visible tools) + API for AI clients (discover, validate, pre-flight authorize: "Is this user allowed to use server X?"). Governance backbone — if not in the registry, not approved.
  • systems/pinterest-presto-mcp-server — Pinterest's highest-traffic MCP server. Exposes Presto query tools to agents (including AI-enabled IDEs) so data can flow into agent workflows without dashboard context-switching. Subject to business-group-based access gating (Ads / Finance / specific infra teams only, despite broad surface reachability).
  • systems/pinterest-spark-mcp-server — Spark MCP server underpinning Pinterest's AI Spark debugging experience — diagnosing Spark job failures, summarising logs, recording structured root-cause analyses. Tool visibility is channel-scoped in the internal chat platform — e.g. "Spark MCP tools are only available in Airflow support channels."
  • systems/pinterest-knowledge-mcp-server — general-purpose knowledge-endpoint MCP server used by Pinterest's internal AI bot for company knowledge + Q&A + documentation + debugging questions across internal sources. The "internal search as MCP" primitive — institutional knowledge as a tool agents reach for.

Systems reused / extended

  • systems/model-context-protocol — the protocol Pinterest operationalises at enterprise scale. New Seen-in entry documenting Pinterest's hosted-first, domain-decomposed, registry-gated, layered-auth, observability-instrumented deployment. First wiki datum for the MCP-OAuth-spec-rejected shape — Pinterest piggybacks on existing enterprise SSO rather than per-server OAuth consent screens.
  • systems/envoy — Pinterest's mesh data-plane is where end-user JWTs are validated and mapped to X-Forwarded-User / X-Forwarded-Groups / related headers. Coarse-grained policy enforcement ("AI chat webapp in prod may talk to the Presto MCP server, but not to experimental MCP servers in dev namespaces") runs here. New canonical wiki instance of Envoy-as-auth-enforcement-point for AI-agent traffic.

Concepts extracted

  • concepts/mcp-registry — organisation-internal authoritative catalog of approved MCP servers, with dual human/agent surfaces and pre-flight authorization API.
  • concepts/hosted-vs-local-mcp-server — deliberate architectural-choice axis between stdio-on-laptop and cloud-hosted HTTP/SSE MCP servers; Pinterest's "paved path" statement of hosted-first for production.
  • concepts/business-group-authorization-gating — narrowing the authenticated-user population at session-establishment time by business-group membership claims from the JWT, so that widely-reachable surfaces don't silently expose data-heavy servers.
  • concepts/elicitation-gate (extended) — Pinterest's MCP-primitive-based implementation contrasts with Cloudflare Agent Lee's Durable-Object-based implementation.

Patterns extracted

  • patterns/hosted-mcp-ecosystem — the overall shape: central registry + paved-path deployment + domain-decomposed servers + layered auth + owner-supplied time-saved metadata + human-in-the-loop.
  • patterns/layered-jwt-plus-mesh-auth — two-layer authorization for AI-agent traffic: end-user JWT validated + header-mapped at mesh ingress, mesh identity for service-only flows, optional in-process per-tool decorator.
  • patterns/unified-mcp-deployment-pipeline — platform-engineering investment in a shared deployment pipeline so that authoring an MCP server is business-logic-only.
  • patterns/per-tool-authorization-decorator — @authorize_tool(policy='…') in-process decorator for fine-grained per-tool authorization layered over coarse transport-level auth.
  • patterns/central-proxy-choke-point (extended) — Pinterest's registry + Envoy together form an AI-traffic choke point for MCP calls, structurally similar to Cloudflare's AI Gateway + Databricks' Unity AI Gateway but focused on internal employee traffic.
  • patterns/credentialed-proxy-sandbox (extended sibling) — Agent Lee's Durable-Object-based version is the per-request-code-inspection variant; Pinterest's Envoy + decorator is the per-request-header-policy variant.

Caveats recorded

  • No per-server scale numbers. The 66,000 invocations/month + 844 MAUs + 7,000 hours-saved are ecosystem-aggregate; per-server breakdowns (Presto vs Spark vs Knowledge invocation shares) are not disclosed.
  • No absolute count of registered servers. Named examples (Presto, Spark, Airflow, Knowledge) and category ("high-leverage seed set") are given; the ecosystem-wide server count is absent.
  • Time-saved estimates are owner-attested, not independently measured. "owners provide a directional 'minutes saved per invocation' estimate (based on lightweight user feedback and comparison to the prior manual workflow). Combined with invocation counts, we get an order-of-magnitude view of impact, which we treat as a directional signal of value." Pinterest is transparent about this being directional rather than precise.
  • Deployment-pipeline internals not described. The unified deployment pipeline is named as a design choice with its effect disclosed (authoring is business-logic-only) but internals (templating? Helm? company-internal PaaS on Kubernetes?) are out of scope.
  • Envoy + registry implementation details not fully specified. Envoy is named as the JWT-validation substrate; the registry-to-Envoy policy-delivery path is not described — is it config-sync? xDS? admin API?
  • @authorize_tool implementation is abstract. Pseudocode-level disclosure; language / SDK / policy-engine integration not specified (likely Python given Pinterest's ML/data stack but not stated).
  • MCP server transport not explicitly named. Pinterest names HTTP-based hosted servers as the paved path but doesn't say Streamable HTTP vs HTTP+SSE vs Pinterest's own transport. Given the 2025 date and the MCP spec trajectory, Streamable HTTP is the likely default.
  • Business-group claim origin not disclosed. The JWT custom claim for business-group membership presumably comes from Pinterest's identity provider, but the provenance + refresh semantics aren't described.
  • No incident retrospectives. This is a first-year retrospective voice, not a post-incident post — no production-incident examples of the auth system detecting / preventing / missing an attack.
  • Experiment / local MCP governance not specified. The post carves out experimental-local servers from the registry-must-be-listed requirement; what prevents production traffic from reaching a "one-off experiment" local server is not explicitly stated.
  • Future-work axis not enumerated. Closing paragraph gestures at "continue to expand the fleet of MCP servers, deepen integrations across more engineering surfaces, and refine our governance models" without specific roadmap items.

Source

sources/2026-03-19-pinterest-building-an-mcp-ecosystem-at-pinterest