Pinterest

Pinterest Engineering is a Tier-2 source on the sysdesign-wiki. Pinterest operates a visual-discovery + recommendation platform at hyperscale; the engineering blog's most substantive architectural posts cluster around storage (HBase → TiDB / KV stores / Goku), data-processing platforms (Moka on Yunikorn), quota + governance infrastructure (Piqama), graph services (Zen), indexed datastores (Ixia), ads ranking / ML serving (unified multi-surface engagement models, L1 CVR online-offline debugging, MMoE + long-sequence Transformers), Home Feed multi-objective optimization / diversification (DPP → SSD → unified soft-spacing with PinCLIP + Semantic ID signals, hosted on PyTorch on the company-wide model serving cluster), a production Text-to-SQL / Analytics Agent on top of the PinCat governance catalog + unified context-intent embeddings over query history + an internal Vector DB service on OpenSearch, and ML platform. Historical operator of one of the largest HBase deployments in the world before the 2021 deprecation decision.

Key systems

URL normalisation — MIQPS + long-tail parameter learning (2026-04-20 MIQPS post)

  • systems/pinterest-miqps — Minimal Important Query Param Set. Per-domain, per-query-parameter-pattern algorithm + offline job + published config artefact classifying each URL parameter as neutral (safe to strip) or non-neutral (preserve). Uses a visual-content-ID removal test: sample up to S URLs with distinct values for the parameter, render with + without the parameter, classify non-neutral if content IDs differ in ≥T% of samples. Early-exit optimisation stops testing once non-neutral is clear; conservative default flags under-sampled parameters as non-neutral. Anomaly-gated publish with asymmetric rules (non-neutral → neutral flips are anomalies; new non-neutral entries + disappearing patterns are not) protects against degenerate recomputes. Three-phase deployment: continuous ingest → offline compute → runtime lookup. Canonical wiki instance of patterns/per-domain-adaptive-config-learning + patterns/visual-fingerprint-based-parameter-classification + patterns/conservative-anomaly-gated-config-update + patterns/offline-compute-online-lookup-config.
  • systems/pinterest-url-normalizer — the runtime URL-processing component that loads the MIQPS map at init and does in-memory lookups per URL. Stacks four independent normalisation layers with OR semantics on keep-decisions: static platform allowlists (Shopify variants, Salesforce Commerce Cloud start / sz / prefn1 / prefv1) + regex patterns + MIQPS non-neutral set + conservative default. Parameter kept if any layer preserves it; stripped only if all layers agree. Canonical wiki instance of patterns/multi-layer-normalization-strategy.
  • systems/pinterest-content-ingestion-pipeline — Pinterest's content-acquisition pipeline from merchant domains. Dual role in MIQPS: upstream producer of per-domain URL corpus (writes each observed URL to S3 continuously as a side effect of normal processing) + downstream consumer of normalised URLs (runs fetch + render + process once per canonical URL rather than per URL variant). Framing: "rendering the same page dozens of times simply because its URLs differ in irrelevant parameters" is the cost driver URL normalisation eliminates. Canonical wiki URL-normalisation use case.

MCP ecosystem — hosted MCP servers + central registry + layered JWT/mesh auth (2026-03-19 MCP ecosystem post)

  • systems/pinterest-mcp-registry — Pinterest's central MCP registry, the source of truth for which MCP servers are approved for production. Dual-surface: Web UI for humans (owning team + support channels + security posture + live status + visible tools), API for AI clients (discover + validate + pre-flight authorize). "Only servers registered here count as approved for use in production."
  • systems/pinterest-presto-mcp-server — Pinterest's highest-traffic MCP server. Exposes Presto query tools to agents so data flows into agent workflows without dashboard context-switching. Subject to business-group-based access gating — Ads / Finance / specific infra teams only, despite broad surface reachability.
  • systems/pinterest-spark-mcp-server — underpins Pinterest's AI Spark debugging experience: diagnose Spark job failures, summarise logs, record structured RCAs. Channel-scoped tool visibility — "Spark MCP tools are only available in Airflow support channels."
  • systems/pinterest-knowledge-mcp-server — general-purpose knowledge endpoint used by Pinterest's internal AI bot for company Q&A + documentation + debugging across internal sources. Sibling shape to Dropbox's Dash MCP — unified-retrieval-tool over institutional knowledge.
  • systems/model-context-protocol — the protocol Pinterest operationalises at enterprise scale. First canonical wiki instance of the enterprise-SSO piggyback shape — Pinterest explicitly rejects the MCP OAuth spec's per-server consent flow for internal traffic.
  • systems/envoy — mesh data-plane + JWT validation + identity-header mapping point for all MCP traffic. "Envoy validates the JWT, maps it to X-Forwarded-User, X-Forwarded-Groups, and related headers, and enforces coarse-grained security policies." First wiki datum of Envoy-as-AI-agent-auth-enforcement.

Client-side performance measurement — Android BaseSurface + PerfView interfaces (2026-04-08 Performance for Everyone post)

  • systems/pinterest-base-surface — Pinterest Android's base UI class every feature screen inherits from. Since 2026 the substrate for automatic Visually Complete measurement: walks the Android view tree from the root, inspects opt-in Perf* marker interfaces, emits a User Perceived Latency timestamp when all visible content-critical views report ready. Canonical wiki instance of patterns/base-class-automatic-instrumentation. 60+ Android surfaces continuously measured with zero per-surface instrumentation cost (down from the pre-platform two engineer-weeks per surface Pinterest disclosed).
  • systems/pinterest-perf-view — three opt-in marker interfaces (PerfImageView, PerfTextView, PerfVideoView) that product engineers tag content-critical views with; expose isDrawn() / isVideoLoadStarted() plus geometry methods (x() / y() / width() / height()) so the BaseSurface view-tree walk can filter to visible views and conjoin readiness. Canonical wiki instance of patterns/opt-in-performance-interface.
  • systems/pinterest-android-app — Pinterest's native Android client; deployment target of the 2026 Visually Complete system. Named surfaces in the post: Home Feed, Search Result Feed, Video Pin Closeup, Search Auto Complete.

Home Feed multi-objective optimization (2026-04-07 MOO-evolution post)

  • systems/pinterest-home-feed-blender — Pinterest's Home Feed multi-objective optimization / blending layer. Three generations: V1 (2021, DPP in a backend node chain) → V2 (early 2025, SSD in PyTorch on company-wide model serving cluster) → V2+ (mid/late 2025, unified soft-spacing framework composed into SSD's utility equation for content-quality penalties). Canonical production instance of multi-objective reranking, SSD-over-DPP migration, backend-to-model-server infrastructure migration, and config-based soft-spacing framework.
  • systems/pinterest-home-feed — the product surface; cascaded-funnel framing now explicit (retrieval → pre-ranking → ranking → multi-objective optimization).
  • systems/pytorch — serving substrate for SSD + soft-spacing on Pinterest's model serving cluster. Canonical wiki instance of non-ML algorithmic logic riding a general model-serving substrate.
  • systems/pinclip — Pinterest's multimodal (image-text-aligned, graph-aware) foundational visual embedding; Q3 2025 replacement for prior visual signal in SSD's pairwise similarity. Near-real-time availability for recently-ingested Pins.
  • systems/graphsage — inductive graph-embedding method used for Pin similarity in both DPP (2021) and SSD (2025) eras.

Analytics Agent + PinCat + Vector DB (2026-03-06 Text-to-SQL post)

  • systems/pinterest-analytics-agent — the #1 agent at Pinterest (10× the next most-used agent, 40% analyst-population coverage in two months, target 50% by year-end). Four-layer architecture: Agent Orchestration (LLM with Pinterest-specific prompts) + MCP Integration (table search + query search + knowledge search + Presto execution) + Context (PinCat schemas + vector indexes + expert docs + query logs) + Execution (Presto with EXPLAIN-before-EXECUTE + bounded retry + default LIMIT 100). Design principles: asset-first, governance-aware ranking, schema-grounded SQL validation, conflict-resolution hierarchy (docs > schema > query patterns > general knowledge).

  • systems/pinterest-pincat — Pinterest's internal data catalog on DataHub. System of record for table tier tags (Tier 1 / 2 / 3), owners, retention, and column-level glossary terms. The load-bearing substrate the Analytics Agent grounds every SQL query against.
  • systems/pinterest-vector-db-service — internal Vector Database as a Service on AWS OpenSearch + Hive (source of truth) + Airflow (index creation + ingestion DAGs). JSON-schema config → production vector index in days. Millions of embeddings with daily incremental updates; hybrid semantic-plus-metadata filtering. Canonical wiki instance of patterns/internal-vector-db-as-service.
  • systems/pinterest-ai-table-documentation — AI-generated table + column descriptions from lineage + existing docs + glossary terms + representative QueryBook queries. Tier-1 human-in-the-loop; Tier-2 LLM-drafts-human-reviews. Paired with join-based glossary term propagation (auto-tagged >40% of columns) + search-based propagation. ~70% total manual-documentation-work reduction.
  • systems/pinterest-querybook — Pinterest's open-source collaborative SQL editor; origin of the query history indexed by the Analytics Agent.
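The Execution layer's guardrails (EXPLAIN-before-EXECUTE validation, bounded retry, default LIMIT 100) can be sketched as a thin wrapper around a query engine. Everything below is an assumed shape, not Pinterest's actual interface — the `explain` / `execute` / `repair` callbacks stand in for the Presto MCP tool and the LLM's rewrite-on-error step:

```python
from typing import Any, Callable, Tuple

def guarded_execute(
    sql: str,
    explain: Callable[[str], Tuple[bool, str]],  # EXPLAIN dry-run: (ok, error message)
    execute: Callable[[str], Any],               # actually run the query
    repair: Callable[[str, str], str],           # e.g. the LLM rewriting SQL on error
    max_retries: int = 2,
    default_limit: int = 100,
) -> Any:
    """Sketch of the Execution-layer guardrails described in the post."""
    # Default LIMIT so exploratory questions never return unbounded result sets.
    if " limit " not in f" {sql.lower()} ":
        sql = f"{sql.rstrip().rstrip(';')} LIMIT {default_limit}"
    last_err = ""
    for _ in range(max_retries + 1):
        ok, last_err = explain(sql)   # schema-grounded validation before any execution
        if ok:
            return execute(sql)
        sql = repair(sql, last_err)   # bounded retry: at most max_retries repairs
    raise RuntimeError(f"query failed validation after retries: {last_err}")
```

The key property is ordering: no query reaches `execute` until an EXPLAIN pass succeeds, and the repair loop is bounded rather than open-ended.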

Ads engagement modeling (2026-03-03 unified model post)

Quota management platform (2026-02-24 Piqama post)

  • systems/pinterest-piqama — generic quota management ecosystem. REST + Thrift control plane; pluggable schema / validation / dispatch / enforcement hooks; one platform serves both capacity quotas (Moka) and rate-limit quotas (TiDB, KV Stores); feedback loop via Iceberg + Presto + auto-rightsizing service; budget integration with tier-weighted haircut on exceedance.
  • systems/pinterest-moka — next-gen Big Data Processing Platform on Apache Yunikorn; canonical Piqama capacity-quota integration; per-project Yunikorn queue fed by Piqama quota values via a Yunikorn Config Updater.
  • systems/apache-yunikorn — open-source resource scheduler underneath Moka.
  • systems/pinterest-pinconf — Pinterest's config distribution platform; canonical substrate for feature flags + dynamic service config + Piqama rate-limit-rule delivery.
  • systems/pinterest-spf — Service-Protection Framework; in-process rate-limit + throttling + concurrency-control library that makes local data-path decisions.
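The "tier-weighted haircut on exceedance" could work along the following lines — the post does not spell out the mechanics, so the weighting scheme and proportional split here are entirely assumptions for illustration:

```python
from typing import Dict, Tuple

def tier_weighted_haircut(
    requests: Dict[str, Tuple[float, int]],  # project -> (requested quota, tier)
    budget: float,
    tier_weight: Dict[int, float],           # assumed: higher weight = larger cut
) -> Dict[str, float]:
    """Sketch: when total requested quota exceeds the budget, shrink each
    project's allocation, with lower-tier projects absorbing proportionally
    more of the cut via their tier weight."""
    total = sum(q for q, _ in requests.values())
    if total <= budget:
        return {name: q for name, (q, _) in requests.items()}
    excess = total - budget
    weighted = {name: q * tier_weight[tier] for name, (q, tier) in requests.items()}
    scale = sum(weighted.values())
    # Each project gives up a share of the excess proportional to its weight.
    return {name: q - excess * weighted[name] / scale for name, (q, _) in requests.items()}
```

By construction the post-haircut allocations sum exactly to the budget, and a Tier-3 project with a 3× weight loses three weighted shares for every one a Tier-1 project loses.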

Storage substrate (2024-05-14 HBase deprecation post)

  • systems/hbase — Pinterest's default NoSQL store 2013-2021; peak 50 clusters / 9,000 EC2 instances / >6 PB data; deprecated for 5 named reasons (maintenance cost, missing functionality, system complexity, infra cost, waning community).
  • systems/tidb — Pinterest's chosen post-HBase NewSQL replacement for general-NoSQL workloads requiring transactions + rich query + secondary index. Also a named Piqama rate-limit integration target (2026-02-24 Piqama post).
  • systems/pinterest-kvstore — Pinterest's in-house KV store on systems/rocksdb + systems/rocksplicator; replaced HBase for KV workloads. Named Piqama rate-limit integration target.
  • systems/pinterest-zen — Pinterest's graph service (was on HBase; migration target).
  • systems/pinterest-ixia — Pinterest's indexed datastore built on HBase + Manas realtime.
  • systems/pinterest-goku — Pinterest's in-house time-series datastore; replaced HBase for time-series workloads.
  • systems/pinterest-ums — Pinterest's in-house wide-column store.

Key patterns / concepts

URL normalisation + content deduplication (2026-04-20 MIQPS post)

  • patterns/per-domain-adaptive-config-learning — hybrid head-curated + long-tail-learned configuration. Static rules for well-known platforms (Shopify, Salesforce Commerce Cloud) + empirical learning (MIQPS) for the long tail of merchant domains. Same structural shape as patterns/head-cache-plus-tail-finetuned-model (Instacart ML layer) applied at the config / rules layer.
  • patterns/visual-fingerprint-based-parameter-classification — empirical removal-test using a content-ID fingerprint as ground truth. For each parameter, sample URLs, render with/without, compare fingerprints, classify by mismatch rate. Pattern generalises to any "is this component material?" question where content fingerprinting is cheaper than understanding-the-meaning.
  • patterns/multi-layer-normalization-strategy — combine independent classifiers (static allowlist + regex + MIQPS + conservative default) with OR semantics on keep-decisions. Bias toward the tolerable failure mode: keeping a neutral parameter wastes a render (tolerable); stripping a non-neutral parameter silently merges distinct items (catastrophic). Every layer acts as an independent safety net.
  • patterns/conservative-anomaly-gated-config-update — publish-time safety gate. Compare new config against previous, count entries that changed in the "dangerous" direction only, reject update if above threshold. Asymmetric rules embody asymmetric costs. Canonical MIQPS instance: non-neutral → neutral flips are anomalies; new non-neutral entries + pattern-disappearances are fine.
  • patterns/offline-compute-online-lookup-config — three-phase architecture: continuous-ingest to durable corpus + offline batch compute → anomaly-gate → publish artefact → runtime in-memory lookup. Scaling-denominator shift: offline analysis scales with domain count while real-time would scale with URL count (orders of magnitude more expensive for Pinterest's hundreds-of-thousands-of-domains + billions-of-URLs regime).
  • concepts/url-normalization — collapsing URL variants into one canonical form before expensive downstream work (fetch / render / process / content-dedupe). Upstream of content-identity dedup, which catches duplicates only after paying render cost.
  • concepts/query-parameter-pattern — sorted set of parameter names in a URL. The grouping key for per-pattern classification, because the same parameter name can play different roles on different page types on the same domain (canonical Pinterest example: ref is neutral on a product page but non-neutral on a comparison page).
  • concepts/neutral-vs-non-neutral-parameter — the binary per-(domain, pattern, parameter) classification MIQPS assigns. Neutral = safe to strip; non-neutral = must preserve.
  • concepts/content-id-fingerprint — same-content → same-ID function over rendered pages. Pinterest uses a visual-representation hash; the algorithm is agnostic (DOM tree hashing, response body checksum, <title> + Open Graph metadata also valid).
  • concepts/canonical-url-unreliability — why <link rel="canonical"> isn't enough across the long tail of merchant domains. Three failure modes: omitted entirely / set incorrectly (homepage canonical default) / contaminated with tracking params. Canonical wiki framing of "trust the data, not the declaration" applied to URL canonicality.
  • concepts/anomaly-gated-config-update — concept framing for the discipline of comparing newly-computed config against previously-published config before allowing the update, with asymmetric rules that encode the underlying cost asymmetry.
  • concepts/offline-compute-online-lookup — concept framing for the architectural split between expensive offline analysis and cheap runtime lookup. Acceptable when the underlying phenomenon changes slowly (Pinterest: URL parameter conventions change on the order of weeks to months).

Hosted MCP ecosystem + layered auth (2026-03-19 MCP ecosystem post)

  • patterns/hosted-mcp-ecosystem — the umbrella pattern: central registry + paved-path deployment + domain-decomposed servers + layered auth + owner-supplied time-saved metadata + human-in-the-loop. Pinterest's 66,000-invocations/month / 844-MAU / ~7,000-engineer-hours-saved-per-month ecosystem.
  • patterns/layered-jwt-plus-mesh-auth — two-layer authorization for AI-agent traffic: end-user JWT validated + header-mapped at Envoy, mesh identity for service-only flows, optional in-process per-tool decorator. Canonical wiki instance of enterprise-SSO-piggyback for MCP.
  • patterns/unified-mcp-deployment-pipeline — platform-engineering investment in a shared deployment pipeline so authoring an MCP server is business-logic-only. Collapses the per-server infrastructure boilerplate (deployment, scaling, auth wiring, observability, registry listing) into platform-handled concerns.
  • patterns/per-tool-authorization-decorator — @authorize_tool(policy='…') in-process decorator for fine-grained per-tool authorization, layered over coarse transport-level auth. Pinterest example: get_revenue_metrics callable only by Ads-eng groups even when the server is broadly reachable.
  • patterns/central-proxy-choke-point (extended) — Pinterest's Registry + Envoy together form an enterprise-internal MCP choke point. Differs from Cloudflare/Databricks external-LLM-API choke-points on two axes: internal-employee traffic (no upstream-credential-injection job) + policy-surface-via-registry-backed-review-outcomes (not a dashboard flag).
  • concepts/mcp-registry — organisation-internal authoritative catalog of approved MCP servers with dual human/agent surfaces and pre-flight authorization API. Sibling to MCP Server Card (per-server public) + concepts/api-catalog (HTTP-API public).
  • concepts/hosted-vs-local-mcp-server — the deliberate architectural-choice axis; Pinterest's "paved path" statement of hosted-first for production, local for experimentation.
  • concepts/business-group-authorization-gating — narrow the authenticated-user population at session-establishment time by business-group membership claims. Solves the wide-surface × data-heavy-server blast-radius problem without moving the server off the popular surface.
  • concepts/elicitation-gate (extended) — Pinterest's MCP-primitive + agent-guidance implementation contrasts with Cloudflare Agent Lee's Durable-Object-proxy implementation. Introduces batch approval as a legitimate HITL-cost-reduction mechanism.
  • systems/envoy (extended) — first wiki datum of Envoy as JWT validation + identity-header mapping for AI-agent traffic; coarse-grained policy enforcement for server-reachability.
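The per-tool decorator shape is easy to sketch. Pinterest's real @authorize_tool(policy='…') resolves a named policy internally; the explicit group set and the `caller_groups` keyword below (imagined as derived from Envoy's X-Forwarded-Groups header) are stand-ins for illustration:

```python
import functools

def authorize_tool(allowed_groups: set):
    """Sketch of an in-process per-tool gate layered over coarse mesh/JWT auth.
    Assumes the caller's group memberships arrive via a `caller_groups` kwarg."""
    def decorator(tool):
        @functools.wraps(tool)
        def wrapper(*args, caller_groups=frozenset(), **kwargs):
            if not allowed_groups & set(caller_groups):
                raise PermissionError(
                    f"{tool.__name__}: caller not in {sorted(allowed_groups)}"
                )
            return tool(*args, **kwargs)
        return wrapper
    return decorator

@authorize_tool({"ads-eng", "finance-eng"})
def get_revenue_metrics(campaign_id: str) -> dict:
    # Reachable over the shared server surface, callable only by gated groups.
    return {"campaign": campaign_id, "revenue": 0.0}
```

The point of the pattern is the layering: the transport layer decides whether the server is reachable at all, while the decorator decides per tool, so one popular server can host tools with very different blast radii.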

Client-side performance platform (2026-04-08 Performance for Everyone post)

  • patterns/base-class-automatic-instrumentation — build measurement logic into the UI-framework base class every screen inherits from. Canonical wiki instance is Pinterest's Android BaseSurface. Collapses two-engineer-weeks-per-surface instrumentation cost to zero-per-surface at platform scale.
  • patterns/view-tree-walk-for-readiness-detection — iterate the UI element tree from the root, filter to visible views via geometry, conjoin per-view readiness through a uniform interface. How Pinterest's BaseSurface decides per-surface Visually Complete without per-surface code.
  • patterns/opt-in-performance-interface — product engineers tag content-critical views by implementing a small marker-plus-readiness interface (PerfImageView / PerfTextView / PerfVideoView); platform walks the tree and consumes. Solves the auto-detect-false-positives problem.
  • concepts/user-perceived-latency — time from user action until the user sees the content; Pinterest's product-level latency contract ("performance is the default feature"). Operationalised as concepts/visually-complete.
  • concepts/visually-complete — the per-surface operational predicate for User Perceived Latency. Canonical worked examples from the post: Video Pin Closeup = "full-screen video starts playing"; Home Feed = "all images rendered and videos playing"; Search Auto Complete = "search suggestions' text rendered along with avatar images."
  • concepts/client-side-performance-instrumentation — the broader category of in-app measurement; distinct from server-side observability because layout + decoding + buffering + hydration happen after the last server response.
  • concepts/instrumentation-engineering-cost — Pinterest's two-engineer-weeks-per-surface datum; the forcing function for the platform investment.
  • concepts/view-tree-traversal — walking a hierarchical UI element tree to compute a derived predicate; the substrate Pinterest's base class operates on.
  • concepts/base-class-instrumentation — inheritance-based realisation of cross-cutting instrumentation; collapses N-component cost to one-platform cost.
  • concepts/opt-in-marker-interface — interface implementation as opt-in declaration for cross-cutting framework behaviour; the language-level mechanism underneath PerfView.
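The walk itself is small: filter the tree to visible marker views, then conjoin readiness. An illustrative Python analogue of the Android logic — class and method names approximate the post's Perf* interfaces, and the visibility test is a simplified geometry check:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class View:
    x: int = 0
    y: int = 0
    width: int = 0
    height: int = 0
    children: List["View"] = field(default_factory=list)

@dataclass
class PerfImageView(View):
    # Stand-in for the opt-in marker interface a product engineer implements.
    drawn: bool = False
    def is_drawn(self) -> bool:
        return self.drawn

def visually_complete(root: View, viewport_w: int, viewport_h: int) -> bool:
    """Walk the view tree from the root, keep only visible Perf* views,
    and conjoin readiness; True means emit the User Perceived Latency timestamp."""
    def visible(v: View) -> bool:
        return v.width > 0 and v.height > 0 and v.x < viewport_w and v.y < viewport_h
    stack, found, ready = [root], False, True
    while stack:
        v = stack.pop()
        stack.extend(v.children)
        if isinstance(v, PerfImageView) and visible(v):
            found = True
            ready = ready and v.is_drawn()
    return found and ready
```

Because the walk lives in the base class and only consumes the marker interface, a new surface gets measurement the moment its engineers tag content-critical views — the zero-per-surface-cost property the post emphasises.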

Home Feed multi-objective optimization (2026-04-07 MOO-evolution post)

  • patterns/multi-objective-reranking-layer — dedicated final funnel stage for slate composition; canonical wiki instance is Pinterest Home Feed Blender.
  • patterns/ssd-over-dpp-diversification — algorithm-migration pattern: swap DPP for SSD to gain PyTorch-native implementation + lower serving latency + signal-expansion capacity.
  • patterns/blending-logic-to-model-server — infrastructure-migration pattern: move feed-blending heuristics from backend service code to PyTorch-hosted components on company-wide model serving cluster for iteration velocity, local testability, and unified feature plumbing.
  • patterns/config-based-soft-spacing-framework — declarative configuration of sensitive-content classes and soft-spacing penalties; abstract single-class implementation into a platform as quality-axes grow.
  • patterns/multi-signal-pairwise-similarity — compose visual + text + graph + Semantic-ID signals into one similarity substrate; signal-expansion from (GraphSage + taxonomy) to (PinCLIP + text + GraphSage + Semantic ID) across 2021-Q4-2025.
  • concepts/feed-diversification — slate-level reranking for topic/style variety; a long-term engagement lever. Canonical Pinterest ablation datum: >2% time-spent-impression drop week 1.
  • concepts/determinantal-point-process — DPP algorithm parametrized over relevance diagonal + similarity off-diagonal. Pinterest Home Feed V1 (2021-2024).
  • concepts/sliding-spectrum-decomposition — position-adaptive windowed spectral decomposition. Pinterest Home Feed V2 (2025 →).
  • concepts/soft-spacing-penalty — distance-weighted penalty on clustered sensitive-class content; graceful alternative to hard filtering.
  • concepts/semantic-id — hierarchical discrete content representation via coarse-to-fine quantisation; prefix-overlap penalty for stable category-like anti-clustering (Q4 2025).
  • concepts/feed-level-reranking — canonical stage-level framing of slate-composition reranking distinct from pointwise ranking. Extends retrieval → ranking funnel with a third production stage.
  • concepts/position-adaptive-diversification — diversification whose decisions condition on already-placed items (vs slate-global). SSD is the canonical production instance.
  • concepts/short-term-vs-long-term-engagement — canonical trade-off surfaced by the DPP-ablation study (day-1 engagement gain → week-2 negative retention).
  • concepts/quality-penalty-signal — classifier output flagging elevated-risk content for soft-penalty treatment (not hard filter). Consumed by concepts/soft-spacing-penalty.
  • concepts/exposure-bias-ml — closed-loop feedback mechanism behind the short-term-vs-long-term divergence: less-diverse content creates less-diverse engagement signals, training subsequent rankers on biased distributions, collapsing variety further. Extended in this post with the chronic-equilibrium variant (vs the acute A/B-window variant from the L1 CVR post).
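One plausible form of the distance-weighted soft-spacing penalty composed into the utility — the post describes the idea (penalize clustered sensitive-class content, decay with distance) but not the equation, so the geometric decay and per-class weights below are assumptions:

```python
from typing import Dict, List, Set

def soft_spacing_penalty(
    slate_classes: List[Set[str]],   # sensitive classes of already-placed items
    candidate_classes: Set[str],     # sensitive classes of the candidate
    position: int,                   # slot the candidate would occupy
    weights: Dict[str, float],       # per-class penalty weight (config-driven)
    decay: float = 0.5,              # assumed geometric decay with distance
) -> float:
    """For each sensitive class the candidate carries, accumulate a penalty
    that decays with distance to each already-placed item of the same class."""
    penalty = 0.0
    for cls in candidate_classes:
        for prior_pos, prior_classes in enumerate(slate_classes):
            if cls in prior_classes:
                penalty += weights.get(cls, 0.0) * decay ** (position - prior_pos - 1)
    return penalty

def utility(relevance: float, diversity_term: float, penalty: float) -> float:
    # Composed into the diversification utility: relevance + diversity - soft penalty.
    return relevance + diversity_term - penalty
```

Adjacent same-class items pay the full weight, items a few slots apart pay a decayed fraction, and unflagged content pays nothing — a graceful alternative to hard filtering, with new quality axes added by extending the `weights` config rather than writing new blending code.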

Text-to-SQL / Analytics Agent (2026-03-06 unified context-intent embeddings post)

Ads ranking / model unification (2026-03-03 post)

Quota management (2026-02-24 Piqama post)

Storage deprecation (2024-05-14 HBase post)

Recent articles

  • 2026-04-20 — Smarter URL Normalization at Scale: How MIQPS Powers Content Deduplication. Pinterest Content Acquisition and Media Platform (Shanhai Liao, Di Ruan, Evan Li) introduce MIQPS (Minimal Important Query Param Set) — a per-domain, per-query-parameter-pattern algorithm that learns which URL parameters affect content identity via a visual-content-ID removal test: sample up to S URLs with distinct parameter values, render the page with and without the parameter, classify non-neutral if content IDs differ in ≥T% of samples. Motivation: Pinterest's content ingestion pipeline otherwise wastes render capacity on URL variants that resolve to the same content (tracking params, session tokens); content-identity dedup catches them downstream but only after the render cost. Static allowlists cover known platforms (Shopify variants, Salesforce Commerce Cloud start / sz / prefn1 / prefv1) but "URL parameter conventions vary wildly" across the long tail. Per-pattern (not per-parameter-name-global) keying is load-bearing — canonical example: ref is neutral on a product page URL but non-neutral on a comparison page URL. <link rel="canonical"> is unreliable across the long tail (omitted / misconfigured / contaminated). Three-phase architecture (patterns/offline-compute-online-lookup-config): continuous ingest writes per-domain URL corpus to S3 → offline job runs MIQPS + anomaly detection against previous version (asymmetric rules: non-neutral → neutral flips are anomalies; new non-neutral entries + pattern-disappearances are fine; reject publish if >A% of entries flip dangerous direction) → publish to config store + archive to S3 → runtime URL Normalizer loads map at init and does in-memory lookup using OR semantics across four layers (static allowlist + regex + MIQPS + conservative default — parameter kept if any layer votes keep). Offline-over-realtime rationale: render cost is seconds-per-page; realtime analysis scales with URL count (billions) while offline scales with domain count (hundreds of thousands); transient rendering failures are retryable offline but would block content processing in realtime; URL conventions change on weeks-to-months cadence, making staleness acceptable. Asymmetric-cost reasoning threads every design choice: dropping a non-neutral parameter silently merges distinct items (catastrophic); keeping a neutral parameter wastes a render (tolerable). Early-exit stops testing once non-neutral is clear; conservative-default marks under-sampled parameters non-neutral; anomaly-detection rules treat only the dangerous flip direction as anomalous; multi-layer keeps if any layer preserves. No numerical wins disclosed (no dedup ratio, no compute savings, no latency delta); all five tunables K/S/T/N/A left abstract. First canonical URL-normalisation / content-deduplication post on the wiki — new canonical wiki instances of patterns/per-domain-adaptive-config-learning, patterns/visual-fingerprint-based-parameter-classification, patterns/multi-layer-normalization-strategy, patterns/conservative-anomaly-gated-config-update, patterns/offline-compute-online-lookup-config + concepts/url-normalization, concepts/content-id-fingerprint, concepts/query-parameter-pattern, concepts/neutral-vs-non-neutral-parameter, concepts/canonical-url-unreliability, concepts/anomaly-gated-config-update, concepts/offline-compute-online-lookup. Complements request-level-deduplication post on a different axis of the deduplication umbrella — that post is recsys-serving-compute dedup, this post is content-ingestion-compute dedup.
  • 2026-03-19 — Building an MCP Ecosystem at Pinterest. Pinterest Agent Foundations (Tan Wang) publishes a one-year retrospective on Pinterest's MCP ecosystem: 66,000 invocations/month across 844 MAUs, saving an estimated ~7,000 engineer-hours/month as of January 2025. Six opinionated architectural choices: (1) hosted over local — paved path is cloud-deployed, not stdio-on-laptop, so central routing + security apply; (2) many small domain-specific servers (Presto, Spark, Airflow, Knowledge) over one monolith — per-server access control + context-window hygiene; (3) unified deployment pipeline so authoring is business-logic-only; (4) central MCP registry as source of truth for approved-for-production servers (dual Web UI + AI-client-API surfaces, pre-flight authorization); (5) layered JWT + SPIFFE mesh auth — Envoy validates JWT + maps to X-Forwarded-User / X-Forwarded-Groups headers for coarse-grained policy, in-process @authorize_tool(policy='…') decorator (patterns/per-tool-authorization-decorator) for fine-grained per-tool policy, SPIFFE mesh identity for service-only flows; (6) elicitation-gated HITL via MCP's elicitation primitive for mutating/expensive actions, with batch approval as HITL-cost-reduction. Three seed servers named: Presto MCP (highest-traffic, business-group gated to Ads/Finance/infra), Spark MCP (AI Spark debugging, channel-scoped to Airflow support channels), Knowledge MCP (general Q&A substrate). Three integration surfaces: internal LLM web chat, AI bots on internal chat platform (per-channel tool visibility), IDE plugins. Explicit rejection of the MCP OAuth spec's per-server consent flow for internal traffic — "users already authenticate against our internal auth stack when they open a surface like the AI chat interface, so we piggyback on that existing session." Canonical wiki instance of patterns/hosted-mcp-ecosystem + first canonical enterprise-SSO piggyback MCP shape.
  • 2026-04-08 — Performance for Everyone. Pinterest Android Performance Engineering (Lin Wang) retrospective on retrofitting automatic User Perceived Latency measurement across every Android surface by building Visually Complete detection into the UI base class (BaseSurface). Three opt-in marker interfaces — PerfImageView / PerfTextView / PerfVideoView — let product engineers tag content-critical views; the base class walks the view tree from the root, filters to visible Perf* instances via geometry, conjoins per-view readiness, and emits a timestamp automatically. Canonical data points: two engineer-weeks per surface hand-rolled cost → 60+ Android surfaces continuously measured with zero per-surface work; "all surfaces measured by the same standard" means fair cross-surface comparison for the first time; short-shelf-life surfaces (Christmas landing pages) previously excluded are now automatically covered. The pattern generalises: "following the success on Android, we have also extended the same concept to iOS and web platforms." Thesis: "Once the performance metrics are offered to product engineers for free, it makes Pinterest's performance more visible and encourages everyone to protect and optimize the User Perceived Latency on their surfaces." Canonical wiki instances of patterns/base-class-automatic-instrumentation, patterns/view-tree-walk-for-readiness-detection, patterns/opt-in-performance-interface. First client-side performance-platform post on the Pinterest wiki axis — complements existing server-side observability / quota / ML ranking axes with the "measurement platform" slice.
  • 2026-04-07 — Evolution of Multi-Objective Optimization at Pinterest Home Feed. Pinterest Homefeed + Content Quality teams retrospective on three generations of Home Feed's multi-objective optimization (MOO) / blending layer — the final funnel stage after retrieval / pre-ranking / ranking that determines feed composition rather than per-candidate engagement. V1 (2021) used DPP with GraphSage + categorical-taxonomy pairwise similarity inside a backend node chain. V2 (early 2025) replaced DPP with Sliding Spectrum Decomposition (SSD) hosted in PyTorch on Pinterest's company-wide model serving cluster — lower serving latency, numerically robust (no PSD enforcement / Cholesky failures), expandable similarity substrate (visual + text + graph + Q3-2025 PinCLIP multimodal + Q4-2025 Semantic ID prefix overlap). V2+ (mid/late 2025) added a unified soft-spacing penalty composed into SSD's utility equation for content-quality-risk classes, later abstracted into a config-based framework. Canonical production datum: removing DPP produced a >2% time-spent-impression drop within week 1, with day-1 engagement gains reversing by week 2 — canonical short-term-vs-long-term engagement trade-off and closed-loop feedback evidence. Sits alongside the ads engagement model post (upstream ranking) and L1 CVR diagnosis post (diagnosis methodology) to complete the three-axis wiki coverage of Pinterest's recommendation funnel.
  • 2026-03-06 — Unified Context-Intent Embeddings for Scalable Text-to-SQL. Pinterest data platform team (Keqiang Li, Bin Yang) document how the Pinterest Analytics Agent evolved from a schema-grounded RAG-based Text-to-SQL prototype into the #1 agent at Pinterest (10× the next most-used, 40% analyst-population coverage in two months, target 50% year-end). Two central engineering claims: (1) unified context-intent embeddings — index natural-language descriptions of the business question each historical SQL query was designed to answer, not table docs; the SQL-to-text step generates explicit "analytical questions this query answers," creating a question-to-question bridge that sidesteps vocabulary mismatch between user phrasing and schema phrasing. (2) Structural + statistical patterns with governance-aware ranking — extract validated join keys + filters + aggregation patterns from query history, fuse with tier + freshness + documentation + ownership signals when ranking. Post also documents the full supporting stack: PinCat (internal catalog on DataHub) as system of record for tiers + glossary terms; AI Table Documentation + join-graph + search-based glossary propagation (~70% manual documentation work reduction, >40% columns auto-tagged); internal Vector DB as a Service on AWS OpenSearch + Hive + Airflow (zero-to-production-index in days, millions of embeddings, daily incremental updates, hybrid semantic-plus-metadata filtering); four-layer Analytics Agent architecture (Orchestration + MCP + Context + Execution) with EXPLAIN-before-EXECUTE validation + bounded retry + default LIMIT 100 + column-profiling-aware filter generation. Governance roadmap: 400K → ~100K table footprint reduction; "Governance and AI reinforce each other." Thesis: "your analysts already wrote the perfect prompt" — query history is the knowledge base, self-reinforcing as 2,500+ analysts continuously teach the system.
  • 2026-03-03 — Unifying Ads Engagement Modeling Across Pinterest Surfaces. Pinterest Ads ML (Duna Zhan, Qifei Shen, Matt Meng, Jiacheng Li, Hongda Shen) consolidate three surface-specific CTR-prediction models (Home Feed, Search, Related Pins) into a unified engagement model with shared trunk (MMoE + long-user-sequence Transformer) + surface-specific tower trees + view-type-specific calibration + multi-task heads + surface-specific checkpoint exports. Serving efficiency paired with unification: DCNv2 projection layer, fused-kernel embedding, TF32, request-level user-embedding broadcasting. Staged unification by CUDA throughput — HF + SR first (similar cost), RP deferred until efficiency work stabilised. Load-bearing claim: MMoE + long sequences only paid off when integrated into unified model with multi-surface training data.
  • 2026-02-27 — Bridging the Gap: Diagnosing Online-Offline Discrepancy in Pinterest's L1 Conversion Models. Pinterest Ads ML production-retrospective on why a new L1 CVR model showed 20–45% offline LogMAE reduction but neutral / negative CPA online. Introduces three-layer diagnosis framework, rules out exposure bias / timeouts / offline-eval bugs, names feature parity gap + embedding version skew as the two concrete layer-2 causes, + funnel recall ceilings as layer-3 residual.
  • 2026-02-24 — Piqama: Pinterest Quota Management Ecosystem. Pinterest Big Data Processing Platform + Online Systems jointly introduce Piqama, a generic quota management platform handling both capacity quotas (memory / vcore / concurrent-apps for Moka on Yunikorn) and rate-limit quotas (QPS / bandwidth for TiDB + KV Stores). Architecture: REST + Thrift control-plane portal; pluggable schema + validation (including remote-service hooks for cluster-capacity sum-checks) + authorization + dispatch + enforcement; pre-aggregated usage stats to Apache Iceberg on S3; separate auto-rightsizing service reading from Iceberg / Presto / user-defined sources.
  • 2024-05-14 — HBase Deprecation at Pinterest (Part 1). Pinterest Storage + Data Infrastructure: part 1 of a 3-part retrospective announcing the 2021 decision to deprecate HBase across Pinterest's entire production footprint, after running one of the largest HBase deployments in the world (peak ~50 clusters / ~9,000 EC2 instances / >6 PB of data). Five-reason deprecation framework: maintenance cost, missing functionality, system complexity, infra cost, waning community. Workload-axis migrations already in flight: OLAP → Druid + StarRocks; time-series → Goku; KV → KVStore on RocksDB + Rocksplicator. Remaining slot drove TiDB selection.
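The Visually Complete mechanism in the Performance for Everyone entry — a base class that walks the view tree, geometry-filters to visible opt-in Perf* views, and conjoins per-view readiness — can be sketched as follows. This is a minimal stand-in, not the Android implementation: the `View` class, the `PerfMarker` mixin fields, and the function names are all hypothetical; on Android the visibility test would be a real geometry check.

```python
class View:
    """Minimal stand-in for an Android view that can hold children."""
    def __init__(self, children=()):
        self.children = list(children)

class PerfMarker:
    """Opt-in marker mixin, mirroring the post's PerfImageView /
    PerfTextView / PerfVideoView interfaces (fields are illustrative)."""
    visible = True   # would be a geometry/visibility check on Android
    ready = False    # e.g. image decoded, text laid out, first video frame

class PerfImageView(View, PerfMarker):
    def __init__(self, visible=True, ready=False):
        super().__init__()
        self.visible = visible
        self.ready = ready

def collect_perf_views(root):
    """Walk the view tree from the root, keeping only visible tagged views."""
    found, stack = [], [root]
    while stack:
        v = stack.pop()
        if isinstance(v, PerfMarker) and v.visible:
            found.append(v)
        stack.extend(v.children)
    return found

def visually_complete(root):
    """Conjoin per-view readiness; the surface's UPL timestamp would be
    emitted the first time this flips to True."""
    views = collect_perf_views(root)
    return bool(views) and all(v.ready for v in views)
```

Because the walk lives in the base class, a product surface only pays the cost of swapping in `Perf*` view types — which is exactly the "measured for free" property the post claims.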
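The V2+ blending step in the Home Feed MOO entry — greedy slot-by-slot selection where a similarity term and a soft-spacing penalty are composed into one utility — can be sketched as a toy greedy loop. All parameter names, the max-similarity form of the diversity term, and the fixed-window spacing rule are assumptions for illustration, not SSD's exact formulation.

```python
def blend(candidates, sim, k, lam=0.5, spaced_classes=(), window=3, penalty=0.35):
    """Greedy feed composition: at each slot, pick the candidate maximising
    relevance minus (a) a diversity penalty against the already-selected
    prefix and (b) a soft-spacing penalty if the candidate's risk class
    appeared within the last `window` selected slots."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        def utility(c):
            u = c["score"]
            if selected:  # diversity term: worst-case similarity to prefix
                u -= lam * max(sim(c, s) for s in selected)
            recent = {s["class"] for s in selected[-window:]}
            if c["class"] in spaced_classes and c["class"] in recent:
                u -= penalty  # soft spacing for content-quality-risk classes
            return u
        best = max(pool, key=utility)
        pool.remove(best)
        selected.append(best)
    return selected
```

The point of composing both penalties into one utility (rather than hard-filtering risk classes) is the "soft" in soft-spacing: a sufficiently high-relevance candidate can still win the slot.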
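The question-to-question retrieval idea in the Text-to-SQL entry — index natural-language descriptions of what each historical query answers, then fuse semantic similarity with governance signals at ranking time — reduces to a small index structure. This sketch uses toy stand-ins throughout: bag-of-words cosine in place of a real embedding model, a single tier-based boost in place of the full tier + freshness + documentation + ownership fusion, and invented table/SQL names.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would run an embedding
    model over the SQL-to-text generated questions."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class IntentIndex:
    """Indexes the analytical *questions* historical SQL answers, not table
    docs, so user phrasing is matched against analyst phrasing."""
    def __init__(self):
        self.entries = []  # (question_vector, sql, table_tier)

    def add(self, question, sql, tier=3):
        self.entries.append((embed(question), sql, tier))

    def search(self, user_question, top_k=1):
        q = embed(user_question)
        # Governance-aware ranking: a small boost for higher-tier tables
        # (lower tier number); the 0.1 weight is illustrative.
        scored = sorted(((cosine(q, vec) + 0.1 / tier, sql)
                         for vec, sql, tier in self.entries), reverse=True)
        return [sql for _, sql in scored[:top_k]]
```

The vocabulary-mismatch point is visible even in the toy: the user's words are compared against another *question*, which shares its phrasing, rather than against column names, which usually do not.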
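The shared-trunk / surface-tower split in the ads unification entry reduces to a dispatch structure like the following. The trunk and towers here are trivial numeric stand-ins for the MMoE + long-sequence Transformer and the tower trees, and the affine calibration is a stand-in for whatever per-(surface, view-type) calibrator is actually used; only the shape of the split comes from the post.

```python
def trunk(features):
    """Shared representation; stand-in for the MMoE + long-user-sequence
    Transformer trunk trained on multi-surface data."""
    return sum(features) / len(features)

# One tower per surface (stand-ins for the surface-specific tower trees).
TOWERS = {
    "home_feed":    lambda h: 0.9 * h,
    "search":       lambda h: 1.1 * h,
    "related_pins": lambda h: 1.0 * h,
}

# Per-(surface, view_type) affine calibration parameters (illustrative).
CALIBRATION = {("search", "feed"): (0.95, 0.01)}

def predict(surface, features, view_type="feed"):
    h = trunk(features)             # shared across all surfaces
    score = TOWERS[surface](h)      # surface-specific specialisation
    a, b = CALIBRATION.get((surface, view_type), (1.0, 0.0))
    return a * score + b            # view-type-specific calibration

def export_checkpoint(surface):
    """Surface-specific checkpoint export: serving loads the shared trunk
    plus only this surface's tower, so unification does not force every
    surface to serve every tower."""
    return {"trunk": trunk, "tower": TOWERS[surface]}
```

The export function is the structural payoff of the design: one training job, three serving artefacts, each no larger than a single-surface model.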
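The pluggable-validation slice of the Piqama entry — quota writes passing through schema/validation hooks, including remote-service cluster-capacity sum-checks, before dispatch — can be sketched as a write path with injected validators. All class, method, and cluster names here are hypothetical; the real control plane is a REST + Thrift portal, not an in-process store.

```python
class QuotaStore:
    """Sketch of a Piqama-style control-plane write path: quota changes
    pass through pluggable validators before being accepted."""
    def __init__(self, validators):
        self.validators = validators
        self.quotas = {}  # (cluster, team) -> vcores

    def set_quota(self, cluster, team, vcores):
        proposed = dict(self.quotas)
        proposed[(cluster, team)] = vcores
        for validate in self.validators:
            validate(cluster, proposed)   # raises ValueError on violation
        self.quotas = proposed            # accepted: would dispatch to enforcers

def capacity_sum_check(capacity_by_cluster):
    """Stand-in for the remote-service validation hook: reject a write if a
    cluster's team quotas would sum past its physical capacity."""
    def validate(cluster, proposed):
        total = sum(v for (c, _), v in proposed.items() if c == cluster)
        if total > capacity_by_cluster[cluster]:
            raise ValueError(f"{cluster}: quota sum {total} exceeds "
                             f"capacity {capacity_by_cluster[cluster]}")
    return validate
```

Validating the *proposed* state (rather than mutating then checking) is what lets a rejected write leave the store untouched — the same property a control plane needs before it dispatches rules to enforcers asynchronously.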

Architectural themes

Pinterest's wiki corpus currently spans five axes — storage (HBase deprecation, TiDB selection), quota governance (Piqama + Moka + PinConf + SPF), ads ML production debugging (online-offline discrepancy), ads model unification (one unified engagement model with surface-specific specialisation), and production LLM analytics (Analytics Agent + Text-to-SQL on top of PinCat governance + unified context-intent embeddings + internal Vector DB platform). A common thesis connects them: fragmentation is expensive; deliberate consolidation pays off when paired with efficiency work. The HBase deprecation retrospective frames this at the storage-substrate layer (too many bolt-on services on one datastore → workload-specific migration → NewSQL consolidation for the remainder); the ads engagement unification frames it at the ML-model layer (three surface-specific models → one unified model with surface-specific tower trees + calibration + checkpoints); the Analytics Agent frames it at the LLM-infrastructure layer (every team reinventing vector indexes + table search + ad-hoc RAG → one Vector DB platform + one governance catalog + one shared intent index).

Three orthogonal operational levers appear across these posts: workload / surface specialisation (where generalisation fails, specialise narrowly via tower trees or workload-specific stores); per-segment refinement (surface-specific calibration is the ads analogue of workload-specific stores); and async/decoupled control plane (Piqama's async rule distribution, MediaFM-style decoupled training/serving boundaries, HBase's standby cluster for offline workflows). A fourth lever, new with the Analytics Agent post, is governance as AI infrastructure: tier tags + glossary terms + lineage are not documentation hygiene — they are load-bearing inputs to the ranker and the SQL validator.

Last updated · 319 distilled / 1,201 read