PATTERN

Generic quota management platform

Pattern: build one quota-management platform that serves all quota kinds in the organisation — capacity, rate-limit, and application-specific — by making the per-domain hooks (schema, validation, dispatch, enforcement) pluggable. Use the shared platform for the common surfaces (lifecycle, authorization, UI / API, chargeback / budget loop, auto-rightsizing) and let each application domain plug in its own enforcement.
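The platform/hook split can be sketched as follows. This is a minimal illustration of the pattern, not Piqama's actual API — all class, field, and method names are hypothetical:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class DomainPlugin:
    """Per-domain hooks that one quota kind plugs into the shared platform."""
    name: str
    validate: Callable[[dict], bool]      # domain-specific validation rule
    dispatch: Callable[[dict], None]      # deliver an approved quota update
    enforce: Callable[[str, int], bool]   # request-time admission decision

@dataclass
class QuotaPlatform:
    """Shared surface: one lifecycle flow, storage, and registration point."""
    plugins: Dict[str, DomainPlugin] = field(default_factory=dict)
    quotas: Dict[tuple, dict] = field(default_factory=dict)

    def register(self, plugin: DomainPlugin) -> None:
        self.plugins[plugin.name] = plugin

    def update_quota(self, domain: str, quota: dict) -> None:
        plugin = self.plugins[domain]
        if not plugin.validate(quota):            # pluggable hook, shared flow
            raise ValueError(f"{domain}: rejected quota {quota}")
        self.quotas[(domain, quota["id"])] = quota
        plugin.dispatch(quota)                    # pluggable delivery
```

Every domain gets the same lifecycle (validate, persist, dispatch); only the hook bodies differ, which is what collapses N bespoke systems into one platform plus N plugins.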

The problem this solves

A mature organisation naturally accumulates many quota systems: cluster schedulers have capacity quotas, API gateways have rate limits, databases have per-connection limits, LLM services have token budgets, etc. Each is typically bespoke. Consequences:

  • Duplicate infrastructure. N teams each build their own quota-config UI, audit log, and owner-permission model.
  • Duplicate discipline. N teams independently figure out validation, rollout, rollback, and chargeback.
  • Fractured governance. No single place to answer "what quotas does team X have across the company?" or "what's our total committed capacity?".
  • Integration tax for new domains. Every new quota-needing service reinvents the wheel.

A generic platform collapses the common surfaces to one.

Pinterest's implementation: Piqama

Piqama is the canonical instance of this pattern (Source: sources/2026-02-24-pinterest-piqama-pinterest-quota-management-ecosystem). Five pluggable surfaces:

  1. Schema management — the platform owns unique identifiers and hierarchical relationships; each application domain defines its own schema.
  2. Validation — domain-defined rules at schema + semantic levels, plus remote-service hooks for invariants like cluster-capacity sum-checks.
  3. Update dispatch — Piqama's default client, or Pinterest's PinConf (for the rate-limit variant), or a custom dispatcher.
  4. Enforcement — default Piqama-client enforcement, or application-specific (e.g. SPF in-process library for online-storage rate limits; YuniKorn queue configs for Moka capacity).
  5. Punishment strategies — default serve/drop, or application-specific degradation (graceful rejection, tier-weighted budget haircut).
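Surface 5 can be sketched as two strategies side by side. Function names and the weighting formula are assumptions, not Piqama's documented behaviour — the point is only the contrast between the default serve/drop decision and a tier-weighted haircut in which low-weight tiers shed proportionally more budget under pressure:

```python
def serve_or_drop(used: int, limit: int) -> bool:
    """Default punishment: serve while under quota, drop once exhausted."""
    return used < limit

def tier_weighted_haircut(limits: dict, weights: dict, shed: float) -> dict:
    """Graceful degradation: shrink each tier's budget, protecting
    high-weight tiers. `shed` in [0, 1] sets how aggressively to cut;
    a weight-1.0 tier is untouched, a weight-0.0 tier loses `shed`
    of its budget."""
    return {tier: round(limits[tier] * (1 - shed * (1 - weights[tier])))
            for tier in limits}
```

With weights {batch: 0.2, online: 0.9} and shed = 0.5, a 100-unit budget per tier degrades to 60 for batch but 95 for online — batch traffic absorbs most of the cut.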

The shared surfaces are: REST + Thrift portal; ownership model + authorization; usage-stats collection → Apache Iceberg on S3 lakehouse; auto-rightsizing service; chargeback / budget integration.

Design principles

  • Control-plane / data-plane separation is the architectural core. The control plane handles lifecycle; data plane handles request-time decisions. Each scales independently.
  • Pluggability at the edges, shared at the core. Every degree of freedom a particular domain needs is in a plugin; everything domain-neutral (authorization, audit, UI, rightsizing telemetry) is in the platform.
  • Integrate with existing substrate. Piqama rides PinConf for rate-limit-rule distribution rather than inventing a new pub/sub. The same argument applies to Iceberg for storage.
  • Close the feedback loop. The platform collects its own usage telemetry and consumes it — via auto-rightsizing and budget-enforced throttling — so quotas self-tune.
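A toy version of the rightsizing step in that feedback loop (the headroom and step-clamp parameters are assumptions, not disclosed values): propose a new quota from observed peak usage, clamped so a single pass never moves the quota too far.

```python
def rightsize(quota: int, peak_usage: int,
              headroom: float = 0.2, max_step: float = 0.5) -> int:
    """Propose a new quota: peak usage plus headroom, but never more than
    max_step away from the current value in one adjustment."""
    target = round(peak_usage * (1 + headroom))
    lo, hi = round(quota * (1 - max_step)), round(quota * (1 + max_step))
    return max(lo, min(hi, target))
```

The clamp is what makes the loop safe to run unattended: a badly over-provisioned quota of 1000 with a peak of 400 shrinks to 500 (the step limit), not straight to 480, so one noisy measurement window cannot halve a team's quota overnight.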

Integrations disclosed

  • Moka (capacity) — first-class integration; Piqama fully manages Moka's quota lifecycle.
  • TiDB and Key-Value Stores (rate-limit) — initial integration complete.
  • Planned: PinCompute (Kubernetes general-purpose compute), ML Training Platform, LLM Serving.

Caveats

  • Pluggability adds complexity. Every pluggable surface is a contract — domains can get it wrong, and breakage is harder to diagnose than in a purpose-built system.
  • Shared infrastructure = shared failure. When Piqama's control plane is unhealthy, every integrated domain feels it at rule-update time. Hence the local-enforcement design for rate limits — the data plane tolerates control-plane outages.
  • Cross-domain invariants are hard. "Sum across all domains" checks don't have a natural home when each domain owns its own enforcement.
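The local-enforcement mitigation mentioned above can be sketched as follows (class and method names are hypothetical): the data plane decides from a cached rule snapshot, so a control-plane outage delays rule updates but never blocks request-time decisions.

```python
class LocalEnforcer:
    """Data-plane side: enforce from the last known-good rule snapshot."""

    def __init__(self, fetch_rules):
        self._fetch = fetch_rules   # control-plane call; may fail
        self._rules = {}            # cached {key: limit} snapshot
        self._counts = {}

    def refresh(self) -> None:
        """Control-plane sync: on failure, keep serving from stale rules."""
        try:
            self._rules = self._fetch()
        except Exception:
            pass                    # control plane down — stay available

    def allow(self, key: str) -> bool:
        """Request-time decision: purely local, no remote call."""
        limit = self._rules.get(key)
        if limit is None:
            return True             # no rule known: fail open
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key] <= limit
```

The counting here is a deliberately naive cumulative counter; a real data plane would use token buckets or windowed counters. The availability property is the point: stale rules beat no rules when the shared control plane is unhealthy.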
