
Host-agent metrics API

Pattern

Run a per-host agent whose job is to locally collect system + database metrics and expose them over a simple HTTP (or similar) API. The central consumer — throttler, scheduler, monitor — polls that API instead of holding a persistent connection to every underlying metric source.

"A common solution is to set up an HTTP server, some API access point, which the throttler can check to get the host metrics. A daemon or an agent running on the host is responsible for collecting the metrics locally to the host. The throttler's work is now made simpler, as it may need to only hit a single access point to collect all metrics for a given host."

— Shlomi Noach, Source: sources/2026-04-21-planetscale-anatomy-of-a-throttler-part-2
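A minimal sketch of such an agent, assuming a hypothetical JSON `/metrics` endpoint on port 9100 (the endpoint name, port, and metric names are illustrative, not from the source):

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def collect_metrics():
    """Gather host-local metrics in one place. A real agent would add
    DB metrics (e.g. replication lag) alongside the OS ones; here we
    use only the load average as a stand-in."""
    load1, load5, load15 = os.getloadavg()
    return {
        "loadavg_1m": load1,
        "loadavg_5m": load5,
        "loadavg_15m": load15,
    }


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = json.dumps(collect_metrics()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # One access point serves all metrics for this host.
    HTTPServer(("0.0.0.0", 9100), MetricsHandler).serve_forever()
```

The point of the shape is visible in `collect_metrics`: heterogeneous local sources are unified behind a single HTTP surface, so the central consumer never touches the underlying DB protocol or syscalls.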

When to use it

  • The central consumer can't hold direct connections to every metric source (security boundaries, environment restrictions).
  • Metric sources are heterogeneous — DB metrics, OS metrics, process metrics all gathered by different system calls — and a host-local agent is the right place to unify them.
  • The fleet scales past the connection budget of a single central process.

Shape

  ┌──────────────┐    HTTP / gRPC    ┌────────────────┐
  │   Central    │  ───────────────▶ │   Host Agent   │
  │ Consumer (T) │  poll /metrics    │  (per host)    │
  └──────────────┘                   └──────┬─────────┘
                                             │ local
                                             │ syscalls
                                     ┌────────────────┐
                                     │  DB + OS       │
                                     │  metric        │
                                     │  sources       │
                                     └────────────────┘
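The consumer side of the diagram can be sketched as a simple poll loop. This is an assumption-laden illustration (hypothetical host list, port, and threshold); the key detail is that a missing response is handled explicitly rather than silently, per the reliability trade-off below:

```python
import json
from urllib.request import urlopen
from urllib.error import URLError


def poll_host(host, port=9100, timeout=1.0):
    """Poll one host agent over plain HTTP. Returns a metrics dict,
    or None if the agent is unreachable or hung past the timeout."""
    try:
        with urlopen(f"http://{host}:{port}/metrics", timeout=timeout) as resp:
            return json.load(resp)
    except (URLError, OSError):
        return None


def throttle_decision(metrics, load_threshold=8.0):
    """Conservative policy: missing metrics mean throttle, so a hung
    agent causes visible backpressure, not silent metric blindness."""
    if metrics is None:
        return "throttle"  # fail closed on metric blindness
    return "throttle" if metrics["loadavg_1m"] > load_threshold else "accept"
```

One short-lived HTTP request per host per poll keeps the consumer's connection count bounded, which is the first acceptability argument below.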

Accepted trade-offs

  • Layered polling staleness. Each polling layer adds its own interval to the worst-case staleness bound. Canonical example: the agent collects at 1 Hz and the consumer polls at 1 Hz → up to 2 s worst-case staleness, versus 1 s for direct access.
  • Upgrade / backwards-compat surface. "We then introduce an API or otherwise some handshake between the throttler and the metric collection component, which must be treated with care when it comes to upgrades and backwards compatibility."
  • Reliability dependency. "Metric collection is no longer in the hands of the throttler, and we must trust that component to grab those metrics." A hung agent produces silent metric blindness.
  • Effectively distributes the architecture. "In adding this agent component, even if it's the simplest script to scrape and publish metrics, we've effectively turned our singular throttler into a distributed multi-component system."
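The 2-second figure in the staleness bullet falls out of a simple sum of the per-layer polling intervals (ignoring network and processing time):

```python
def worst_case_staleness(*poll_intervals_s):
    """Each polling layer can miss a freshly produced value by up to
    its full interval, so the worst case is the sum of the intervals."""
    return sum(poll_intervals_s)


# Agent collects at 1 Hz, consumer polls at 1 Hz -> up to 2 s stale.
assert worst_case_staleness(1.0, 1.0) == 2.0
# Direct access has a single polling layer -> 1 s floor.
assert worst_case_staleness(1.0) == 1.0
```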

Why the trade-offs are usually acceptable

  • Connection count stays bounded — consumer holds one HTTP session per host instead of many DB-protocol connections per host.
  • Boundaries become auditable — the API surface is the security contract, not the full DB protocol.
  • Agents are independently deployable + upgradable — no consumer coupling.
  • 2-second staleness is fine for most throttling decisions — the throttler's accept/reject policy tolerates drift well above the single-poll floor.
  • patterns/sidecar-ebpf-flow-exporter (Netflix) is the eBPF-specific realisation of the same shape for network-flow metrics — agent collects via eBPF maps, exports over local API. The throttler pattern here is database-workload-specific but structurally identical.
  • Prometheus node_exporter is a canonical industry realisation — OS metrics scraped locally, exposed via /metrics for central Prometheus servers to pull.
  • Distinct from per-host distributed throttlers: in the agent pattern the decision logic stays central; here only the metric-collection component is moved host-local.

Seen in

  • sources/2026-04-21-planetscale-anatomy-of-a-throttler-part-2 — canonical wiki introduction. Shlomi Noach describes the evolution from monolithic throttler with direct access to agent-mediated throttler; frames the shift as the intermediate step that "effectively turns our singular throttler into a distributed multi-component system."