CONCEPT Cited by 1 source
Push vs pull monitoring
A monitoring architecture is fundamentally either pull-based (the monitoring server periodically scrapes metrics from endpoints) or push-based (the monitored system pushes metrics into a collector / stream / backend as they're produced). Both shapes ship in production at scale; the choice axis is who drives the data flow, and the trade-offs cluster around three failure modes: API-throttling under fan-out, polling latency as freshness floor, and operational coupling between monitor and target.
This concept is the observability-altitude sibling of concepts/pull-vs-push-streams (which is the JS/streams API altitude trade-off). The same word, different load-bearing surface.
The two models at the monitoring altitude
| Axis | Pull-based | Push-based |
|---|---|---|
| Canonical instance | Prometheus scraping /metrics | CloudWatch Metric Streams, StatsD, OTel push exporter |
| Who drives | Monitor server | Monitored system / source |
| Scrape / emit cadence | Configurable interval at scrape side | Event-driven at source |
| Freshness floor | Scrape interval (often 15–60s) | Source emission cadence |
| API call shape | One request per target per interval (one per upstream metric for API-backed exporters) | One push per emission (single metric or batch) |
| Failure mode at scale | API throttling, scrape misses | Producer-side back-pressure, lost emissions |
| Service-discovery requirement | Mandatory — monitor needs target list | Optional — sources self-identify |
| Network direction | Monitor → target | Source → collector |
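The "who drives" and "service-discovery requirement" rows of the table can be illustrated with a minimal in-process sketch. The `Target`, `Collector`, `pull_once`, and `push_from` names are hypothetical and exist only for this illustration; no real monitoring library is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    """A monitored system holding its current metric values (hypothetical)."""
    name: str
    values: dict = field(default_factory=dict)

    def read(self) -> dict:
        # Pull model: the monitor calls this, much like scraping an endpoint.
        return dict(self.values)


@dataclass
class Collector:
    """Stores the latest value per (source, metric) pair (hypothetical)."""
    samples: dict = field(default_factory=dict)

    def receive(self, source: str, metric: str, value: float) -> None:
        self.samples[(source, metric)] = value


def pull_once(collector: Collector, targets: list) -> None:
    # The monitor drives: it must be handed the full target list up front.
    for target in targets:
        for metric, value in target.read().items():
            collector.receive(target.name, metric, value)


def push_from(target: Target, collector: Collector) -> None:
    # The source drives: it identifies itself on every emission.
    for metric, value in target.values.items():
        collector.receive(target.name, metric, value)


a = Target("api-1", {"requests": 10.0})
b = Target("api-2", {"requests": 7.0})

pulled = Collector()
pull_once(pulled, [a])   # b is missing from the target list, so it stays invisible

pushed = Collector()
push_from(a, pushed)
push_from(b, pushed)     # b appears without any discovery step
```

In the pull run, `api-2` never reaches the collector because nobody told the monitor about it; in the push run it shows up as soon as it emits.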
Why pull-based monitoring throttles at scale
The 2026-05-13 source documents the production failure mode of pull-based monitoring at scale. The customer ran Prometheus with the AWS CloudWatch exporter, a pull-based shape that polls the CloudWatch API for each configured metric on a fixed interval. At fleet scale, the source reports throttling, cost, and latency problems:
"Our customer's current monitoring solution with Prometheus and Amazon CloudWatch exporter using a pull-based approach resulted in higher API throttling. This caused metric loss and created gaps in observability data for business-critical systems. The frequent polling approach in this model also resulted in higher costs from API calls. This polling solution did not satisfy their requirement of sub-minute latency for real-time alerting." (Source)
Three named failure modes:
- API throttling. Every pull-side scrape is an API call against the upstream metrics provider. CloudWatch's GetMetricData / GetMetricStatistics calls are quota-throttled; at thousands of metrics × dozens of targets per 60s scrape interval, the call rate saturates and exporters drop metrics.
- API cost amplification. Even when not throttled, per-call pricing turns the polling overhead into a meaningful bill line.
- Polling-interval freshness floor. Pull-based monitoring cannot deliver sub-minute alerting if the scrape interval is 60s, regardless of how fast the target produces fresh values. The polling interval is structurally a freshness budget the monitor cannot beat.
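The arithmetic behind the throttling and freshness-floor points can be worked through in a few lines. This is a rough sketch: the fleet size, the batch size of 100, and the quota of 50 calls per second are hypothetical numbers chosen for illustration and are not real CloudWatch limits.

```python
import math


def calls_per_second(series: int, interval_s: float, series_per_call: int = 1) -> float:
    """Average upstream API call rate needed to poll every series once per interval."""
    calls_per_cycle = math.ceil(series / series_per_call)
    return calls_per_cycle / interval_s


def staleness_s(interval_s: float) -> tuple:
    """(mean, worst-case) delay between a value changing and the next poll,
    assuming the change lands at a uniformly random point in the interval."""
    return interval_s / 2, interval_s


SERIES = 2000 * 24          # hypothetical: 2,000 metrics on each of 24 targets
INTERVAL_S = 60.0           # the 60s scrape interval from the text
QUOTA_CALLS_PER_S = 50.0    # hypothetical quota, NOT a real CloudWatch limit

unbatched = calls_per_second(SERIES, INTERVAL_S)                     # 800.0 calls/s
batched = calls_per_second(SERIES, INTERVAL_S, series_per_call=100)  # 8.0 calls/s
throttled = unbatched > QUOTA_CALLS_PER_S                            # True
mean_age, worst_age = staleness_s(INTERVAL_S)                        # 30.0, 60.0
```

Batching lowers the call rate but leaves the freshness floor untouched: with a 60s interval, a change made just after a poll is not seen for close to a full minute, however few calls are needed.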
Why push-based monitoring fixes the API-throttling axis
Push-based monitoring inverts the data flow: the source emits the metric (or a metric stream emits a snapshot of it), and the emission goes once to the collector. Architectural consequences:
- No per-metric monitor-side API calls. The monitor doesn't fetch — it receives. There's no API quota in the monitor's scrape direction, because there's no scrape.
- Sub-second freshness in principle. The lower bound on monitor-side latency is the source-side emission cadence, not a fixed scrape interval.
- Producer-side back-pressure replaces monitor-side throttling. The push pipe (StatsD UDP, Metric Streams + Firehose, OTel exporter HTTP) becomes the constrained resource. Lost emissions show up as producer-side queue overflow rather than monitor-side scrape failures.
- Service-discovery becomes optional. Sources self-identify on push; the monitor doesn't need a complete target list to collect from a new source.
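A source-side push can be sketched with a StatsD-style UDP datagram. This is a minimal illustration using only the standard library: the metric name `queue.depth` is made up, and the "collector" is just a local UDP socket standing in for a real StatsD daemon.

```python
import socket


def statsd_line(name: str, value, kind: str = "g") -> bytes:
    # StatsD plain-text line format: <name>:<value>|<type>, e.g. "queue.depth:12|g"
    return f"{name}:{value}|{kind}".encode("ascii")


# Stand-in collector: a UDP socket bound to an ephemeral loopback port.
collector = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
collector.bind(("127.0.0.1", 0))
collector.settimeout(2.0)
collector_addr = collector.getsockname()

# The source decides when to emit; the collector never asks.
source = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
source.sendto(statsd_line("queue.depth", 12), collector_addr)

payload, _ = collector.recvfrom(1024)
received = payload.decode("ascii")

source.close()
collector.close()
```

Because UDP is fire-and-forget, a datagram sent while the collector is down or overloaded is dropped with no error on the sending side, which is the "lost emissions" shape of failure described above.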
What push monitoring gives up
- Query-rate control. Pull lets the monitor decide when to ask. Push gives that control to the source, which can over-emit (high cost, high collector load) or under-emit (gaps).
- Implicit liveness signal. A target that fails to respond to a scrape is implicitly dead from the monitor's perspective. In a push system, a silent source can't be distinguished from a healthy source with nothing to say without explicit heartbeats.
- Centralised relabeling / filtering. Prometheus's pull-side relabel_configs and metric_relabel_configs apply transformations at scrape time. Push systems move that responsibility upstream to the source or downstream to the collector, typically requiring a richer collector tier (e.g. the OTel collector's processor stage).
- Long-lived target tracking by absence. Pull monitors know when a target stops appearing; push monitors need explicit TTL or heartbeat semantics on metrics to detect silence.
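The explicit TTL or heartbeat semantics that push systems need can be sketched as a last-seen table with a timeout. The `SilenceDetector` class and the 30-second TTL are hypothetical; timestamps are passed in explicitly so the example stays deterministic.

```python
class SilenceDetector:
    """Flags sources whose last push is older than a TTL (hypothetical helper)."""

    def __init__(self, ttl_s: float):
        self.ttl_s = ttl_s
        self.last_seen = {}  # source name -> timestamp of most recent push

    def record(self, source: str, now_s: float) -> None:
        # Call on every received emission or heartbeat.
        self.last_seen[source] = now_s

    def silent_sources(self, now_s: float) -> list:
        # A source counts as silent once its last push is older than the TTL.
        return sorted(
            source
            for source, seen_at in self.last_seen.items()
            if now_s - seen_at > self.ttl_s
        )


detector = SilenceDetector(ttl_s=30.0)
detector.record("api-1", now_s=0.0)
detector.record("api-2", now_s=0.0)
detector.record("api-1", now_s=25.0)   # api-1 keeps pushing, api-2 goes quiet

stale = detector.silent_sources(now_s=40.0)   # api-1 is 15s old, api-2 is 40s old
```

One limitation of this approach: a source that has never pushed is absent from the table entirely, so detecting it still requires some inventory of expected sources.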
Mixed shapes are common
In practice many production architectures do both: push for high-rate application metrics where freshness matters, pull for infrastructure metrics where service discovery is the harder problem. The 2026-05-13 source's customer architecture itself is a hybrid — push at the collector ingress (CloudWatch Metric Streams → Firehose → Lambda → NLB → OTel collector), but the collector then exports onward to potentially pull-shaped backends (e.g. Grafana Cloud).
Seen in
- sources/2026-05-13-aws-streaming-cloudwatch-metrics-to-vpc-based-opentelemetry-collectors-using-lambda — first canonical wiki home. The customer migrated off Prometheus + CloudWatch-exporter (pull) onto CloudWatch Metric Streams + Firehose + Lambda + OTel collector (push). The load-bearing rationale is the API-throttling failure mode: "resulted in higher API throttling. This caused metric loss and created gaps in observability data for business-critical systems." The push architecture's stated benefits — "reducing frequent polling and API calls, enabling near real-time data transmission" and "sub-minute latency for real-time alerting" — are the inverse of pull's named failures.
Related
- concepts/pull-vs-push-streams — the streams-API-altitude sibling of this concept. Different surface, same axis.
- concepts/api-throttling — the canonical pull-side failure mode.
- concepts/polling-interval-as-freshness-budget — why pull's scrape interval bounds latency.
- concepts/metric-staleness-from-polling-layers — the multi-layer compounded version of the polling freshness floor.
- concepts/observability — the parent concept.
- systems/prometheus — canonical pull instance.
- systems/amazon-cloudwatch-metric-streams — canonical push instance on AWS.
- systems/opentelemetry — push-friendly open standard.