CONCEPT Cited by 1 source
Observability cost scaling
Definition
Observability cost scaling describes how the cost of monitoring, logging, and metrics collection in a multi-tenant system grows linearly or super-linearly with tenant or account count, to the point where it can exceed the cost of the compute resources being monitored. The result is a cost inversion: observing the system costs more than running it.
The ProGlove case
Forwarding CloudWatch logs and metrics to a third-party observability platform cost ProGlove ~$3/account/month. At dozens of accounts this was negligible. At thousands of accounts it nearly doubled the total cloud bill, exceeding Lambda compute and DynamoDB storage costs combined.
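The scaling effect can be illustrated with a minimal sketch. It assumes a flat per-account charge, which is the simplest (linear) case. The function names and the fleet sizes of 50 and 5,000 accounts are hypothetical; the text only says "dozens" and "thousands".

```python
# Approximate per-account forwarding cost reported in the source (USD/month).
USD_PER_ACCOUNT_MONTH = 3.00


def monthly_observability_cost(accounts: int,
                               per_account: float = USD_PER_ACCOUNT_MONTH) -> float:
    """Fleet-wide cost, assuming a flat per-account charge (linear scaling)."""
    return accounts * per_account


def is_cost_inverted(observability_usd: float, workload_usd: float) -> bool:
    """True when observing the system costs more than running it."""
    return observability_usd > workload_usd


if __name__ == "__main__":
    # Hypothetical fleet sizes, chosen only to show the order of magnitude.
    for accounts in (50, 5_000):
        cost = monthly_observability_cost(accounts)
        print(f"{accounts:>5} accounts: ${cost:,.2f}/month")
```

A per-account charge that looks like a rounding error on a small fleet becomes a five-figure monthly line item once the account count grows by two orders of magnitude.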
"What came as a surprise to us were the actual cost drivers: instead of Lambda compute or storage costs, we found that forwarding all observability data almost doubled our cloud bill." (Source: sources/2026-06-29-aws-lessons-learned-from-scaling-to-1-million-lambda-functions)
Mitigation strategies
- Priority-based data segregation. Differentiate high-priority telemetry (error rates, latency spikes) from low-priority data (debug logs, verbose traces). Only forward the high-priority subset continuously.
- Idle-account hibernation. For accounts with no active workload, reduce monitoring to a minimal set of "heartbeat" metrics. ProGlove switched accounts to "almost 0" observability after inactivity.
- Local pre-aggregation. Aggregate metrics at the account level before forwarding, reducing per-event transmission cost.
- Tiered retention. Short retention for verbose data, long retention only for aggregated summaries.
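The first three strategies can be sketched as a forwarding filter plus an aggregation step. This is an illustrative sketch, not ProGlove's implementation: the `Event` type, the event kinds, and the function names are all hypothetical, and a real pipeline would sit between CloudWatch and the observability platform.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

# Hypothetical event kinds; a real system would define its own taxonomy.
HIGH_PRIORITY = {"error", "latency_spike", "heartbeat"}
HEARTBEAT_ONLY = {"heartbeat"}


@dataclass(frozen=True)
class Event:
    account_id: str
    kind: str
    value: float = 0.0


def should_forward(event: Event, account_is_idle: bool) -> bool:
    """Priority segregation, narrowed to heartbeats for hibernated accounts."""
    allowed = HEARTBEAT_ONLY if account_is_idle else HIGH_PRIORITY
    return event.kind in allowed


def pre_aggregate(events: list[Event]) -> dict[tuple[str, str], dict[str, float]]:
    """Collapse raw events into one summary per (account, kind) pair."""
    groups: dict[tuple[str, str], list[float]] = defaultdict(list)
    for event in events:
        groups[(event.account_id, event.kind)].append(event.value)
    return {
        key: {"count": len(values), "mean": mean(values), "max": max(values)}
        for key, values in groups.items()
    }


if __name__ == "__main__":
    events = [
        Event("acct-1", "latency_spike", 120.0),
        Event("acct-1", "latency_spike", 180.0),
        Event("acct-1", "debug_log"),   # low priority, dropped
        Event("acct-2", "heartbeat", 1.0),
        Event("acct-2", "debug_log"),   # idle account, dropped
    ]
    idle_accounts = {"acct-2"}
    forwarded = [e for e in events if should_forward(e, e.account_id in idle_accounts)]
    print(pre_aggregate(forwarded))
```

In this sketch five raw events become two forwarded summaries. Filtering before aggregating keeps low-priority data from ever reaching the per-event billing path.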
With these mitigations combined, ProGlove cut observability cost from ~$3/account/month to ~$0.70/account/month, a reduction of roughly 77%.
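The reduction figure follows directly from the two per-account costs. The short check below uses the approximate figures from the source; the fleet size of 5,000 accounts is a hypothetical value used only to show what the saving could look like at scale.

```python
BEFORE_USD = 3.00   # approximate cost per account before mitigation
AFTER_USD = 0.70    # approximate cost per account after mitigation

reduction = (BEFORE_USD - AFTER_USD) / BEFORE_USD
print(f"Reduction: {reduction:.1%}")  # Reduction: 76.7%

# Hypothetical fleet size, not a figure from the source.
accounts = 5_000
monthly_saving = accounts * (BEFORE_USD - AFTER_USD)
print(f"Saving at {accounts:,} accounts: ${monthly_saving:,.2f}/month")
```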
Architectural implication
In scale-to-zero architectures, the monitoring tax sets a practical cost floor: even if compute truly scales to zero, the cost of knowing the system is healthy does not. In practice "zero" means "almost zero", and the monitoring floor dominates idle cost.
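A small model makes the floor explicit. It is a sketch under stated assumptions: the function name, the fleet size, and the $0.05 per-account heartbeat cost are hypothetical, since the source gives no figure for fully idle accounts beyond "almost 0".

```python
def idle_monthly_cost(idle_accounts: int,
                      monitoring_floor_usd: float,
                      compute_usd_per_idle_account: float = 0.0) -> float:
    """Cost of an idle fleet: compute may reach zero, monitoring does not."""
    return idle_accounts * (compute_usd_per_idle_account + monitoring_floor_usd)


if __name__ == "__main__":
    # Hypothetical values: 1,000 idle accounts with a $0.05/month heartbeat floor.
    cost = idle_monthly_cost(idle_accounts=1_000, monitoring_floor_usd=0.05)
    print(f"Idle fleet cost: ${cost:.2f}/month, all of it monitoring")
```

With compute at zero, every dollar of idle cost in this model comes from the monitoring floor, which is why lowering that floor is the only remaining lever for idle accounts.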
Seen in
- sources/2026-06-29-aws-lessons-learned-from-scaling-to-1-million-lambda-functions: observability costs of ~$3/account/month nearly doubled ProGlove's cloud bill; optimized to ~$0.70/account/month.