CONCEPT
Fine-Grained Billing¶
Fine-grained billing is the practice of charging customers at a very small unit of actual consumption (milliseconds of CPU, bytes of memory × time) rather than at the granularity of a provisioned unit (per hour, per reserved instance).
Why it's an architectural commitment, not a pricing choice¶
In the 2014 Lambda PR/FAQ, fine-grained duration-based pricing was tenet #4: "Our service will target fine-grained pay-for-use; developers will not pay for idle time. We will own the problem of application placement so that developers never experience waste through underutilized hosts. We will seek to minimize both costs and billing granularity."
Key alignment claim: "Our pricing approach ensures that customers cannot overprovision or underutilize by design: customers utilize 100% of the computing power they're paying for when they run an application."
(Source: sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years)
Lambda's granularity evolution¶
| Era | Granularity |
|---|---|
| 2014 PR/FAQ | 250ms |
| Nov 2014 launch | 100ms |
| Today (2024) | 1ms, no minimum |
Billing granularity is a customer-alignment lever that has been tightened over time: each reduction incentivises further customer-side optimisation, because saved milliseconds now translate 1:1 into saved dollars.
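The effect of the granularity trajectory is easy to see in the billing arithmetic: duration is rounded up to the granularity before pricing, so coarser units overcharge short invocations. A minimal sketch (the per-millisecond price is hypothetical, not an actual Lambda rate):

```python
from math import ceil

def billed_cost(duration_ms: float, granularity_ms: int, price_per_ms: float) -> float:
    """Round the measured duration up to the billing granularity, then price it.

    Simplified model: real providers may add per-request fees and minimums.
    """
    billed_ms = ceil(duration_ms / granularity_ms) * granularity_ms
    return billed_ms * price_per_ms

# A 5 ms invocation bills 250 ms at 2014 PR/FAQ granularity, 100 ms at the
# Nov 2014 launch granularity, and exactly 5 ms at today's 1 ms granularity:
# only at the finest unit does each saved millisecond save money directly.
```

Under this model the same 5 ms of work costs 50x more at 250 ms granularity than at 1 ms, which is why each shrink hands customers a sharper optimisation incentive.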
What it forces the provider to build¶
- Placement compaction. Per-account scheduling that minimises the instance count used for that account's workload; spiky and heterogeneous jobs can be packed without user effort (see concepts/scale-to-zero).
- Cheap multi-tenant isolation. An isolation primitive cheap enough that the provider doesn't lose money running idle containers for sparse workloads. See concepts/micro-vm-isolation.
- Precise metering infrastructure. Per-ms CPU + per-MB memory × duration for potentially billions of invocations per day.
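The metering arithmetic itself is simple; the hard part is performing it reliably for billions of invocations per day. A minimal sketch of per-invocation metering, assuming 1 ms granularity with no minimum (the rates and the flat per-request fee are illustrative, not actual Lambda prices):

```python
def invocation_charge(memory_mb: int, duration_ms: int,
                      price_per_gb_second: float,
                      price_per_request: float = 0.0) -> float:
    """Meter one invocation as memory (GB) x billed duration (s),
    plus an optional flat per-request fee.

    Simplifications: 1 ms granularity, no minimum duration, and no
    free tier or tiered pricing.
    """
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * price_per_gb_second + price_per_request

# Example: a 1024 MB function running 1000 ms consumes exactly 1 GB-second.
```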
Trade-off¶
Very fine granularity shifts the incentive to reduce code-path latency to the customer: "Developers can lower costs and improve performance by minimizing startup and shutdown overhead, ensuring that each time an application is invoked it immediately begins doing useful work." The provider, in exchange, has to keep cold-start latency low enough that this game is winnable.
Seen in¶
- sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years — tenet-level articulation; 250ms → 100ms → 1ms trajectory.