CONCEPT Cited by 1 source
Per-account quotas¶
What it is¶
Most AWS service limits ("service quotas") are enforced per AWS account. In a shared-account architecture there is a single set of quotas to monitor; in an account-per-tenant architecture every tenant account has its own independent quota envelope, so the fleet-level picture is an aggregate over thousands of separately limited accounts.
"AWS service limits are enforced per account. In a shared-account model, you monitor a single set of quotas. In an account-per-tenant setup, quota management becomes distributed and harder to predict. Proactive quota requests and monitoring are essential." (Source: sources/2026-02-25-aws-6000-accounts-three-people-one-platform)
Canonical example: Lambda concurrent-executions quota¶
The Lambda concurrent-executions quota is per-account and shared across all functions in that account. When one tenant is under heavy load:
- In a shared-account model, that tenant could consume the whole quota and starve the rest. This is the classic noisy-neighbour failure mode, but quota observability is a single dashboard.
- In an account-per-tenant model, a heavy tenant only throttles itself (the blast-radius win), but every tenant account is its own throttling envelope, and heavy-tenant accounts may experience unexpected throttling if their account-level quota hasn't been raised proactively.
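The contrast between the two models can be illustrated with a toy calculation. This is a minimal sketch, not a model of how Lambda actually schedules invocations: the tenant names, the demand figures, the quota value of 1000, and the first-come-first-served allocation in the shared pool are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tenant:
    name: str
    demand: int  # concurrent executions the tenant wants right now (hypothetical)


def throttled_shared(tenants: list[Tenant], account_quota: int) -> dict[str, int]:
    """Shared account: all tenants draw from one pool.

    Assumption: capacity is granted in list order, which is a simplification.
    """
    remaining = account_quota
    throttled = {}
    for t in tenants:
        granted = min(t.demand, remaining)
        remaining -= granted
        throttled[t.name] = t.demand - granted
    return throttled


def throttled_per_account(tenants: list[Tenant], per_account_quota: int) -> dict[str, int]:
    """Account per tenant: each tenant is limited only by its own account's quota."""
    return {t.name: max(0, t.demand - per_account_quota) for t in tenants}


tenants = [Tenant("heavy", 1200), Tenant("quiet-a", 50), Tenant("quiet-b", 30)]

shared = throttled_shared(tenants, account_quota=1000)
isolated = throttled_per_account(tenants, per_account_quota=1000)

print(shared)    # {'heavy': 200, 'quiet-a': 50, 'quiet-b': 30}
print(isolated)  # {'heavy': 200, 'quiet-a': 0, 'quiet-b': 0}
```

In the shared pool the quiet tenants are throttled entirely; with one account per tenant only the heavy tenant is throttled, and it stays throttled until its own account's quota is raised.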
"A tenant is under heavier load, it's likely for the corresponding account to experience throttling errors of Lambda functions, which is why it's essential to provide a single pane of glass view to keep track of the quota usage and adapt as necessary." (Source: sources/2026-02-25-aws-6000-accounts-three-people-one-platform)
Operational implications of distributed quotas¶
- Quota monitoring becomes a first-class platform concern. Per-account dashboards are not enough; a cross-account aggregated view is required to find the tenant accounts that are approaching limits before they throttle.
- Proactive quota-raise automation. Raises must often be requested per account; platform tooling has to know which accounts are near limits and file increase requests ahead of the demand curve.
- Trade-off against the blast-radius win. The same account boundary that stops a heavy tenant from starving others also prevents that tenant from borrowing unused quota from quiet neighbours. Fleet-level utilization of the available quota is therefore lower than in a pooled model.
- Service-mix bias. Services whose quotas are slow or hard to raise, or whose adjustable per-account limits sit below a hard service-level ceiling, carry more risk as the account count grows.
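The monitoring and proactive-raise points above can be sketched as a small pure function over per-account samples. This is an illustrative sketch only: the `QuotaSample` shape, the 0.8 threshold, the 2x headroom factor, the `lambda-concurrency` label, and the account IDs are hypothetical, and how the usage and limit figures are collected from each account is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaSample:
    account_id: str
    region: str
    quota_name: str
    usage: float  # peak observed usage in the sampling window
    limit: float  # quota value currently applied to the account


def near_limit(samples: list[QuotaSample], threshold: float = 0.8) -> list[QuotaSample]:
    """Return samples at or above the utilization threshold, worst first."""
    hot = [s for s in samples if s.limit > 0 and s.usage / s.limit >= threshold]
    return sorted(hot, key=lambda s: s.usage / s.limit, reverse=True)


def plan_raises(hot: list[QuotaSample], headroom: float = 2.0) -> list[dict]:
    """Propose a new limit per hot sample: observed peak times a headroom factor."""
    return [
        {
            "account_id": s.account_id,
            "region": s.region,
            "quota_name": s.quota_name,
            "desired_value": max(s.limit, round(s.usage * headroom)),
        }
        for s in hot
    ]


# Hypothetical fleet snapshot; account IDs are placeholders.
samples = [
    QuotaSample("111111111111", "eu-west-1", "lambda-concurrency", 950, 1000),
    QuotaSample("222222222222", "eu-west-1", "lambda-concurrency", 120, 1000),
    QuotaSample("333333333333", "us-east-1", "lambda-concurrency", 810, 1000),
]

hot = near_limit(samples)
plan = plan_raises(hot)

for p in plan:
    print(p["account_id"], p["region"], p["desired_value"])
```

Keeping the ranking and the raise plan separate from data collection means the same logic can feed a dashboard and a request queue; the plan is only a proposal and still has to be submitted per account.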
Related failure modes¶
- Account-creation throttling. AWS has account-creation rate limits; lifecycle automation (patterns/automate-account-lifecycle) has to respect them.
- Cross-region quota independence. Most quotas are both per-account and per-region, so the set of limits to track grows as accounts × regions × quotas.
- Support-case fan-out. Per-account quota raises often require support tickets; thousands of accounts means scripting the support flow or negotiating programmatic-raise access with AWS.
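The size of the problem, and the need to pace bulk operations, can be sketched as follows. The 6,000 accounts come from the cited source; the two regions, the single quota label, and the rate of 10 requests per 60 seconds are hypothetical values chosen for illustration and do not reflect any documented AWS limit.

```python
from itertools import product


def quota_matrix(account_ids, regions, quota_names):
    """Enumerate every (account, region, quota) cell that needs tracking."""
    return list(product(account_ids, regions, quota_names))


def schedule(requests, max_per_window, window_seconds):
    """Assign each request a start offset (seconds) so no window exceeds the cap.

    Pure function: it only plans the pacing, it does not sleep or call any API.
    """
    if max_per_window <= 0:
        raise ValueError("max_per_window must be positive")
    return [
        ((i // max_per_window) * window_seconds, req)
        for i, req in enumerate(requests)
    ]


accounts = [f"account-{n:04d}" for n in range(6000)]  # placeholder names
regions = ["eu-west-1", "us-east-1"]                  # assumed region set
cells = quota_matrix(accounts, regions, ["lambda-concurrency"])
print(len(cells))  # 12000

# Pace the first 25 requests at a hypothetical 10 per 60-second window.
paced = schedule(cells[:25], max_per_window=10, window_seconds=60)
print(paced[0][0], paced[10][0], paced[24][0])  # 0 60 120
```

The same pacing shape applies whether the queued items are quota-increase requests or account-creation calls; only the cap and window would change.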
Seen in¶
- sources/2026-02-25-aws-6000-accounts-three-people-one-platform — ProGlove's 6,000-account fleet; Lambda concurrent-executions quota named as the canonical distributed-quota-management problem, with a "single pane of glass" tracker prescribed as essential.
Related¶
- concepts/account-per-tenant-isolation — the architecture that forces this concept to matter.
- systems/aws-organizations — the fabric across which per-account quotas aggregate.
- systems/aws-lambda — canonical source of per-account quotas in ProGlove's story.