PATTERN Cited by 1 source
Fan-out StackSet deployment
Pattern
A single CI/CD pipeline in a central administrator account triggers one StackSet update operation that fans out the change to hundreds or thousands of target accounts in parallel. The target accounts are selected by OU membership or explicit list; each target receives an identical (or parameter-customised) CloudFormation stack deployment.
Canonical shape
┌─── Infrastructure account ────────────────┐
│  Monorepo (single source of truth)        │
│            ↓                              │
│  CodePipeline                             │
│            ↓                              │
│  CloudFormation StackSets admin           │
│            ↓ (single StackSet update op)  │
└────────────┬──────────────────────────────┘
             │ (parallel fan-out)
    ┌────────┼────────┬──── … ────┐
    ▼        ▼        ▼           ▼
  Tenant   Tenant   Tenant      Tenant
  acct 1   acct 2   acct 3      acct N
"Each pipeline execution updates many target accounts in parallel, with only a single StackSet update operation in a central account." (Source: sources/2026-02-25-aws-6000-accounts-three-people-one-platform)
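The single update operation in the diagram can be sketched as follows. This is a minimal illustration, assuming Python with boto3 and a StackSet that uses service-managed permissions (OU targeting depends on that model). The StackSet name, template URL, OU ID, and Region are hypothetical placeholders, not values from the source.

```python
from typing import Any


def build_update_request(
    stack_set_name: str,
    template_url: str,
    ou_ids: list[str],
    regions: list[str],
    operation_preferences: dict[str, Any],
) -> dict[str, Any]:
    """Build keyword arguments for one CloudFormation UpdateStackSet call."""
    if not ou_ids:
        raise ValueError("at least one target OU is required")
    return {
        "StackSetName": stack_set_name,
        "TemplateURL": template_url,
        # OU targeting assumes the service-managed permission model.
        "DeploymentTargets": {"OrganizationalUnitIds": list(ou_ids)},
        "Regions": list(regions),
        "OperationPreferences": dict(operation_preferences),
    }


def trigger_fan_out(cfn_client: Any, request: dict[str, Any]) -> str:
    """Start the fan-out and return the operation ID for later polling."""
    response = cfn_client.update_stack_set(**request)
    return response["OperationId"]


# Hypothetical values, for illustration only.
EXAMPLE_REQUEST = build_update_request(
    stack_set_name="shared-platform-layer",
    template_url="https://example-bucket.s3.amazonaws.com/template.yaml",
    ou_ids=["ou-abcd-11111111"],
    regions=["eu-west-1"],
    operation_preferences={
        "MaxConcurrentPercentage": 10,
        "FailureTolerancePercentage": 1,
    },
)
```

In a pipeline stage this would be invoked as `trigger_fan_out(boto3.client("cloudformation"), EXAMPLE_REQUEST)`. Passing the client in as an argument keeps the request-building logic testable without AWS credentials.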
Why this shape
- One artefact, one version, one rollout trigger — the only way to enforce a single version of shared libraries / Lambda layers across a thousands-of-accounts fleet without per-account drift.
- Monorepo is load-bearing. The pipeline deploys from one repo precisely so that the source of truth is single-versioned.
- Administrator-account isolation — operational authority lives in the Infrastructure account, not the tenant accounts; tenant-account IAM only needs to trust the StackSet-execution role, not the whole pipeline.
- Parallelism tuning: StackSets operation preferences expose maximum-concurrency and failure-tolerance settings, each as a count or a percentage, that let the platform trade deployment speed against the blast radius of a bad change.
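To make the speed versus blast-radius trade-off concrete, the sketch below converts percentage-based operation preferences into per-Region account counts. It assumes the rounding rules in the CloudFormation documentation (percentages round down, and maximum concurrency is never below one) and the default strict failure-tolerance mode, where concurrency starts at no more than failure tolerance plus one. The function name, the dataclass, and the two presets are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EffectiveLimits:
    max_concurrent: int       # accounts updated at once, per Region
    failure_tolerance: int    # failed accounts tolerated, per Region
    initial_concurrency: int  # starting concurrency in strict mode


def effective_limits(
    accounts_per_region: int,
    max_concurrent_pct: int,
    failure_tolerance_pct: int,
) -> EffectiveLimits:
    """Translate percentage preferences into account counts."""
    if accounts_per_region < 1:
        raise ValueError("accounts_per_region must be at least 1")
    if not 1 <= max_concurrent_pct <= 100:
        raise ValueError("max_concurrent_pct must be in 1..100")
    if not 0 <= failure_tolerance_pct <= 100:
        raise ValueError("failure_tolerance_pct must be in 0..100")
    # Integer division rounds down; concurrency is floored at one account.
    max_concurrent = max(1, accounts_per_region * max_concurrent_pct // 100)
    failure_tolerance = accounts_per_region * failure_tolerance_pct // 100
    # Assumption: strict mode caps starting concurrency at tolerance + 1.
    initial = min(max_concurrent, failure_tolerance + 1)
    return EffectiveLimits(max_concurrent, failure_tolerance, initial)


# Hypothetical presets; the keys are real OperationPreferences fields.
CAUTIOUS_PREFERENCES = {
    "MaxConcurrentPercentage": 5,
    "FailureTolerancePercentage": 0,
    "RegionConcurrencyType": "SEQUENTIAL",
}
FAST_PREFERENCES = {
    "MaxConcurrentPercentage": 10,
    "FailureTolerancePercentage": 1,
    "RegionConcurrencyType": "PARALLEL",
}
```

For a 6,000-account fleet, `effective_limits(6000, 10, 1)` gives a maximum concurrency of 600, a failure tolerance of 60, and a starting concurrency of 61. Under these assumptions a failure tolerance of zero serialises the rollout regardless of the concurrency setting, which is worth knowing before choosing a cautious preset for a long-running fleet-wide change.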
Named failure modes
- Partial rollouts. "If one account fails to deploy, rollback or retry strategies need to be defined and tested." Partial-rollout handling is not automatic; the platform chooses fail-fast vs continue-on-failure and must own the recovery path.
- Pipeline duration. "Large-scale updates can take significant time to propagate." Fleet-wide changes are long-duration operations that have to be planned around maintenance windows and surrounding ops activity.
- Tooling maturity. StackSets is "powerful but still evolving, and operational edge cases are possible" — bleeding-edge territory at SaaS-tenant scale, though mature at enterprise scale.
(Source for all three quotes: sources/2026-02-25-aws-6000-accounts-three-people-one-platform)
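Owning the recovery path starts with knowing which targets failed. The sketch below polls a long-running StackSet operation with a timeout, then collects the failed or cancelled account and Region pairs as input to whatever retry or rollback strategy the platform defines. It assumes a boto3-style CloudFormation client and an operation ID as returned by `update_stack_set`; the function names, poll interval, and timeout are hypothetical, and a delegated-administrator setup would additionally pass `CallAs="DELEGATED_ADMIN"`.

```python
import time
from typing import Any, Callable

TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED"}


def wait_for_operation(
    cfn_client: Any,
    stack_set_name: str,
    operation_id: str,
    poll_seconds: float = 30.0,
    timeout_seconds: float = 6 * 3600.0,
    sleep: Callable[[float], Any] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the operation reaches a terminal status, then return it."""
    deadline = clock() + timeout_seconds
    while True:
        response = cfn_client.describe_stack_set_operation(
            StackSetName=stack_set_name, OperationId=operation_id
        )
        status = response["StackSetOperation"]["Status"]
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"operation {operation_id} still {status}")
        sleep(poll_seconds)


def failed_targets(
    cfn_client: Any, stack_set_name: str, operation_id: str
) -> list[tuple[str, str, str]]:
    """Return (account, region, reason) for each failed or cancelled target."""
    failed: list[tuple[str, str, str]] = []
    kwargs = {"StackSetName": stack_set_name, "OperationId": operation_id}
    while True:
        page = cfn_client.list_stack_set_operation_results(**kwargs)
        for item in page.get("Summaries", []):
            if item.get("Status") in ("FAILED", "CANCELLED"):
                reason = item.get("StatusReason", "")
                failed.append((item["Account"], item["Region"], reason))
        token = page.get("NextToken")
        if not token:
            return failed
        kwargs["NextToken"] = token  # follow pagination to the last page
```

The `sleep` and `clock` parameters exist so the recovery path can be exercised in tests without waiting on a real fleet-wide rollout.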
Mitigations
- Staged rollouts — use OU grouping to deploy to canary OUs first, widening fan-out after dwell time.
- Per-stage health checks — couple with patterns/central-telemetry-aggregation so the pipeline can read fleet-wide signals and abort on regression.
- Failure-tolerance + retry knobs — tune StackSet operation preferences to the blast-radius appetite for each change type.
- Test recovery paths regularly — "defined and tested" is load-bearing; an un-exercised rollback path is not a rollback path.
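The staged-rollout and health-check mitigations above can be combined into a small orchestration loop. This is a sketch under stated assumptions: `Wave`, `RolloutAborted`, and `staged_rollout` are hypothetical names, and the `deploy` and `fleet_healthy` callables stand in for the platform's own StackSet update and telemetry check.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass(frozen=True)
class Wave:
    name: str
    ou_ids: tuple[str, ...]
    dwell_seconds: float  # time to observe the wave before widening


class RolloutAborted(Exception):
    """Raised when a wave fails to deploy or regresses fleet health."""

    def __init__(self, wave_name: str, reason: str, completed: list[str]):
        super().__init__(f"aborted at wave {wave_name!r}: {reason}")
        self.wave_name = wave_name
        self.reason = reason
        self.completed = completed


def staged_rollout(
    waves: Sequence[Wave],
    deploy: Callable[[Wave], bool],
    fleet_healthy: Callable[[Wave], bool],
    sleep: Callable[[float], Any] = time.sleep,
) -> list[str]:
    """Deploy wave by wave; stop widening on the first failure or regression."""
    completed: list[str] = []
    for wave in waves:
        if not deploy(wave):
            raise RolloutAborted(wave.name, "deployment failed", completed)
        sleep(wave.dwell_seconds)  # dwell before judging the wave
        if not fleet_healthy(wave):
            raise RolloutAborted(wave.name, "health regression", completed)
        completed.append(wave.name)
    return completed
```

A canary-first plan might be expressed as `[Wave("canary", ("ou-canary",), 1800), Wave("fleet", ("ou-prod",), 0)]`, with hypothetical OU IDs. The exception carries the list of completed waves so the recovery path knows exactly how far the change propagated.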
Alternatives this pattern beats
- N pipelines, one per account. Linear-in-accounts overhead; drift risk; doesn't scale at thousands of accounts.
- Per-service-owner deploys. Makes every service owner reason about multi-account topology; contradicts patterns/platform-engineering-investment.
- Non-CloudFormation IaC with custom fan-out. Viable (Terraform + per-workspace backend, Pulumi + micro-stacks) but re-implements StackSets; appropriate only if the IaC is already non-CFN.
What this pattern doesn't cover
- Account creation — use patterns/automate-account-lifecycle (Step Functions) instead.
- Non-CloudFormation changes — data migrations, app-level config changes, runtime feature-flag flips — need their own fan-out tier.
- Per-account customization beyond parameters — parameter overrides are the extent of per-tenant tailoring available without splitting into multiple StackSets.
Seen in
- sources/2026-02-25-aws-6000-accounts-three-people-one-platform — canonical production instance at ProGlove (~6,000 accounts).
Related
- systems/aws-stacksets, systems/aws-codepipeline, systems/aws-cloudformation.
- concepts/account-per-tenant-isolation — the architecture that makes this pattern load-bearing.
- patterns/automate-account-lifecycle, patterns/central-telemetry-aggregation, patterns/platform-engineering-investment — the companion patterns in an account-per-tenant platform.
- patterns/staged-rollout — canary → wider rollouts; complementary.