AWS CloudFormation StackSets

What it is

AWS CloudFormation StackSets is the CloudFormation feature that deploys a single CloudFormation template as parallel stacks into many target AWS accounts and Regions at once, from a central "administrator" account. One StackSet update operation fans out to every target account without the operator creating N separate CloudFormation stacks manually.

StackSets is the standard mechanism for fleet-wide infrastructure changes in an AWS Organizations-scale platform — and is load-bearing for account-per-tenant SaaS at ProGlove's scale (~6,000 tenant accounts).

Shape

  • Administrator account — the account that owns the StackSet definition and orchestrates deployments.
  • Target accounts (stack instances) — each stack instance is one account/Region pair in which the template is instantiated. Targets can be given as an explicit account list or, with AWS Organizations integration, as OUs or the whole Organization.
  • StackSet template — standard CloudFormation template; may accept per-target parameter overrides for tenant-specific values.
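  • Operations — create, update, delete, and drift detection. Each operation takes tunable preferences for maximum concurrency (MaxConcurrentCount / MaxConcurrentPercentage) and failure tolerance (FailureToleranceCount / FailureTolerancePercentage), which trade parallelism against blast radius.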
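
To make the parallelism versus blast-radius trade-off concrete, here is a minimal Python sketch that builds the OperationPreferences structure accepted by StackSet operations and gives a rough count of deployment waves per Region. The function names and the wave model are illustrative assumptions, not part of the source; the estimate assumes percentages round down with a floor of one account, and it ignores how CloudFormation actually schedules work.

```python
import math


def operation_preferences(max_concurrent_pct: int, failure_tolerance_pct: int) -> dict:
    """Build an OperationPreferences dict for a StackSet operation.

    Higher concurrency finishes sooner; lower failure tolerance stops
    the operation earlier when accounts start failing.
    """
    if not 1 <= max_concurrent_pct <= 100:
        raise ValueError("max_concurrent_pct must be between 1 and 100")
    if not 0 <= failure_tolerance_pct <= 100:
        raise ValueError("failure_tolerance_pct must be between 0 and 100")
    return {
        "RegionConcurrencyType": "PARALLEL",
        "MaxConcurrentPercentage": max_concurrent_pct,
        "FailureTolerancePercentage": failure_tolerance_pct,
    }


def estimated_waves(target_count: int, max_concurrent_pct: int) -> int:
    """Rough number of sequential batches needed within one Region.

    Simplified model (an assumption): accounts are processed in equal
    batches of floor(target_count * pct / 100), never fewer than one.
    """
    per_wave = max(1, target_count * max_concurrent_pct // 100)
    return math.ceil(target_count / per_wave)


if __name__ == "__main__":
    print(operation_preferences(10, 5))
    print(estimated_waves(6000, 10))
```

Under this simplified model, 6,000 accounts at 10% concurrency are processed 600 at a time, so a fleet-wide change takes about ten batches per Region.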

ProGlove's deployment topology

"Each pipeline execution updates many target accounts in parallel, with only a single StackSet update operation in a central account." (Source: sources/2026-02-25-aws-6000-accounts-three-people-one-platform)

The architecture:

  1. Monorepo holds all services (enforces single version of shared libs / Lambda layers across the fleet).
  2. Push to the deploy branch triggers CodePipeline in the central Infrastructure account.
  3. CodePipeline runs a StackSet update operation.
  4. StackSet fans out to every target tenant account in parallel.

This is the canonical instance of patterns/fan-out-stackset-deployment in the wiki.
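
Step 3 of the architecture above could look roughly like the following Python sketch. It assumes a client object that behaves like the boto3 CloudFormation client, whose update_stack_set call returns an OperationId. The function name, the stack set name, the template URL, and the RecordingClient stand-in are all hypothetical and exist only so the sketch runs without AWS credentials; the source does not describe ProGlove's actual pipeline code.

```python
class RecordingClient:
    """Hypothetical stand-in for a boto3 CloudFormation client (dry runs only)."""

    def __init__(self):
        self.calls = []

    def update_stack_set(self, **kwargs):
        # Record the request instead of calling AWS.
        self.calls.append(kwargs)
        return {"OperationId": "dry-run-operation"}


def start_fleet_update(cfn, stack_set_name: str, template_url: str, preferences: dict) -> str:
    """Start one StackSet update operation and return its operation ID.

    No accounts or Regions are passed, so the update is intended to
    apply to every existing stack instance in the StackSet.
    """
    response = cfn.update_stack_set(
        StackSetName=stack_set_name,
        TemplateURL=template_url,
        OperationPreferences=preferences,
    )
    return response["OperationId"]


if __name__ == "__main__":
    client = RecordingClient()
    operation_id = start_fleet_update(
        client,
        "tenant-baseline",  # hypothetical StackSet name
        "https://example.com/tenant-baseline.yaml",  # hypothetical template URL
        {"MaxConcurrentPercentage": 10, "FailureTolerancePercentage": 5},
    )
    print(operation_id)
```

In a real pipeline the client would come from boto3.client("cloudformation") in the central account, and the returned operation ID is what later status checks would be keyed on.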

Named failure modes

"While this provides the necessary scale, it also introduces new failure modes" (Source: sources/2026-02-25-aws-6000-accounts-three-people-one-platform):

  • Partial rollouts. If one account fails to deploy, rollback or retry strategies need to be defined and tested. Partial-rollout handling is not automatic: the platform must choose between fail-fast and continue-on-failure, and must test the recovery path.
  • Pipeline duration. Large-scale updates can take significant time to propagate. At 6,000 accounts, a single fleet-wide change is a long-duration operation the ops team has to plan around.
  • Tooling maturity. StackSets is described as "powerful but still evolving, and operational edge cases are possible" — the model works, but the SaaS-tenant-level application of StackSets is still bleeding-edge relative to the enterprise-level application it was built for.
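
One way to approach the partial-rollout case is to derive a retry set from the per-account results of a finished operation. The sketch below assumes input shaped like the Summaries returned by the CloudFormation ListStackSetOperationResults API (Account, Region, Status). The function name and the choice to treat CANCELLED like FAILED are assumptions for illustration, not a strategy described in the source.

```python
def accounts_to_retry(result_summaries: list) -> dict:
    """Map each account that did not succeed to the Regions needing a retry.

    Treats both FAILED and CANCELLED results as retry candidates
    (an assumption; a platform may want to handle them differently).
    """
    pending_retry = {}
    for summary in result_summaries:
        if summary["Status"] in ("FAILED", "CANCELLED"):
            pending_retry.setdefault(summary["Account"], []).append(summary["Region"])
    # Sort for stable, reviewable output.
    return {account: sorted(regions) for account, regions in sorted(pending_retry.items())}


if __name__ == "__main__":
    sample = [
        {"Account": "111111111111", "Region": "eu-west-1", "Status": "SUCCEEDED"},
        {"Account": "222222222222", "Region": "us-east-1", "Status": "FAILED"},
        {"Account": "222222222222", "Region": "eu-west-1", "Status": "CANCELLED"},
    ]
    print(accounts_to_retry(sample))
```

The resulting mapping could then scope a follow-up operation to just those accounts, which is the recovery path that needs to be tested before it is needed.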

What it doesn't solve on its own

  • Partial-rollout observability at fleet scale — needs the central telemetry tier to see which accounts deployed successfully and which didn't.
  • Per-account customization beyond parameter overrides — StackSet instances are mostly homogeneous.
  • Account creation itself — use Step Functions + Organizations APIs (patterns/automate-account-lifecycle).
  • Anything CloudFormation cannot model. StackSets only deploys CloudFormation resources; work that lives outside CloudFormation (data migrations, app-level config) needs its own fan-out mechanism.
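
For the observability gap, a central telemetry job might reduce stack instance state to a fleet-level summary. The sketch below assumes input shaped like the Summaries returned by the CloudFormation ListStackInstances API, where Status is CURRENT, OUTDATED, or INOPERABLE. The function name and output shape are hypothetical, and the source does not say how ProGlove's telemetry tier does this.

```python
from collections import Counter


def fleet_status(instance_summaries: list) -> tuple:
    """Return (count per status, sorted accounts with any non-CURRENT instance)."""
    counts = Counter(summary["Status"] for summary in instance_summaries)
    stale_accounts = sorted(
        {s["Account"] for s in instance_summaries if s["Status"] != "CURRENT"}
    )
    return dict(counts), stale_accounts


if __name__ == "__main__":
    sample = [
        {"Account": "111111111111", "Region": "eu-west-1", "Status": "CURRENT"},
        {"Account": "222222222222", "Region": "eu-west-1", "Status": "OUTDATED"},
        {"Account": "333333333333", "Region": "eu-west-1", "Status": "INOPERABLE"},
    ]
    counts, stale = fleet_status(sample)
    print(counts, stale)
```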

Seen in

  • sources/2026-02-25-aws-6000-accounts-three-people-one-platform
