
Configuration-driven tenant onboarding¶

Pattern¶

Treat new-tenant onboarding as a configuration change, not an infrastructure-provisioning exercise. All infrastructure the new tenant depends on, namely the VPC, subnets, load balancer, IAM roles, PrivateLink endpoints, and downstream-service connections, is pre-wired at tier creation (see concepts/pre-integration-at-tier-creation). Onboarding reduces to: register a listener rule, create a target group, create a dedicated ECS cluster, deploy tenant configuration, validate.

Canonicalised on the wiki by the 2026-05-12 AWS Architecture Blog post (Source: sources/2026-05-12-aws-building-hybrid-multi-tenant-architecture-for-stateful-services). Verbatim:

"Configuration-driven onboarding: New tenant onboarding became a configuration change rather than an infrastructure provisioning exercise, dramatically reducing time and manual effort."

Before and after (from canonical source)¶

Onboarding phase	Before (account-per-tenant)	After (configuration-driven)
AWS account provisioning	~2 weeks	N/A (shared account)
VPC + networking	~3 weeks	N/A (inherited from infra group)
IAM role configuration	~1 week	N/A (tier-level shared role)
Downstream integration	~2 weeks	N/A (tier-level PrivateLink)
Product configuration + test	included	~7 days
Total	~52 days	~7 days (−86%)

The 80% engineering-effort reduction per onboarding is attributed to the removal of the first four phases, not to a speedup of the fifth.
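The headline percentage can be checked directly from the totals in the table. The sketch below uses only the figures quoted above; the 7-days-per-week conversion of the per-phase estimates is an assumption made for illustration.

```python
def percent_reduction(before_days: float, after_days: float) -> float:
    """Relative reduction from before to after, expressed as a percentage."""
    return (before_days - after_days) / before_days * 100


# Totals from the table: ~52 days before, ~7 days after.
onboarding_reduction = percent_reduction(52, 7)
print(f"{onboarding_reduction:.1f}%")

# The four removed phases (2 + 3 + 1 + 2 weeks), converted to days.
# Assumption: 7 days per week; the table does not say calendar or working days.
removed_phase_days = sum(weeks * 7 for weeks in (2, 3, 1, 2))
print(removed_phase_days)
```

The per-phase figures are approximations ("~") and sum to about 56 days under this conversion, slightly above the quoted ~52-day total; this page does not reconcile the two numbers.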

The new-tenant checklist¶

After this pattern, onboarding a new tenant is:

1. Add ALB listener rule routing the tenant's path (or header) to a new target group. See patterns/alb-path-routing-per-tenant.
2. Create target group for the tenant's backend.
3. Create dedicated ECS cluster for the tenant. See patterns/dedicated-ecs-cluster-per-tenant.
4. Register ECS service in the target group.
5. Deploy tenant configuration: task-definition env vars with TENANT_ID, cache endpoint, resource sizing.
6. Validate: integration test against downstream services, smoke test the tenant path, confirm memory / latency baselines.

Steps 1–4 are ~3–5 AWS API calls. Step 5 is an ECS task deployment. Step 6 is the residual 7-day cost: "primarily testing and validation, because infrastructure is pre-provisioned."
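As a rough illustration of how small steps 1–5 are, the sketch below builds the per-tenant calls as plain data, shaped like keyword arguments for the boto3 `elbv2` and `ecs` clients, without contacting AWS. The naming scheme, port, health-check path, memory size, and task count are hypothetical placeholders, and only a subset of the parameters a real deployment needs is shown. The sketch orders the calls by API dependency rather than by checklist position: a forward rule needs the target group's ARN, and an ECS service normally references an already registered task definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedCall:
    service: str    # boto3 client name ("elbv2" or "ecs")
    operation: str  # client method to invoke
    params: dict    # keyword arguments for that method


def onboarding_plan(tenant_id: str, listener_arn: str, vpc_id: str,
                    priority: int, image: str) -> list[PlannedCall]:
    """Return the per-tenant calls in an order that satisfies their dependencies."""
    name = f"tenant-{tenant_id}"  # hypothetical naming scheme
    # Placeholder: the real ARN comes back from create_target_group at run time.
    tg_arn = "<TargetGroupArn returned by create_target_group>"
    return [
        PlannedCall("elbv2", "create_target_group", {
            "Name": name, "Protocol": "HTTP", "Port": 8080, "VpcId": vpc_id,
            "TargetType": "ip", "HealthCheckPath": "/health",
        }),
        PlannedCall("elbv2", "create_rule", {
            "ListenerArn": listener_arn, "Priority": priority,
            "Conditions": [{"Field": "path-pattern", "Values": [f"/{tenant_id}/*"]}],
            "Actions": [{"Type": "forward", "TargetGroupArn": tg_arn}],
        }),
        PlannedCall("ecs", "create_cluster", {"clusterName": name}),
        PlannedCall("ecs", "register_task_definition", {
            "family": name,
            "containerDefinitions": [{
                "name": "app", "image": image, "memory": 2048,
                "portMappings": [{"containerPort": 8080}],
                "environment": [{"name": "TENANT_ID", "value": tenant_id}],
            }],
        }),
        PlannedCall("ecs", "create_service", {
            "cluster": name, "serviceName": name, "taskDefinition": name,
            "desiredCount": 2,
            "loadBalancers": [{"targetGroupArn": tg_arn, "containerName": "app",
                               "containerPort": 8080}],
        }),
    ]


plan = onboarding_plan("acme", "arn:aws:elasticloadbalancing:example-listener",
                       "vpc-0example", priority=10, image="example/app:1.0")
for call in plan:
    print(f"{call.service}.{call.operation}")
```

Keeping the plan as data makes it easy to review or diff before anything is applied; an executor would substitute the real target-group ARN and dispatch each entry to the matching client method.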

Why this is hard without pre-integration¶

Without the concepts/pre-integration-at-tier-creation lever, every onboarding triggers network-engineering work: VPC peering, PrivateLink setup, IAM role creation, cross-account trust relationships. This work is slow because:

Multiple teams are involved (network, security, downstream-service team, requesting team).
Approvals are gated on security review.
Integration testing requires the downstream-service owner's participation.
Documentation and operational handoff add days.

None of this work is tenant-specific; it's infrastructure provisioning that could be done once. The pre-integration pattern does exactly that, unlocking configuration-driven onboarding.

Composition with adjacent patterns¶

patterns/shared-privatelink-at-tier-level — shares downstream-service connectivity across tenants.
patterns/dedicated-ecs-cluster-per-tenant — the compute backend spun up per tenant.
patterns/alb-path-routing-per-tenant — the routing layer that gets a new rule per tenant.
patterns/hybrid-multi-tenant-architecture — the enclosing architectural shape.

All four patterns together deliver the onboarding-speedup property; omitting any one of them re-introduces per-tenant infrastructure work.

Configuration artifacts¶

The tenant's onboarding configuration is small:

ECS task definition JSON (env vars, image, resources)
ALB listener rule (path / header pattern, priority)
Target group (name, protocol, port, health-check path)
ECS service definition (cluster, task count, autoscaling triggers)
CloudWatch alarms (memory 70/85, latency 2× baseline, 5XX rate)

All of these can be expressed as CloudFormation, CDK, or Terraform. No application-layer code changes are needed: the application reads TENANT_ID from its environment and behaves accordingly.
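To make the size of this configuration concrete, the sketch below models one tenant's inputs as a single record and shows the application side reading `TENANT_ID` from its environment. Apart from `TENANT_ID`, every field and variable name here (including `CACHE_ENDPOINT`) is a hypothetical choice for illustration. Reading "memory 70/85" as warning and critical utilisation percentages is an interpretation, and no 5XX threshold is modelled because the list above does not give one.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantConfig:
    tenant_id: str
    image: str
    memory_mib: int
    cache_endpoint: str
    path_pattern: str
    rule_priority: int
    health_check_path: str
    desired_count: int
    latency_baseline_ms: float

    def task_environment(self) -> list[dict]:
        """Env vars in the name/value shape used by ECS container definitions."""
        return [
            {"name": "TENANT_ID", "value": self.tenant_id},
            # Hypothetical variable name; the source only says "cache endpoint".
            {"name": "CACHE_ENDPOINT", "value": self.cache_endpoint},
        ]

    def alarm_thresholds(self) -> dict:
        """Thresholds derived from the alarm list: memory 70/85, latency 2x baseline."""
        return {
            "memory_warning_pct": 70,
            "memory_critical_pct": 85,
            "latency_ms": 2 * self.latency_baseline_ms,
        }


def load_tenant_id(environ=os.environ) -> str:
    """Application side: fail fast if the task was deployed without a tenant identity."""
    tenant_id = environ.get("TENANT_ID", "").strip()
    if not tenant_id:
        raise RuntimeError("TENANT_ID is not set")
    return tenant_id


acme = TenantConfig(
    tenant_id="acme", image="example/app:1.0", memory_mib=2048,
    cache_endpoint="cache.example.internal:6379", path_pattern="/acme/*",
    rule_priority=10, health_check_path="/health", desired_count=2,
    latency_baseline_ms=120.0,
)
# Simulate the container environment the task definition would produce.
container_env = {e["name"]: e["value"] for e in acme.task_environment()}
print(load_tenant_id(container_env))
print(acme.alarm_thresholds()["latency_ms"])
```

Passing the environment in as a parameter keeps the reader testable without touching the real process environment.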

When to use¶

Multi-tenant SaaS with tens to thousands of tenants where per-tenant onboarding cost is a business constraint.
Tenants with shared downstream-service topology — heterogeneous dependencies defeat the shared-endpoint property.
Stable, well-understood tier profiles — the tier has to be designed for all likely tenants before any tenant is onboarded.

When not to use¶

Customers require bespoke infrastructure: per-tenant downstream service integrations, per-tenant networking, per-tenant IAM policies. Each bespoke requirement re-introduces per-tenant provisioning.
Tier definitions are unstable — if the tier itself changes often, the amortisation doesn't compound.
Regulatory requirements force per-tenant boundaries — concepts/account-per-tenant-isolation is mandated.
Very small numbers of tenants (<10) — account-per-tenant onboarding cost amortises acceptably.

Anti-patterns¶

Calling onboarding "configuration-driven" while still running per-tenant provisioning scripts. The name is misleading if infrastructure is still being created per tenant.
Tenant-specific tier customisation via feature flags. Feature flags inside the tenant's task are fine; feature-flag-driven infrastructure (per-tenant target group count, per-tenant subnet sets) re-couples onboarding to infrastructure.
Per-tenant IAM roles disguised as configuration. Creating a new IAM role per tenant is still an IAM operation, not a configuration change.
Shipping onboarding as a manual runbook rather than automation. Even the remaining 7 days can be improved, but improvements to a manual runbook don't compound across onboardings the way automation does.
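The first and third anti-patterns can be caught mechanically. The sketch below is one possible guard, assuming tenant onboarding is expressed as a CloudFormation-style template: it flags any resource whose type falls outside an allow-list built from the configuration artifacts listed earlier. The allow-list contents and the function name are illustrative assumptions, not something the source prescribes.

```python
# Resource types a per-tenant change is allowed to contain (assumed allow-list).
ALLOWED_TYPES = {
    "AWS::ElasticLoadBalancingV2::ListenerRule",
    "AWS::ElasticLoadBalancingV2::TargetGroup",
    "AWS::ECS::Cluster",
    "AWS::ECS::TaskDefinition",
    "AWS::ECS::Service",
    "AWS::ApplicationAutoScaling::ScalableTarget",
    "AWS::ApplicationAutoScaling::ScalingPolicy",
    "AWS::CloudWatch::Alarm",
}


def infrastructure_leaks(template: dict) -> list[str]:
    """Return logical IDs of resources that are not plain tenant configuration."""
    resources = template.get("Resources", {})
    return sorted(
        logical_id
        for logical_id, resource in resources.items()
        if resource.get("Type") not in ALLOWED_TYPES
    )


tenant_template = {
    "Resources": {
        "TenantRule": {"Type": "AWS::ElasticLoadBalancingV2::ListenerRule"},
        "TenantService": {"Type": "AWS::ECS::Service"},
        "TenantRole": {"Type": "AWS::IAM::Role"},         # per-tenant IAM role
        "TenantEndpoint": {"Type": "AWS::EC2::VPCEndpoint"},  # per-tenant networking
    }
}
leaks = infrastructure_leaks(tenant_template)
print(leaks)
```

A non-empty result suggests the change is infrastructure provisioning presented as configuration and belongs at tier level instead.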

Measured outcomes (AWS canonical)¶

From the post:

Tenant onboarding time: 52 days → 7 days (−86%)
Infrastructure setup steps per tenant: −80%
Engineering effort per onboarding: −80%
Feature release time: 2–3 days → 1 day
Tenant capacity: up to 100 tenants per AWS account

The feature release time reduction is a secondary dividend: once onboarding is configuration-driven, product-configuration changes flow through the same pipeline, enabling 1-day releases.

Caveats¶

The 7-day residual is "testing and validation." The post doesn't quantify how much is mandatory (customer testing, SLA compliance) vs improvable (automated smoke tests, pre-warmed caches).
Not all onboarding cost is engineering cost. Commercial (contract, legal, pricing), security review, and customer-side integration effort aren't in the 52 / 7 day numbers but still bound customer time-to-value.
Tier-creation cost is not amortised into the onboarding number. Tier creation itself can take weeks, but it happens once per tier, not per tenant.
Configuration mistakes can cause outages, for example listener-rule priority collisions, target-group misconfiguration, or ECS task-definition typos. The blast radius depends on how much of the configuration pipeline is automated vs manual.
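Listener-rule priority collisions in particular lend themselves to a check on the proposed configuration before anything is applied. The sketch below assumes the per-tenant priorities are available as a simple mapping (a hypothetical representation); the 1 to 50000 range is the priority range ALB accepts for non-default rules.

```python
from collections import defaultdict

ALB_MAX_PRIORITY = 50000  # non-default ALB rule priorities range from 1 to 50000


def priority_collisions(rules: dict[str, int]) -> dict[int, list[str]]:
    """Map each priority claimed by more than one tenant to those tenants."""
    by_priority = defaultdict(list)
    for tenant, priority in rules.items():
        by_priority[priority].append(tenant)
    return {p: sorted(t) for p, t in by_priority.items() if len(t) > 1}


def next_free_priority(used: set[int]) -> int:
    """Return the lowest unused priority, or raise if the range is exhausted."""
    for candidate in range(1, ALB_MAX_PRIORITY + 1):
        if candidate not in used:
            return candidate
    raise ValueError("no free listener-rule priority left")


proposed = {"acme": 10, "globex": 20, "initech": 10}
collisions = priority_collisions(proposed)
print(collisions)
print(next_free_priority({1, 2, 10, 20}))
```

Allocating priorities through a function like `next_free_priority` rather than by hand removes one class of typo from the onboarding change.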

Seen in¶

sources/2026-05-12-aws-building-hybrid-multi-tenant-architecture-for-stateful-services — canonical wiki anchor. AWS ad-serving platform's migration to configuration-driven tenant onboarding. 52d → 7d onboarding reduction explicitly attributed to pre-integration; "a configuration change rather than an infrastructure provisioning exercise" verbatim; 80% engineering-effort reduction disclosed.

concepts/pre-integration-at-tier-creation — the enabling lever
concepts/tenant-onboarding-time — the metric this pattern optimises
concepts/hybrid-multi-tenant-architecture — the enclosing shape
patterns/hybrid-multi-tenant-architecture
patterns/shared-privatelink-at-tier-level — the dependency-sharing mechanism
patterns/alb-path-routing-per-tenant — the per-tenant routing rule
patterns/dedicated-ecs-cluster-per-tenant — the per-tenant compute backend