Slack Shipyard¶
What it is¶
Shipyard is the brand-new EC2 ecosystem Slack is building as a successor to the legacy Chef-based EC2 platform, designed specifically for teams that can't yet move to the container-based platform Bedrock. It was announced in the 2025-10-23 phase-2 Chef post.
This wiki stub exists to anchor Shipyard as the named platform-engineering roadmap item; the actual Shipyard post has not yet been published (and therefore not ingested).
Named design goals¶
From the phase-2 Chef post (verbatim):
"Shipyard isn't just an iteration of our old system — it's a complete reimagining of how EC2-based services should work. It introduces concepts like service-level deployments, metric-driven rollouts, and fully automated rollbacks when things go wrong."
Three load-bearing goals:
- Service-level deployments — per-service isolation. The post explicitly names this as the architectural ceiling of the legacy Chef-based platform: "In theory, we could create a dedicated set of Chef environments for each service and promote artifacts individually — but with the hundreds of services we operate at Slack, this quickly becomes unmanageable at scale." Shipyard is built to lift that ceiling.
- Metric-driven rollouts — Shipyard rollouts will be gated on production metrics, mirroring the discipline already established in ReleaseBot on Webapp backend and extended in the Deploy Safety Program.
- Fully automated rollbacks — automatic rollback on metric-regression detection, consistent with the Deploy Safety Program's 10-minute auto-remediate North Star (see patterns/automated-detect-remediate-within-10-minutes).
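The second and third goals describe a general control loop that can be sketched without knowing anything about Shipyard's internals. The following is a minimal, generic illustration of a metric-gated staged rollout with automatic rollback, not Shipyard's implementation (which is undisclosed). Every name here (`run_gated_rollout`, `RolloutResult`, the `deploy` / `error_rate` / `rollback` callbacks, and the 1% threshold) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RolloutResult:
    """Outcome of one rollout attempt (hypothetical structure)."""
    succeeded: bool
    stages_completed: List[int] = field(default_factory=list)
    rolled_back: bool = False


def run_gated_rollout(
    stages: List[int],
    deploy: Callable[[int], None],
    error_rate: Callable[[], float],
    rollback: Callable[[], None],
    max_error_rate: float = 0.01,  # assumed threshold, for illustration only
) -> RolloutResult:
    """Advance through percentage stages, checking a metric after each one.

    If the metric exceeds the threshold after a stage, roll back
    automatically and stop; otherwise continue to the next stage.
    """
    result = RolloutResult(succeeded=False)
    for percent in stages:
        deploy(percent)
        if error_rate() > max_error_rate:
            # Metric regression detected: remediate without human input.
            rollback()
            result.rolled_back = True
            return result
        result.stages_completed.append(percent)
    result.succeeded = True
    return result
```

A caller would supply its own deploy, metric, and rollback hooks, for example `run_gated_rollout([1, 10, 50, 100], deploy_fn, read_error_rate, rollback_fn)`. A real system would also need a bake time between stages and a deadline on remediation, which this sketch omits.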
Rollout plan (as of 2025-10-23)¶
- Soft launch: Q4 2025 ("this quarter" at post's publish date 2025-10-23).
- The first two teams are to be onboarded during the soft launch for testing and feedback.
- The dedicated Shipyard blog post is promised by the same author: "I'm excited to share more about Shipyard in my next blog post — stay tuned!"
Relationship to existing Slack compute stack¶
              ┌─────────────────────────────┐
              │ Slack Deploy Safety Program │
              │ (10m auto / 20m manual SLOs)│
              └──────────────┬──────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
          ▼                  ▼                  ▼
    ┌───────────┐       ┌──────────┐       ┌──────────┐
    │ Slack     │       │ Shipyard │       │ legacy   │
    │ Bedrock   │       │ (new EC2)│       │ Chef EC2 │
    │ (K8s PaaS)│       │ upcoming │       │ feat-    │
    │ preferred │       │ EC2-only │       │ complete │
    └───────────┘       └──────────┘       └──────────┘
- Bedrock remains the preferred target for containerisable workloads.
- Shipyard is the "escape valve" for EC2-only workloads (hardware-tied, IP-pinned, deep-init, non-containerisable).
- Legacy Chef-based EC2 is feature-complete and in maintenance mode; phase 2 of the Chef work (AZ-bucketed environments + Chef Summoner) is the final architectural investment there.
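The placement rule in the three bullets above can be written down as a small decision function. This is only an illustrative sketch: the `Workload` flags mirror the EC2-only reasons listed above, the assumption that a workload with none of those flags is containerisable is ours, and `shipyard_available` is a hypothetical switch reflecting that Shipyard has not yet launched.

```python
from dataclasses import dataclass
from enum import Enum


class Platform(Enum):
    BEDROCK = "bedrock"
    SHIPYARD = "shipyard"
    LEGACY_CHEF_EC2 = "legacy-chef-ec2"


@dataclass(frozen=True)
class Workload:
    # Hypothetical flags, one per EC2-only reason named on this page.
    hardware_tied: bool = False
    ip_pinned: bool = False
    deep_init: bool = False
    non_containerisable: bool = False

    @property
    def ec2_only(self) -> bool:
        return (
            self.hardware_tied
            or self.ip_pinned
            or self.deep_init
            or self.non_containerisable
        )


def target_platform(workload: Workload, shipyard_available: bool) -> Platform:
    """Pick a platform: Bedrock if containerisable, else an EC2 option."""
    if not workload.ec2_only:
        return Platform.BEDROCK
    if shipyard_available:
        return Platform.SHIPYARD
    # Until Shipyard is available, EC2-only workloads stay on legacy Chef.
    return Platform.LEGACY_CHEF_EC2
```

For instance, `target_platform(Workload(ip_pinned=True), shipyard_available=True)` returns `Platform.SHIPYARD`, while the same workload with `shipyard_available=False` stays on `Platform.LEGACY_CHEF_EC2`.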
Why this page is a stub¶
- Shipyard itself has not yet launched publicly (soft launch Q4 2025 with two pilot teams).
- No dedicated blog post exists yet.
- Internal architecture, mechanism, and implementation are undisclosed.
Open questions¶
- Is Shipyard built on Kubernetes, ECS, Nomad, or a custom substrate? The post does not disclose.
- Does Shipyard wrap Chef for per-node config, or replace Chef entirely with a different config primitive? Undisclosed.
- Is the Shipyard-equivalent of ReleaseBot a per-service orchestrator, or does Shipyard feed the centralised deployment orchestration system? Undisclosed.
- What is the migration path from legacy Chef-based EC2 to Shipyard? Undisclosed.
Caveats¶
- Stub-level. Zero mechanism content; this page exists to anchor the name and roadmap direction.
- Not yet launched. As of 2025-10-23, a soft launch with two pilot teams is planned for Q4 2025; Shipyard is not yet available to all Slack engineering.
- Subject to revision. Anything in the "next blog post" is not yet on the wiki record; this page will need an update when the Shipyard post is published.
Seen in¶
- sources/2025-10-23-slack-advancing-our-chef-infrastructure-safety-without-disruption — named as the successor to the legacy Chef-based EC2 platform; preview-only in this post.