Slack AI¶
Slack AI is the LLM-powered feature suite inside Slack that covers AI-driven channel and thread summaries, Recap (catch-up of activity since last visit), AI Search (high-reasoning, context-aware answers across the workspace), and related generative / extractive surfaces. The product line was built starting in early 2023 to give enterprise customers "security, reliability, and performance our customers expect" (Source: sources/2026-05-28-slack-slack-ai-the-path-to-multi-cloud).
The wiki canonicalises Slack AI as the consumer surface whose serving substrate evolved through four phases:
| Phase | Period | Substrate | Wiki canonical |
|---|---|---|---|
| 1. SageMaker era | early 2023 | AWS SageMaker with escrow VPC for Anthropic | Multi-region; cross-region IAM; On-Demand Capacity Reservations (ODCR) + cron-based scaling |
| 2. Bedrock migration | mid-2024 | Amazon Bedrock, Provisioned Throughput (PT) | Zero-incident migration; model units (MUs) as the capacity unit |
| 3. Bedrock On-Demand + Hybrid | mid-2025 | Bedrock PT + On-Demand (OD) + spillover | patterns/provisioned-throughput-with-on-demand-spillover |
| 4. Multi-cloud | early 2026 | Bedrock + GCP Vertex AI | systems/slack-intelligent-routing-layer + patterns/model-fallback-hierarchy-with-circuit-breaker |
Feature workload shapes (disclosed)¶
The 2026-05-28 article names three workload shapes to motivate the hybrid + multi-cloud routing decisions:
- High-volume, latency-sensitive features — channel summaries and similar surfaces that need a "snappy" feel. These were kept on Provisioned Throughput in Phase 3 specifically to guarantee consistent latency.
- Asynchronous, bursty workloads — nightly Recaps are the canonical example, used to motivate the move to On-Demand (OD) capacity. Recap-class features can show 10× variance between peak and off-peak hours, the figure the article cites to justify OD over Provisioned Throughput (PT) sized for peak.
- High-reasoning features — AI Search is the canonical example, named in connection with the ~10% quality lift that Phase 4 multi-cloud enabled by routing to "new high-reasoning models" available on different clouds.
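The split between Provisioned Throughput and On-Demand capacity described above can be illustrated with a minimal sketch. It is not Slack's implementation: the class name `SpilloverRouter`, the per-minute token budget, and the `"pt"` / `"od"` labels are all hypothetical. It assumes latency-sensitive requests are served from reserved capacity while it lasts, and that asynchronous work and any overflow go to On-Demand.

```python
from dataclasses import dataclass


@dataclass
class SpilloverRouter:
    """Hypothetical sketch: reserved (PT) capacity first, On-Demand (OD) as spillover."""

    pt_tokens_per_minute: int  # assumed reserved budget, not a real Bedrock figure
    used_this_minute: int = 0

    def start_new_minute(self) -> None:
        # Reset the reserved-capacity counter at each window boundary.
        self.used_this_minute = 0

    def route(self, tokens: int, latency_sensitive: bool) -> str:
        """Return "pt" or "od" for a request of the given token size."""
        fits = self.used_this_minute + tokens <= self.pt_tokens_per_minute
        if latency_sensitive and fits:
            self.used_this_minute += tokens
            return "pt"
        # Asynchronous or bursty work, and any overflow, goes to On-Demand.
        return "od"


router = SpilloverRouter(pt_tokens_per_minute=1000)
print(router.route(600, latency_sensitive=True))   # pt
print(router.route(600, latency_sensitive=True))   # od (would exceed the budget)
print(router.route(100, latency_sensitive=False))  # od
```

If the 10× figure is read as a peak-to-off-peak ratio, capacity reserved for the peak would sit at roughly 10% utilization off-peak, which is the kind of waste a spillover design avoids.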
Reported quantitative outcomes (Phase 4)¶
- ~10% improvement in quality metrics for complex reasoning tasks (post: "more precise, context-aware answers").
- ~67% reduction in latency for high-velocity, low-token workloads.
Engineering principles (canonicalised by the post)¶
- "Measure first, migrate gradually, and monitor continuously." — solidified by the Phase 2 zero-incident migration.
- "The abstraction layer is a core requirement" — the Intelligent Routing Layer matters more than any individual model choice.
- "Treat architecture as a living document" — provider-agnostic routing lets Slack adopt breakthroughs without a rewrite.
- "Reliability requires provider agnosticism" — internal failovers within one cloud aren't enough.
- "An LLM service that is 'up' but slow is effectively broken" — soft failures (p90 spikes, feedback trends) are first-class triggers for the routing layer.
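The idea that soft failures can trigger rerouting can be sketched as a latency-based circuit breaker in front of a fallback hierarchy. This is an illustration under stated assumptions, not the routing layer's actual logic: `LatencyBreaker`, `pick_model`, the model names, the 2-second threshold, and the window sizes are all hypothetical, and the source does not say how p90 is computed (a nearest-rank percentile over a rolling window is assumed here).

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class LatencyBreaker:
    """Hypothetical soft-failure breaker: opens when rolling p90 latency exceeds a threshold."""

    def __init__(self, p90_threshold_s: float, window: int = 50, min_samples: int = 10):
        self.p90_threshold_s = p90_threshold_s
        self.min_samples = min_samples
        self.samples: Deque[float] = deque(maxlen=window)  # rolling window of latencies

    def record(self, latency_s: float) -> None:
        self.samples.append(latency_s)

    def p90(self) -> Optional[float]:
        if len(self.samples) < self.min_samples:
            return None  # not enough data to judge
        ordered = sorted(self.samples)
        # Nearest-rank percentile: rank = ceil(0.9 * n), computed with integer math.
        rank = -(-9 * len(ordered) // 10)
        return ordered[rank - 1]

    def is_open(self) -> bool:
        value = self.p90()
        return value is not None and value > self.p90_threshold_s


def pick_model(hierarchy: List[str], breakers: Dict[str, LatencyBreaker]) -> Optional[str]:
    """Return the first model in the fallback hierarchy whose breaker is closed."""
    for name in hierarchy:
        if not breakers[name].is_open():
            return name
    return None


breakers = {
    "primary-model": LatencyBreaker(p90_threshold_s=2.0),
    "fallback-model": LatencyBreaker(p90_threshold_s=2.0),
}
for _ in range(10):
    breakers["primary-model"].record(5.0)   # responding, but slow
    breakers["fallback-model"].record(0.4)
print(pick_model(["primary-model", "fallback-model"], breakers))  # fallback-model
```

The sketch omits the half-open state and cool-down timer a production breaker would need: once a breaker opens here, the model receives no traffic and so records no new samples to close it again.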
Substrate composition¶
Slack AI sits on top of Slack's broader engineering substrate:
- Slack Bedrock (Kubernetes platform) — Slack AI's application-tier services run on Slack's internal Kubernetes substrate (not to be confused with Amazon Bedrock, the LLM service).
- Intelligent Routing Layer — the LLM abstraction layer.
- AWS Bedrock + GCP Vertex AI — the multi-cloud LLM serving endpoints.
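How an abstraction layer can keep features independent of any one cloud is sketched below. Everything here is hypothetical: the `LLMProvider` protocol, `StubProvider`, `RoutingLayer`, and the `cloud-a` / `cloud-b` names are illustrative, and no real Bedrock or Vertex AI SDK calls are made. The assumption is that callers name a feature and the layer tries providers in preference order, failing over on a connection error.

```python
from typing import Dict, List, Optional, Protocol


class LLMProvider(Protocol):
    """Minimal interface every provider adapter is assumed to satisfy."""

    name: str

    def complete(self, prompt: str) -> str: ...


class StubProvider:
    """Stand-in for a real cloud client, used only to make the sketch runnable."""

    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy

    def complete(self, prompt: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} unavailable")
        return f"[{self.name}] {prompt}"


class RoutingLayer:
    """Hypothetical abstraction: callers name a feature, never a provider."""

    def __init__(self, routes: Dict[str, List[LLMProvider]]):
        self.routes = routes  # feature -> providers in preference order

    def complete(self, feature: str, prompt: str) -> str:
        last_error: Optional[Exception] = None
        for provider in self.routes[feature]:
            try:
                return provider.complete(prompt)
            except ConnectionError as exc:
                last_error = exc  # try the next provider, possibly on another cloud
        raise RuntimeError(f"no provider available for {feature}") from last_error


layer = RoutingLayer({
    "ai_search": [StubProvider("cloud-a", healthy=False), StubProvider("cloud-b")],
})
print(layer.complete("ai_search", "summarize Q3 planning"))  # [cloud-b] summarize Q3 planning
```

Because features depend only on the `complete` interface, adding or reordering providers is a change to the `routes` mapping rather than to feature code.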
Compliance posture¶
- FedRAMP Moderate maintained across all phases. The Phase 2 Bedrock migration was specifically gated on Bedrock having "achieved FedRamp Moderate compliance" and matching SageMaker's security posture.
- Escrow VPC in Phase 1 established the zero-knowledge property: "our data remained private to Slack, and the provider's proprietary model weights remained inaccessible to us."
- Multi-cloud regional data boundaries — Phase 3 OD relied on Bedrock's cross-US-region routing "while adhering to our regional data boundaries." Phase 4 GCP integration required "Security, Risk and Compliance, Trust and Integrity, AI Quality, Legal, and Cloud Providers" alignment to ensure data boundaries remained ironclad.
Seen in¶
- sources/2026-05-28-slack-slack-ai-the-path-to-multi-cloud — the canonical wiki disclosure of Slack AI's three-year LLM serving evolution from single-region SageMaker to multi-cloud AWS Bedrock + GCP Vertex AI behind the Intelligent Routing Layer.
Related¶
- systems/slack-intelligent-routing-layer — the LLM abstraction layer that fronts all model providers.
- systems/aws-sagemaker-ai — Phase 1 substrate.
- systems/amazon-bedrock — Phase 2/3 substrate.
- systems/gcp-vertex-ai — Phase 4 added cloud.
- concepts/multi-cloud-llm-serving
- concepts/escrow-vpc-llm-serving
- patterns/multi-cloud-llm-serving
- companies/slack