SYSTEM Cited by 1 source
GCP Vertex AI¶
Google Cloud Platform Vertex AI is GCP's unified managed ML platform that includes a hosted LLM model garden — Google's own Gemini family plus third-party models including Anthropic and others — with managed serving, fine-tuning, evaluation, and observability surfaces. Sibling system to Amazon Bedrock (AWS) and Amazon SageMaker (AWS).
Stub page on the wiki — expand as Vertex AI internals are disclosed in future ingests.
Wiki canonical role: enterprise multi-cloud LLM endpoint¶
The 2026-05-28 Slack AI multi-cloud retrospective is the wiki's first canonical disclosure of GCP Vertex AI as an enterprise multi-cloud LLM endpoint — specifically as the second cloud in Slack AI's Intelligent Routing Layer alongside AWS Bedrock.
Slack named four reasons for adding Vertex AI in early 2026 to their AWS-only stack (Source: sources/2026-05-28-slack-slack-ai-the-path-to-multi-cloud):
- Infrastructural redundancy & high availability — verbatim: "a multi-cloud footprint eliminates provider-level large scale infrastructural disruptions as a single point of failure. If an entire cloud ecosystem experiences a regional or platform-wide disruption, our traffic can be rerouted to a separate, healthy stack without service interruption."
- Model-to-feature optimisation — verbatim: "By expanding our catalog to include multiple models, we gained the ability to match the specific latent strengths of a model to the specific requirements of a feature. This granular optimization led to immediate performance gains: ~10% improvement in quality metrics for complex reasoning tasks. ~67% reduction in latency for high-velocity, low-token workloads."
- Access to innovation — verbatim: "The AI landscape moves at extreme velocity with frequent vendor exclusivity. Multi-cloud ensures we are ready to integrate with the latest breakthroughs regardless of where they are hosted while upholding our compliance, privacy, and security promises."
- Dynamic workload orchestration — verbatim: "Beyond simple failover, multiple providers allow for sophisticated traffic shaping. We can route requests based on real-time telemetry – evaluating not just provider health, but which endpoint offers the optimal performance profile for a given workload at that exact moment."
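The telemetry-driven routing in the last bullet can be sketched roughly as follows. This is a minimal illustration, not Slack's implementation: the `EndpointStats` fields, the endpoint names, and the "lowest p95 latency among healthy endpoints" rule are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EndpointStats:
    """Hypothetical telemetry snapshot for one provider endpoint."""
    name: str
    healthy: bool
    p95_latency_ms: float


def pick_endpoint(stats: list[EndpointStats]) -> Optional[EndpointStats]:
    """Return the healthy endpoint with the lowest p95 latency, or None."""
    healthy = [s for s in stats if s.healthy]
    if not healthy:
        # Every provider is unavailable; the caller decides how to degrade.
        return None
    return min(healthy, key=lambda s: s.p95_latency_ms)
```

Plain failover falls out of the same function: when one provider is marked unhealthy, the remaining healthy endpoint is selected regardless of its latency.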
Integration friction (disclosed)¶
The Slack post discloses two named integration challenges that were resolved as cold-start engineering work for the Vertex AI addition:
- Secretless authentication — explicit statement that Slack "solved cold start engineering hurdles by implementing secretless authentication" for cross-cloud access. Specific federation shape (workload identity federation? OIDC? short-lived service-account exchange?) not disclosed.
- API normalisation layer — "translates disparate provider signals into a unified language for our application logic" — see patterns/api-normalization-layer-cross-provider.
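One way to picture the normalisation layer is as a set of per-provider adapters that map each response onto a single internal type. The sketch below is illustrative only: `NormalizedReply`, the adapter names, and the stop-reason vocabulary are hypothetical, and the payload shapes are simplified assumptions loosely modelled on Bedrock's Converse response and Vertex AI's Gemini `generateContent` response, not a description of Slack's layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedReply:
    """Hypothetical provider-neutral reply consumed by application logic."""
    provider: str
    text: str
    stop_reason: str
    input_tokens: int
    output_tokens: int


# Assumed mapping of provider-specific stop reasons onto one shared vocabulary.
_STOP_REASONS = {
    "end_turn": "stop", "max_tokens": "length",  # Bedrock-style values
    "STOP": "stop", "MAX_TOKENS": "length",      # Vertex-style values
}


def from_bedrock(payload: dict) -> NormalizedReply:
    """Adapt a (simplified, assumed) Bedrock-style payload."""
    parts = payload["output"]["message"]["content"]
    usage = payload["usage"]
    return NormalizedReply(
        provider="bedrock",
        text="".join(p.get("text", "") for p in parts),
        stop_reason=_STOP_REASONS.get(payload["stopReason"], "other"),
        input_tokens=usage["inputTokens"],
        output_tokens=usage["outputTokens"],
    )


def from_vertex(payload: dict) -> NormalizedReply:
    """Adapt a (simplified, assumed) Vertex-style payload; first candidate only."""
    candidate = payload["candidates"][0]
    usage = payload["usageMetadata"]
    return NormalizedReply(
        provider="vertex-ai",
        text="".join(p.get("text", "") for p in candidate["content"]["parts"]),
        stop_reason=_STOP_REASONS.get(candidate["finishReason"], "other"),
        input_tokens=usage["promptTokenCount"],
        output_tokens=usage["candidatesTokenCount"],
    )
```

With adapters like these, downstream code only ever handles `NormalizedReply`, so adding a provider means adding one adapter rather than touching every feature.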
Operational properties (disclosed)¶
- Compliance + privacy + security promises maintained alongside AWS — Slack's framing is that the multi-cloud expansion was gated on cross-cloud parity for the existing enterprise compliance posture (FedRAMP Moderate at minimum, per Slack AI's earlier-phase requirements).
- Vendor-exclusive models: the "access to innovation" framing implies that some state-of-the-art models are available only on GCP at any given moment (e.g. some Gemini generation behaviours, certain Anthropic capabilities, etc.). Slack does not enumerate specific model SKUs.
- Cross-cloud routing — Slack's routing layer routes between Bedrock and Vertex AI based on metric-driven model selection per feature and on real-time health signals.
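The per-feature binding with a health override described in the last bullet might look like the sketch below. The feature names, endpoint names, and preference orders are invented for illustration; Slack does not disclose its actual bindings or how the preference order is derived from its metrics.

```python
# Hypothetical per-feature preference order, assumed to come from offline metrics.
FEATURE_PREFERENCES: dict[str, list[str]] = {
    "summarize": ["vertex-ai", "bedrock"],
    "search-answers": ["bedrock", "vertex-ai"],
}


def route(feature: str, healthy: set[str]) -> str:
    """Return the first healthy endpoint in the feature's preference order."""
    for endpoint in FEATURE_PREFERENCES[feature]:
        if endpoint in healthy:
            return endpoint
    raise RuntimeError(f"no healthy endpoint for feature {feature!r}")
```

Keeping the binding in a table separates the slow decision (which model suits which feature) from the fast one (which endpoint is healthy right now).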
Open questions (from the Slack disclosure)¶
- GCP region selection — which Vertex AI regions Slack uses, how regional data boundaries are honoured.
- Per-cloud traffic share at Phase 4 — not disclosed.
- Model SKUs Slack uses on Vertex AI — not enumerated.
- Vertex AI capacity primitives: does Slack reserve capacity through Vertex AI's "provisioned throughput" SKUs, or does it run on-demand?
- Vertex AI eval / monitoring tooling — does Slack use Vertex AI's own evaluation services or its in-house judging?
Seen in¶
- sources/2026-05-28-slack-slack-ai-the-path-to-multi-cloud — the wiki's first canonical disclosure of GCP Vertex AI as an enterprise multi-cloud LLM endpoint alongside AWS Bedrock; named as the second cloud in Slack AI's Intelligent Routing Layer; four-reason rationale for multi-cloud expansion; secretless auth + API normalisation as cold-start integration work.
Related¶
- systems/amazon-bedrock — sibling AWS managed-LLM endpoint.
- systems/aws-sagemaker-ai — sibling AWS managed-ML platform.
- systems/slack-intelligent-routing-layer — Slack's cross-cloud router.
- systems/slack-ai — the consumer feature suite that benefits from multi-cloud expansion.
- concepts/multi-cloud-llm-serving — the architectural posture.
- concepts/concentration-risk-single-cloud-llm — the structural failure mode addressed by adding Vertex AI alongside Bedrock.
- concepts/model-to-feature-binding — the optimisation that yielded Phase 4's quality and latency wins.
- patterns/multi-cloud-llm-serving — the meta-pattern.