Google TPU
Google TPU (Tensor Processing Unit) is Google's custom ASIC family for AI workloads, offered commercially via cloud.google.com/tpu and used internally as the primary training + serving substrate for Google's own large-scale AI products.
This is still a minimal-viable page. The wiki has so far ingested two posts that name TPU as the substrate of record (the 2025-11-04 Project Suncatcher announcement and the 2026-05-28 Google Research I/O 2026 roundup), but neither decomposes the accelerator's internals (generation history, perf/watt, interconnect topology, compiler / XLA integration, pod sizing, availability-zone footprint, etc.). Future posts are expected to supply that depth.
Known from the current corpus
- TPU is the compute substrate chosen for Project Suncatcher's orbital constellation. The choice is load-bearing: it pins the space-based-AI-infrastructure programme to commercial commodity silicon rather than a purpose-built radiation-hardened space ASIC, which in turn pushes the radiation-tolerance problem into architectural / software mitigation rather than silicon mitigation (Source: sources/2025-11-04-google-exploring-space-based-scalable-ai-infrastructure).
- TPU is the hot-path serving accelerator for Gemini 3.5 Flash. The 2026-05-28 I/O roundup post is the wiki's first canonicalisation of TPU in this role beyond substrate of record: Google's speculative-decoding extensions (block verification + tree-structured drafting) are "highly optimized for Google's TPU architecture, maximizing hardware utilization to deliver substantially faster responses with no loss in quality", enabling "the current speed of Gemini 3.5 Flash, with the same models also powering Antigravity and AI Studio" (Source: sources/2026-05-28-google-a-new-era-of-innovation-google-research-at-io-2026). This is the wiki's canonical instance of hardware/software codesign at the LLM-serving layer.
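For orientation, the base technique named above (speculative decoding) can be sketched roughly as follows. This is a minimal illustration of the textbook token-level verification rule only. It is not Google's block-verification or tree-structured-drafting extension, whose details the ingested source does not give, and it says nothing about TPU kernels. The function name `verify_draft` and its parameters are hypothetical, chosen for this sketch.

```python
import random


def verify_draft(draft_tokens, draft_probs, target_probs, rng):
    """Textbook token-level speculative-sampling verification (illustrative).

    draft_tokens: k token ids proposed by the small drafter model
    draft_probs:  k probability lists, the drafter's distribution per position
    target_probs: k + 1 probability lists from the large target model
    rng:          a random.Random instance
    Returns the tokens emitted this step (between 1 and k + 1 of them).
    """
    emitted = []
    for i, tok in enumerate(draft_tokens):
        p, q = target_probs[i], draft_probs[i]
        # Accept the drafted token with probability min(1, p(tok) / q(tok)).
        if rng.random() < min(1.0, p[tok] / q[tok]):
            emitted.append(tok)
            continue
        # First rejection: resample from the residual max(0, p - q), then stop.
        residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
        emitted.append(rng.choices(range(len(p)), weights=residual)[0])
        return emitted
    # Every drafted token was accepted: take one bonus token from the target.
    bonus = target_probs[len(draft_tokens)]
    emitted.append(rng.choices(range(len(bonus)), weights=bonus)[0])
    return emitted
```

Under this rule the emitted tokens follow the target model's distribution, which is the usual sense in which speculative decoding is described as lossless. The speed-up comes from the target model scoring all k drafted positions in one pass instead of k sequential passes.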
Seen in
- sources/2025-11-04-google-exploring-space-based-scalable-ai-infrastructure — named as the compute element inside each Project Suncatcher satellite; framing is substrate of record for the moonshot, not serving-infra optimisation.
- sources/2026-05-28-google-a-new-era-of-innovation-google-research-at-io-2026 — named as the production LLM-serving accelerator for Gemini 3.5 Flash; speculative-decoding extensions co-designed with TPU architecture; "substantially faster responses with no loss in quality" attributed to the codesign. No TPU-generation identification, no kernel-shape decomposition, no pod-size numbers in the raw.
Related
- systems/project-suncatcher — orbital-constellation moonshot that integrates TPUs as the per-satellite AI accelerator.
- systems/gemini-3-5-flash — production LLM whose serving speed is attributed to TPU-codesigned speculative-decoding extensions.
- systems/gemini — parent model family.
- concepts/speculative-decoding — base inference technique deployed on TPU.
- concepts/block-verification — TPU-codesigned verifier-side extension.
- concepts/tree-structured-drafting — TPU-codesigned drafter-side extension.
- patterns/hardware-software-codesign-for-ml-serving — the codesign framing.
- companies/google — Google Research / Google Cloud, Tier 1 source.