CONCEPT Cited by 1 source
Radiation effects on computing¶
Radiation effects on computing is the failure-mode class introduced when silicon operates in an environment with meaningfully higher ionising-radiation flux than terrestrial sea level — most sharply, space. Terrestrial ECC and retry primitives were sized for terrestrial error rates and are not trivially portable upward.
Failure-mode classes¶
- Single-event upsets (SEU). A single ionising particle flips a stored bit in a memory cell, register, or latch. At low altitude this is rare enough to be absorbed by ECC; in orbit the rate is high enough to matter at the architectural layer.
- Single-event transients (SET). A particle causes a transient voltage glitch in combinational logic or an analogue node; the glitch may or may not latch into a logical error, depending on timing.
- Single-event latchup (SEL). A particle triggers the parasitic thyristor inherent in bulk CMOS, creating a high-current path that can destroy the device unless it is cleared by a power cycle.
- Total-ionising-dose (TID). Cumulative radiation exposure progressively degrades transistor parameters (threshold voltage, leakage), causing device-level drift and eventually hard failure.
- Displacement damage. Energetic particles (primarily protons and neutrons, and also heavy ions) knock atoms out of the crystal lattice through non-ionising energy loss, degrading especially analogue and opto-electronic components.
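To make the SEU item concrete, the following is a minimal Python sketch of how a single-error-correcting code absorbs one flipped bit. It uses a textbook Hamming(7,4) code purely for illustration; production memory ECC typically uses wider codes, and nothing here is taken from the Suncatcher design. The function names (`hamming74_encode`, `hamming74_decode`, `inject_seu`) are hypothetical.

```python
def hamming74_encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Codeword positions (1-indexed): p1 p2 d0 p4 d1 d2 d3.
    Bit i of the returned integer holds position i + 1.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # parity over positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # parity over positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # parity over positions 4, 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    return sum(b << i for i, b in enumerate(bits))


def hamming74_decode(word: int) -> tuple[int, int]:
    """Return (data nibble, syndrome). A non-zero syndrome is the
    1-indexed position that was assumed flipped and then corrected."""
    syndrome = 0
    for pos in range(1, 8):
        if (word >> (pos - 1)) & 1:
            syndrome ^= pos
    if syndrome:
        word ^= 1 << (syndrome - 1)  # flip the suspected bit back
    data_bits = [(word >> 2) & 1, (word >> 4) & 1, (word >> 5) & 1, (word >> 6) & 1]
    nibble = sum(b << i for i, b in enumerate(data_bits))
    return nibble, syndrome


def inject_seu(word: int, bit_index: int) -> int:
    """Model a single-event upset as one bit flip (0-indexed bit)."""
    return word ^ (1 << bit_index)


original = 0b1011
corrupted = inject_seu(hamming74_encode(original), bit_index=5)
recovered, syndrome = hamming74_decode(corrupted)
assert recovered == original and syndrome == 6
```

A single flip anywhere in the 7-bit codeword is located by the syndrome and corrected. Two flips in the same codeword are silently miscorrected by this code, which is why the upset rate relative to the code's strength matters.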
Why it matters for space-based AI infrastructure¶
Historically, spacecraft silicon has been radiation-hardened (special processes, hardened oxides, triple-modular-redundancy logic, SOI substrates). These measures buy tolerance but lag the commercial-silicon process-node curve by ~2 generations and carry a non-trivial perf/watt penalty.
Project Suncatcher makes the opposite, load-bearing choice: carry commercial commodity TPUs into orbit rather than purpose-built radiation-hardened space ASICs. The benefit is that the constellation rides the mainline commercial-compute density curve; the cost is that radiation tolerance must be solved at the architectural / software layer, through error correction, redundancy, workload placement, voting, checkpointing, and similar distributed-systems patterns, rather than at the silicon layer.
The Google Research 2025-11-04 announcement names radiation effects as one of three foundational-research challenges for the Suncatcher programme, alongside free-space optical (FSO) inter-satellite link bandwidth and orbital dynamics:
"Early research… describes our progress toward tackling the foundational challenges of this ambitious endeavor — including high-bandwidth communication between satellites, orbital dynamics, and radiation effects on computing." — sources/2025-11-04-google-exploring-space-based-scalable-ai-infrastructure
The raw announcement does not enumerate which specific mitigations Suncatcher adopts — those live in the preprint paper.
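Voting is one of the architectural-layer patterns named above. As a generic illustration (an assumption for exposition, not a mitigation the announcement attributes to Suncatcher), the sketch below applies a bitwise two-of-three majority vote to the outputs of three hypothetical replicas and reports which replicas disagreed with the voted result. The names `majority_vote` and `disagreeing_replicas` are invented for this example.

```python
def majority_vote(a: bytes, b: bytes, c: bytes) -> bytes:
    """Bitwise 2-of-3 majority over three equally sized replica outputs."""
    if not (len(a) == len(b) == len(c)):
        raise ValueError("replica outputs must have the same length")
    # A result bit is 1 exactly when at least two of the three inputs have a 1.
    return bytes((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))


def disagreeing_replicas(a: bytes, b: bytes, c: bytes) -> list[int]:
    """Indices of replicas whose output differs from the voted result."""
    voted = majority_vote(a, b, c)
    return [i for i, replica in enumerate((a, b, c)) if replica != voted]


# Hypothetical outputs: replicas 0 and 1 each suffered one upset,
# in different bit positions; replica 2 is clean.
golden = bytes([0x0F, 0xF0, 0xAA])
replica0 = bytes([0x0E, 0xF0, 0xAA])  # bit 0 of byte 0 flipped
replica1 = bytes([0x0F, 0xF8, 0xAA])  # bit 3 of byte 1 flipped
replica2 = golden

assert majority_vote(replica0, replica1, replica2) == golden
assert disagreeing_replicas(replica0, replica1, replica2) == [0, 1]
```

Because the vote is taken per bit, the result stays correct as long as no single bit position is corrupted in two replicas at once. The list of disagreeing replicas is the kind of signal a scheduler could use to trigger a restore from checkpoint on the affected unit.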
Terrestrial adjacent relevance¶
- High-altitude / aviation compute sees elevated SEU rates vs. sea level.
- High-reliability terrestrial systems (storage, financial ledgers, memory-intensive analytics at scale) already budget for soft-error-rate statistics; space is the same problem amplified.
- Memory ECC and retry are the baseline mitigations but are sized for terrestrial rates; space requires both stronger ECC (more correctable bits per codeword) and systemic architectural redundancy.
The concept is therefore not a niche space-systems concern; it bridges naturally from the space-systems literature to terrestrial high-reliability architectures.
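The point that ECC is sized for an expected error rate can be made quantitative with a simple model. The sketch below assumes independent upsets arriving as a Poisson process, a code that corrects one flipped bit per codeword, and periodic scrubbing that clears accumulated single-bit errors. Under those assumptions a word becomes uncorrectable when it collects two or more flips within one scrub interval. Every number used (`BASELINE_RATE`, `WORD_BITS`, `SCRUB_INTERVAL_S`, and the scaling factors) is a hypothetical placeholder, not a measured terrestrial or orbital rate.

```python
import math


def p_two_or_more(mu: float, terms: int = 60) -> float:
    """P(N >= 2) for N ~ Poisson(mu), summed term by term.

    Summing the series avoids the cancellation in 1 - exp(-mu) * (1 + mu)
    when mu is tiny. Intended for small to moderate mu.
    """
    term = mu * mu / 2.0      # mu^2 / 2!
    total = 0.0
    for k in range(2, terms):
        total += term
        term *= mu / (k + 1)  # advance to mu^(k+1) / (k+1)!
    return math.exp(-mu) * total


def p_uncorrectable_word(upsets_per_bit_s: float, word_bits: int,
                         scrub_interval_s: float) -> float:
    """Probability that one codeword collects >= 2 flips between scrubs."""
    mu = upsets_per_bit_s * word_bits * scrub_interval_s
    return p_two_or_more(mu)


# Hypothetical placeholder values, chosen only to show the scaling.
BASELINE_RATE = 1e-15      # upsets per bit per second (assumed)
WORD_BITS = 72             # e.g. 64 data bits + 8 check bits (assumed)
SCRUB_INTERVAL_S = 3600.0  # one scrub pass per hour (assumed)

for factor in (1, 100, 10_000):
    p = p_uncorrectable_word(BASELINE_RATE * factor, WORD_BITS, SCRUB_INTERVAL_S)
    print(f"rate x{factor:>6}: P(uncorrectable word per scrub interval) = {p:.3e}")
```

Under this model the uncorrectable probability grows roughly with the square of the upset rate while the expected number of flips per word is small, so a 100-fold higher rate yields about a 10,000-fold increase in words that single-error correction cannot repair. That quadratic growth is one way to read the claim that terrestrial ECC sizing does not carry over unchanged.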
Seen in¶
- sources/2025-11-04-google-exploring-space-based-scalable-ai-infrastructure — named as one of three foundational-research challenges for Project Suncatcher's commodity-TPU space constellation.
Related¶
- systems/project-suncatcher — Google's space-based-compute moonshot that pushes radiation-tolerance into architectural/software mitigation rather than silicon mitigation.
- concepts/space-based-compute — the parent architectural class.
- systems/google-tpu — the commodity accelerator whose radiation tolerance is the load-bearing open research question for Suncatcher.
- patterns/modular-disaggregated-constellation — the architectural-shape pattern whose many small interconnected units enable per-unit redundancy / voting as one radiation-mitigation axis.