Hardware offload¶
Definition¶
Hardware offload is the design pattern of moving work that previously ran on general-purpose CPUs under a general-purpose OS/hypervisor onto dedicated hardware pipelines (programmable NICs, custom ASICs, "DPU"-class cards). The goal isn't primarily raw throughput — it's predictability, isolation, and CPU reclamation for the customer workload.
Why AWS uses it¶
Per the EBS retrospective and the Lambda retrospective, AWS treats hardware offload as a queue-reduction and isolation lever:
- Queue reduction. Each kernel/hypervisor layer in the IO path is a queue that can interact badly with other queues. Offloading an entire function (VPC network processing, EBS storage, encryption) onto a dedicated card removes several in-OS queues from the critical path. See concepts/queueing-theory.
- Stop stealing customer CPU. If the host's CPUs are servicing IO interrupts, the guest workload pays for it. Dedicated hardware means guest CPUs stay reserved for guest code.
- Dedicated interrupt processing. Custom silicon can service interrupts at line rate with worst-case bounds that a general-purpose kernel cannot promise.
- Security boundary. Encryption key material can live on the card, isolated from the hypervisor — useful when the threat model includes hypervisor compromise.
- Density unlock. With a lightweight hypervisor + systems/firecracker + the offload stack, bare-metal instances can run thousands of multi-tenant micro-VMs.
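The queue-reduction point above can be made concrete with a toy tandem-queue simulation (all parameters are invented for illustration, not measurements of any AWS system): three variable-latency software stages in series, each at 80% utilization, versus a single deterministic "line-rate" hardware pipeline at the same 80% utilization. Fewer queues plus bounded service time shrinks the tail, not just the mean.

```python
import random

def simulate(stages, lam=8.0, n=100_000, seed=1):
    """Jobs arrive Poisson(lam) and pass through FIFO stages in series.
    Lindley-style recursion: a job starts service at a stage once it has
    left the previous stage AND that stage's server is free."""
    rng = random.Random(seed)
    t = 0.0
    free_at = [0.0] * len(stages)   # when each stage's server next frees up
    sojourns = []
    for _ in range(n):
        t += rng.expovariate(lam)   # Poisson arrivals
        done = t
        for k, svc in enumerate(stages):
            done = max(done, free_at[k]) + svc(rng)
            free_at[k] = done
        sojourns.append(done - t)   # total time in the pipeline
    return sojourns

def p99(xs):
    return sorted(xs)[int(0.99 * len(xs))]

# "Software path": three in-kernel/hypervisor queues, each with variable
# (exponential) service of mean 0.1 time units -> 80% utilized per stage.
sw = simulate([lambda r: r.expovariate(10.0)] * 3)

# "Offloaded path": one hardware pipeline at the same 80% utilization,
# but with deterministic service -- the worst-case bound a card can promise.
hw = simulate([lambda r: 0.1])

print(f"software p99={p99(sw):.2f}  offload p99={p99(hw):.2f}")
```

Under these assumptions the offloaded path's 99th-percentile latency comes out several times lower than the three-queue software path's, which is the "predictability, not raw throughput" argument in miniature.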
Canonical instantiations¶
- systems/nitro — first card. VPC network processing moved out of the Xen dom0 kernel onto a dedicated pipeline. EC2 went from "Xen steals CPU to drive the network" to "network arrives pre-processed."
- systems/nitro — second card. EBS storage processing + EBS encryption. Hardware-accelerated encryption with keys outside the hypervisor.
- systems/aws-nitro-ssd. Offload extended to the storage media itself — EBS-tuned SSDs co-designed with the data-plane stack.
- systems/srd running on Nitro cards. Transport itself is offloaded so guests and hypervisors don't touch it.
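The "keys outside the hypervisor" property boils down to an interface shape: the hypervisor only ever holds an opaque key handle and ciphertext, while key material stays inside the card. A minimal sketch of that boundary, assuming an invented `OffloadCard` API and a toy SHA-256 counter keystream in place of real hardware AES (illustration only, not the Nitro design):

```python
import hashlib
import secrets

class OffloadCard:
    """Toy model of a card-resident encryption engine: key bytes live
    inside the card and are never returned across the API boundary.
    (Keystream is SHA-256 in counter mode -- a stand-in, not real AES.)"""

    def __init__(self):
        self._keys = {}  # handle -> key bytes, private to the "card"

    def create_key(self):
        handle = secrets.token_hex(8)
        self._keys[handle] = secrets.token_bytes(32)
        return handle  # caller gets only an opaque handle

    def _keystream(self, key, nonce, length):
        out = b""
        counter = 0
        while len(out) < length:
            out += hashlib.sha256(
                key + nonce + counter.to_bytes(8, "big")
            ).digest()
            counter += 1
        return out[:length]

    def encrypt(self, handle, plaintext):
        nonce = secrets.token_bytes(16)
        ks = self._keystream(self._keys[handle], nonce, len(plaintext))
        return nonce + bytes(a ^ b for a, b in zip(plaintext, ks))

    def decrypt(self, handle, blob):
        nonce, ct = blob[:16], blob[16:]
        ks = self._keystream(self._keys[handle], nonce, len(ct))
        return bytes(a ^ b for a, b in zip(ct, ks))

# "Hypervisor" side: even a fully compromised hypervisor holds only the
# handle and ciphertext -- the key never crosses the boundary.
card = OffloadCard()
h = card.create_key()
blob = card.encrypt(h, b"guest block data")
assert card.decrypt(h, blob) == b"guest block data"
```

The design choice the sketch illustrates: because `_keys` never leaves the card object, the threat model can include a compromised hypervisor and the key material is still out of reach.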
Trade-offs¶
- Less general. A card that handles VPC really well is less useful outside VPC; co-design is an up-front investment.
- Longer hardware iteration cycles. Software moves faster than silicon. Incrementalism (concepts/incremental-delivery) matters more — ship software improvements while hardware catches up.
- Lock-in. Customers running on Nitro-offloaded services can't trivially take the same performance properties to other clouds.
Seen in¶
- sources/2024-08-22-allthingsdistributed-continuous-reinvention-block-storage-at-aws — offload described as a queue-reduction strategy first, a perf strategy second.
- sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years — offload as the density enabler for Lambda's multi-tenant micro-VM fleet.