SYSTEM Cited by 8 sources

Firecracker

Firecracker is AWS's open-source KVM-based micro-VM monitor, used under AWS Lambda (and Fargate) to run many tenants densely on shared bare metal while preserving hardware-level isolation.

Why Lambda needed it

Lambda launched (Nov 2014) with a hard rule: "security is not negotiable — no two customers share an instance." To enforce it, every customer got single-tenant EC2 instances. This was expensive, but the team "knew long-term that it was a problem we could solve." Firecracker is the system that solved it: the same security property, now enforced at micro-VM granularity, with "thousands of micro VMs onto a single bare metal instance."

(Source: sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years)

Architectural role

  • Provides hardware-virtualisation isolation per function invocation context — stronger than a container, lighter than a full VM.
  • Enables dense multi-tenant packing on bare-metal hosts, which is what lets Lambda's placement engine honour the scale-to-zero / per-ms billing model without idle-capacity waste. See concepts/micro-vm-isolation, concepts/scale-to-zero.
  • Underpins SnapStart (2022) — Firecracker VM snapshotting restores an initialized runtime near-instantaneously, cutting Java cold-start latency by up to 90%. See concepts/cold-start.
  • Co-evolves with the on-demand container loading work (Marc Brooker, USENIX ATC '23) that lets Lambda pull 10 GB container images without cold-start blowup.
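
The "lighter than a full VM" point is visible in how little it takes to define a guest. A minimal sketch of a Firecracker VM definition, in the JSON shape accepted by `firecracker --config-file` (paths and sizes here are placeholder assumptions, not Lambda's actual values):

```json
{
  "boot-source": {
    "kernel_image_path": "/var/lib/fc/vmlinux",
    "boot_args": "console=ttyS0 reboot=k panic=1 pci=off"
  },
  "drives": [
    {
      "drive_id": "rootfs",
      "path_on_host": "/var/lib/fc/rootfs.ext4",
      "is_root_device": true,
      "is_read_only": false
    }
  ],
  "machine-config": {
    "vcpu_count": 1,
    "mem_size_mib": 128
  }
}
```

Note `pci=off`: the guest has no PCI bus at all. That absence is the minimal-device-model property that buys fast boot and a small attack surface, and it is the same property the Fly.io GPU sources below run into as a hard limitation.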

Seen in

  • sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years — referenced as the isolation-density evolution from launch single-tenant EC2 to today's multi-tenant micro-VM fleet; enables SnapStart.
  • sources/2026-04-21-figma-server-side-sandboxing-virtual-machines — production tenant example: Figma uses AWS Lambda (backed by Firecracker) as its VM-grade sandbox for stateless fetch-and-process workloads (link-preview metadata / canvas image fetch via ImageMagick), deliberately placed outside the production VPC with no IAM pivot into Figma internals (patterns/minimize-vm-permissions). Also names Firecracker's boot overhead as the reason AWS reuses Lambda VMs within a single tenant for synchronous workloads — "Firecracker offers reasonably quick VM boot times, but the overheads are still too high to pay on many core workflows" — a concrete latency/isolation trade-off at the Lambda-customer level.
  • sources/2024-03-07-flyio-fly-kubernetes-does-more-now — Firecracker as the Pod substrate under Fly Kubernetes: every K8s Pod is a Fly Machine (Firecracker micro-VM), orchestrated by flyd rather than containerd / runc. Canonical wiki instance of concepts/micro-vm-as-pod at the K8s API level — distinct from Lambda's use of Firecracker at the serverless-function level. Fly frames it as "our system transmogrifies Docker containers into Firecracker microVMs".
  • sources/2024-02-15-flyio-globally-distributed-object-storage-with-tigris — contextual reference only: "we transmute containers into VMs, running them on our hardware around the world with the power of Firecracker alchemy". Fly.io self-identifies its compute substrate as Firecracker-based while pitching the Tigris object-storage partnership; Firecracker itself is not central to Tigris's three-layer architecture (FoundationDB metadata + NVMe byte cache + QuiCK-style queue).
  • sources/2024-06-19-flyio-aws-without-access-keys — contextual substrate reference: Firecracker micro-VM is what Fly init runs inside and what Fly Machine instances are. The OIDC-federation post treats Firecracker as existing infrastructure and doesn't expose hypervisor-level detail; relevance is that the Macaroon-scoped-per-Machine identity model depends on Firecracker-grade isolation (a container-escape would let one Machine's Macaroon escape to another's workload).
  • sources/2024-08-15-flyio-were-cutting-l40s-prices-in-half — negative-space datum on GPU-in-micro-VM. Fly.io tried to surface fractional-GPU slicing (NVIDIA MIG + vGPUs) inside Firecracker Machines via IOMMU PCI passthrough and abandoned the effort after "a whole quarter" on "a project so cursed that Thomas has forsworn ever programming again." Reasons-why are not disclosed, but the datum itself is load-bearing: IOMMU-passthrough-based fractional-GPU virtualisation on Firecracker is not a turnkey path even for a Firecracker-native platform. Fly.io pivoted to whole-GPU (A10 / L40S / A100 / H100) attachment per Machine, which is the path currently productised.
  • sources/2025-02-14-flyio-we-were-wrong-about-gpus — Fly.io's 2025-02 GPU retrospective clarifies the Firecracker-vs-Cloud Hypervisor split. Non-GPU Fly Machines run on Firecracker; GPU Fly Machines run on Cloud Hypervisor (a "very similar Rust codebase" that supports PCI passthrough). Firecracker's minimal device model — the property that gives it fast boot and a small attack surface — is also why it can't host GPU passthrough; Fly had to pick a sibling VMM for the peripheral-accelerator path. This is the cleanest public disclosure of Firecracker's positioning relative to its PCI-passthrough-capable cousin. The 2025-02 post also elaborates on the 2024-08 MIG/vGPU failure datum: Fly "burned months trying (and ultimately failing) to get Nvidia's host drivers working to map virtualized GPUs into Intel Cloud Hypervisor. At one point, we hex-edited the closed-source drivers to trick them into thinking our hypervisor was QEMU" — Firecracker wasn't the integration target, but the underlying micro-VM posture is what put Fly off Nvidia's driver happy-path in the first place. Confirms fast-boot DX as the non-negotiable product requirement that forced the off-path choice ("we could not have offered our desired Developer Experience on the Nvidia happy-path").
  • sources/2025-06-20-flyio-phoenixnew-remote-ai-runtime-for-phoenix — Firecracker as the per-session cloud-IDE perimeter for Phoenix.new. Every Phoenix.new browser session is backed by a fresh Fly Machine (Firecracker micro-VM) that the user and a coding agent share as co-tenants with root access. The safety posture (concepts/agent-with-root-shell) depends on Firecracker's KVM isolation: the agent has full freedom inside the guest precisely because the guest is a disposable VM whose boundary is enforced by the hypervisor. Same ephemeral-VM-as-agent-sandbox property as patterns/disposable-vm-for-agentic-loop (the 2025-02-07 sketch) — now productised as the default Phoenix.new session shape (patterns/ephemeral-vm-as-cloud-ide). Firecracker's fast-boot property (concepts/fast-vm-boot-dx) is the precondition that makes per-session VMs viable as a product shape; slower hypervisors would force session reuse and reintroduce environment drift.
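
SnapStart-style restore (noted under Architectural role) maps onto Firecracker's public snapshot API: pause a booted micro-VM, `PUT` a `{"snapshot_type": "Full", ...}` body to `/snapshot/create` to dump its state, then resurrect clones in fresh Firecracker processes via `/snapshot/load`. A sketch of the load request body — field names follow older Firecracker releases (newer ones replace `mem_file_path` with a `mem_backend` object), and the paths are placeholders, not anything Lambda discloses:

```json
{
  "snapshot_path": "/var/lib/fc/snap.state",
  "mem_file_path": "/var/lib/fc/snap.mem",
  "resume_vm": true
}
```

Restoring from the memory file skips kernel boot and runtime initialization entirely, which is the mechanism behind the "up to 90%" Java cold-start reduction claimed for SnapStart.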