
CONCEPT

Hardware-isolated micro-VM on Kubernetes

Definition

Compose Kubernetes (scheduling, networking, declarative API, lifecycle) with Firecracker (hardware-virtualisation boundary per workload) so that each scheduled "pod" / workload unit is actually a hardware-isolated micro-VM rather than a shared-kernel container. The scheduling / API surface is the K8s one engineers already know; the isolation boundary is VM-level, not namespace-level.
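One plausible wiring for this composition (not necessarily what Fireworks or Fly.io do internally) uses Kubernetes' standard RuntimeClass hook; the `firecracker` handler name is hypothetical and assumes each node's CRI runtime has a shim registered that boots µVMs instead of shared-kernel containers:

```yaml
# Hypothetical handler name — assumes a node-level CRI shim
# (e.g. containerd configured with a Firecracker runtime) exists.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: firecracker
handler: firecracker
---
apiVersion: v1
kind: Pod
metadata:
  name: untrusted-workload
spec:
  runtimeClassName: firecracker   # this pod boots as a hardware-isolated µVM
  containers:
    - name: agent
      image: registry.example/agent:latest   # illustrative image
```

The point of the sketch is that the tenant-facing surface stays ordinary Kubernetes YAML; only the one `runtimeClassName` line selects the VM-grade boundary.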

Why compose, not replace

Containers under a shared kernel are a known-weak multi-tenant isolation boundary for hostile or untrusted code — the kernel is a large attack surface, and the classic container isolation machinery (namespaces, cgroups, seccomp) is only as strong as the syscall-level kernel bugs beneath it, which is not sufficient when tenants may be actively adversarial or when tenant code is LLM-generated and unaudited. Full VMs fix the isolation but lose the density and fast-start properties that make K8s-shaped platforms usable.

Firecracker's niche — "thousands of micro VMs onto a single bare metal instance" (Lambda PR/FAQ) — is the pragmatic resolution: keep the VM boundary, shrink the VM. Composing that with Kubernetes means you can keep the engineering surface (YAML, controllers, services, ingress) and upgrade the isolation boundary (from namespace to hardware) without rewriting the platform API. See concepts/micro-vm-isolation.

The shape, in practice

Three named wiki instances:

  • systems/atlassian-fireworks (2026-04-24) — Atlassian's internal substrate for AI-agent execution. "You submit an OCI container image and a command, and it boots a hardware-isolated Firecracker VM, runs your workload." Built-in scheduler + autoscaler + Raft persistence + Envoy ingress + eBPF network policy. 100 ms warm starts, live migration between hosts. (Source: sources/2026-04-24-atlassian-rovo-dev-driven-development)
  • systems/fly-kubernetes — Fly.io's public-cloud implementation of the same shape. Each K8s pod is backed by a Fly Machine (Firecracker µVM). The K8s API is preserved; the runtime is VM-grade.
  • systems/aws-lambda — the original Firecracker-based dense multi-tenant substrate, though not K8s-API-surface presented. Useful as the proof that the µVM density / isolation trade-off works at public-cloud scale.

Required control-plane components

Building a production instance of this shape requires substantially more than Kubernetes + a µVM monitor. The Fireworks post names:

  • Scheduler — decides which node runs which VM. K8s's default scheduler is not sufficient once the runtime is Firecracker — custom placement logic is needed for binpacking, migration, and anti-affinity at the µVM layer.
  • Autoscaler — provisions nodes to the cluster as µVM demand grows.
  • Node agents — per-host daemons that actually boot / snapshot / migrate µVMs on each K8s node.
  • Envoy ingress layers — L7 proxy in front of the control plane and/or tenant workloads.
  • Raft persistence — replicated control-plane state for scheduler / placement decisions, so the control plane survives single-host loss.
  • eBPF network policy — per-µVM ingress/egress filtering enforced in-kernel on the host. See systems/ebpf.
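The eBPF bullet maps onto the standard Kubernetes NetworkPolicy API, which an eBPF-based CNI such as Cilium enforces in-kernel on the host; the namespace and policy names here are illustrative:

```yaml
# Illustrative default-deny policy for one tenant's µVM-backed pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tenant-default-deny
  namespace: tenant-a          # hypothetical per-tenant namespace
spec:
  podSelector: {}              # applies to every workload in the namespace
  policyTypes: [Ingress, Egress]
  egress:
    - to:
        - podSelector: {}      # traffic allowed only within the tenant
```

Because enforcement happens on the host, the policy holds even if the guest kernel inside the µVM is fully compromised.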

The list is not incidental — each component exists because the µVM substrate makes one of the "free" K8s primitives insufficient. Custom scheduling compensates for the absence of container-style density assumptions. Node agents replace the kubelet's container-lifecycle assumptions. eBPF network policy replaces CNI's in-kernel-container-namespace assumptions.

Production features unlocked

With a working substrate, production features follow from Firecracker's primitives:

Feature                          Firecracker primitive
100 ms warm starts               Snapshot-based boot; restore a pre-warmed VM
Live migration between hosts     VM-level state transfer
Snapshot filesystem restore      Firecracker snapshot / fork
Sidecar sandboxes                Multi-µVM co-location per workload
Shared volumes                   Host-mediated volume surface between co-located µVMs
Per-tenant hardware isolation    KVM / VMX boundary (not namespace)
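The warm-start row rests on Firecracker's snapshot API (PUT /snapshot/create and PUT /snapshot/load on the VMM's Unix-socket HTTP API). A sketch of the request bodies a node agent would send — the file paths are illustrative, and a real agent wraps these in HTTP calls against a live VMM socket:

```python
def snapshot_create(snapshot_path: str, mem_path: str) -> dict:
    """Body for PUT /snapshot/create: pause a running µVM and dump
    its state (device state + guest memory) to disk."""
    return {
        "snapshot_type": "Full",
        "snapshot_path": snapshot_path,
        "mem_file_path": mem_path,
    }

def snapshot_load(snapshot_path: str, mem_path: str, resume: bool = True) -> dict:
    """Body for PUT /snapshot/load: boot a fresh VMM process from a
    pre-warmed snapshot instead of a cold kernel boot — the mechanism
    behind ~100 ms warm starts."""
    return {
        "snapshot_path": snapshot_path,
        "mem_backend": {"backend_type": "File", "backend_path": mem_path},
        "resume_vm": resume,
    }

# Illustrative usage: the node agent snapshots a warmed VM once,
# then restores many clones of it on demand.
create_body = snapshot_create("/srv/pool/base.snap", "/srv/pool/base.mem")
load_body = snapshot_load("/srv/pool/base.snap", "/srv/pool/base.mem")
```

Live migration and filesystem restore reuse the same primitive: a snapshot taken on one host can be loaded on another.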

Why this matters for AI-agent execution specifically

"It will be the secure execution engine behind Atlassian's AI agent infrastructure." AI agents produce novel code on the fly; platform operators have not audited the code before it runs; tenants are not necessarily trusting of each other; and the blast radius of a compromised agent is the entire platform if isolation is shared-kernel. Hardware isolation (VM-grade) becomes table stakes, and the K8s API + scheduler + ingress surface becomes the engineering-productivity concession that keeps the platform buildable.

This is the threat-model argument for the shape: containers are acceptable when you trust your code; µVMs become non-optional when you don't. See concepts/server-side-sandboxing.

Not the same as

  • Shared-kernel containers. Fundamentally weaker isolation boundary — the shape exists to get past the container boundary.
  • Pure K8s + gVisor / Kata Containers. gVisor and Kata are in the same design space but with different trade-offs (user-space kernel vs. hypervisor). Firecracker-on-K8s is the specific Firecracker-as-runtime variant.
  • Public-cloud FaaS (Lambda, Cloud Run). Hardware-isolated µVM yes; K8s API surface no. The point of the composition is the K8s-shaped engineering surface.
