

Micro-VM as Pod

Micro-VM as Pod is the architectural stance of making the Kubernetes Pod a thin compatibility layer over a micro-VM (typically Firecracker) rather than over a Linux container on a shared kernel. Containers inside the Pod share the micro-VM's kernel; different Pods get different kernels.

Definition

In the reference Kubernetes deployment model, a Pod is one or more containers that share Linux network, IPC, and UTS namespaces on a worker Node. The container runtime (containerd, CRI-O) uses runc or similar to create a shared-kernel sandbox from namespaces and cgroups.

In the micro-VM-as-Pod model:

  • Each Pod is a discrete micro-VM, with its own kernel.
  • The Pod boundary is a hardware-virtualisation isolation boundary (VT-x, EPT), not a namespace boundary.
  • Multi-container Pods collapse to single-container Pods until the platform implements a multi-container VM shape.
  • Responsibilities typically handled by the kubelet (image pull, rootfs assembly, probes, lifecycle) move to the host's micro-VM orchestrator instead of kubelet + containerd + runc.

Why you'd do it

  • Stronger isolation — a bug in a container runtime or in the Linux kernel can't cross a hypervisor boundary as easily. This is the same argument AWS Lambda used when moving from single-tenant EC2 instances to Firecracker (sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years).
  • Dense multi-tenant packing — micro-VMs are light enough that providers can densely pack them onto bare metal while preserving hardware-level isolation. See concepts/micro-vm-isolation.
  • Unified compute substrate — a provider with an existing micro-VM primitive (Fly Machines, Firecracker-on-Fargate, AWS Lambda) can expose K8s semantics on top of it without building a second runtime path.
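
The density argument reduces to back-of-envelope arithmetic: per-VM overhead, not a per-Node kernel, is what limits packing. The figures below are illustrative placeholders (Firecracker's design target is on the order of a few MiB of overhead per microVM), not measurements:

```python
# Back-of-envelope packing: how many micro-VM Pods fit on a host,
# memory-wise. Numbers are illustrative, not benchmarks.

def packable_microvms(host_mem_mib: int,
                      guest_mem_mib: int,
                      per_vm_overhead_mib: int = 5) -> int:
    """Each Pod costs its guest memory plus the VMM's fixed overhead."""
    return host_mem_mib // (guest_mem_mib + per_vm_overhead_mib)

# e.g. a 256 GiB bare-metal host with 128 MiB guests:
# packable_microvms(256 * 1024, 128)
```

Because the per-VM overhead is a small constant, density scales almost linearly with guest size, which is why hardware-level isolation no longer forces the coarse one-tenant-per-Node packing of classic VMs.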

Canonical production instances

Trade-offs

  • ❌ Cold-start latency is non-trivial (Firecracker boots quickly, but boot + rootfs + userland init together can still outweigh a per-request budget — Figma calls this out explicitly at the Lambda level).
  • ❌ Multi-container Pods don't map cleanly; most implementations ship without sidecars / init-containers at first (FKS is explicit about this at beta).
  • ❌ Some kubelet-backed APIs (kubectl exec, kubectl port-forward) require the Virtual Kubelet provider to implement stdio / port-forwarding proxies; these are often missing at launch.
  • ✅ Isolation + density + tenant-boundary story is stronger than a shared-kernel Node.
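
The cold-start trade-off is additive: the Pod is not Ready until every stage finishes, so a fast VMM boot alone does not rescue the budget. A sketch with made-up stage timings (none of these numbers are measurements):

```python
# Illustrative cold-start accounting for a micro-VM Pod. Stage names and
# timings are placeholders, not benchmarks of any real system.

STAGES_MS = {
    "vmm_spawn_and_boot": 125,   # micro-VM + guest kernel boot
    "rootfs_attach": 150,        # pull/assemble OCI image into a block device
    "guest_userland_init": 300,  # init + app runtime startup inside the guest
}

def cold_start_ms(stages: dict) -> int:
    """Total cold start: stages run sequentially, so latencies sum."""
    return sum(stages.values())

def within_budget(stages: dict, budget_ms: int) -> bool:
    """Does the full cold start fit a per-request latency budget?"""
    return cold_start_ms(stages) <= budget_ms
```

Even with a boot stage well under 200 ms, the userland-init stage tends to dominate, which is why the Figma-style objection is about the whole pipeline rather than the VMM itself.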

Seen in
