
CONCEPT Cited by 1 source

Container escape

A container escape occurs when code inside a container breaks through the isolation boundary and gains execution or data access on the host (or on sibling containers on the same host). It parallels concepts/vm-escape: the same idea one layer in, where the boundary crossed is the kernel's rather than a hypervisor's.

(Source: sources/2026-04-21-figma-server-side-sandboxing-containers-and-seccomp)

Three attack-surface axes

Figma decomposes the container-escape surface into exactly three components:

"Three components — runtime implementation, the OS primitives and interface available to the runtime, and runtime configuration — generally make up the attack surface for a container escape. By default, containers are not automatically secure sandboxes because the level of isolation provided depends very much on these three factors. A kernel vulnerability, a bug in the runtime implementation, and/or a runtime misconfiguration might allow a malicious workload to modify files and execute code on its host."

| Axis | Attacker-weaponised path | Example |
| --- | --- | --- |
| Kernel vulnerability | Bug in the kernel code that implements namespaces / cgroups / seccomp / SELinux / AppArmor | Dirty COW, Dirty Pipe |
| Runtime implementation bug | Bug in systems/runc / systems/docker / containerd | CVE-2019-5736 (runC host-binary-replacement) |
| Runtime misconfiguration | Wide-open capability set, privileged container, host-mounted sockets, --net=host | Operator choice |

Unlike concepts/vm-escape, where the hypervisor attack surface is mostly unmodifiable by users, the container-escape surface includes a configuration axis that the operator owns. Most recorded container escapes that are not driven by a kernel CVE trace back to misconfiguration.

Why the kernel-vulnerability axis is bigger than the hypervisor one

Figma's framing is explicit:

"the attack surface of a hypervisor is usually smaller than for an OS kernel, or discuss the number of kernel exploits in recent years that would have allowed a container escape."

The Linux kernel is large, includes many subsystems, and ships many features that a given workload never uses. A hypervisor (especially a minimal VMM like Firecracker) is a small piece of code with a tightly scoped role. Bugs accumulate in both over time; the rate is higher in the kernel largely because its surface is larger.

gVisor attacks this asymmetry by interposing a reimplemented user-space kernel between the container and the host kernel. An escape then has to defeat both gVisor and the host kernel; the trade-off is reduced performance and compatibility.

Configuration axis: the most common escape cause

Misconfiguration shapes that have produced real escapes:

  • Privileged containers (--privileged): drops almost all isolation. All capabilities are granted, host devices are exposed, and the default masking of sensitive /proc and /sys paths is lifted.
  • Host network namespace (--net=host): the container sees the host's network stack as its own.
  • Docker socket mounted into a container: the container can create sibling containers with arbitrary configuration, which makes host takeover trivial.
  • Wide capability set: for example, keeping CAP_SYS_ADMIN when the workload doesn't need it.
  • Default seccomp profile disabled: loses the syscall-filter layer entirely.
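The shapes listed above can be checked mechanically. The following is a minimal sketch, not a complete audit tool: it inspects one entry of `docker inspect` JSON output and reports which of the five shapes are present. The function name `audit_container`, the finding labels, and the `RISKY_CAPS` set are assumptions made for illustration, and the sketch assumes the Docker socket lives at its conventional path.

```python
from typing import Any

# Conventional socket path; hosts that expose it elsewhere need a different value.
DOCKER_SOCKET = "/var/run/docker.sock"
# Capabilities treated as "wide" in this sketch (an illustrative choice).
RISKY_CAPS = {"ALL", "SYS_ADMIN", "SYS_MODULE", "SYS_PTRACE"}


def audit_container(inspect: dict[str, Any]) -> list[str]:
    """Return misconfiguration findings for one `docker inspect` entry."""
    host = inspect.get("HostConfig") or {}
    findings: list[str] = []

    if host.get("Privileged"):
        findings.append("privileged")
    if host.get("NetworkMode") == "host":
        findings.append("host-network")

    # Bind mounts appear as "source:destination[:mode]" strings in HostConfig.Binds
    # and as objects with a "Source" key in the top-level Mounts list.
    sources = {bind.split(":", 1)[0] for bind in host.get("Binds") or []}
    sources |= {m.get("Source") for m in inspect.get("Mounts") or []}
    if DOCKER_SOCKET in sources:
        findings.append("docker-socket")

    # CapAdd entries may be written with or without the CAP_ prefix.
    added = {cap.upper().removeprefix("CAP_") for cap in host.get("CapAdd") or []}
    risky = sorted(added & RISKY_CAPS)
    if risky:
        findings.append("wide-capabilities:" + ",".join(risky))

    # Both "seccomp=unconfined" and the older "seccomp:unconfined" spelling occur.
    for opt in host.get("SecurityOpt") or []:
        if opt.replace(":", "=", 1) == "seccomp=unconfined":
            findings.append("seccomp-unconfined")
            break

    return findings
```

In practice the input would come from parsing the output of `docker inspect <container>`, which is a JSON array with one object per container. An empty result only means none of these five shapes was found, not that the container is safely configured.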

Modern Docker / Kubernetes defaults are more secure than they used to be, but the operator is still responsible for verifying them. Figma's framing: "Unlike commodity VM solutions, containers place a much greater responsibility on the user to correctly configure the desired level of isolation. More control over security configuration also means more room to make mistakes."
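One example of configuration the operator can own directly is the seccomp profile. The sketch below builds a deny-by-default allowlist in the JSON format Docker accepts through `--security-opt seccomp=<file>`. The helper name `build_seccomp_profile` and the syscall list are assumptions for illustration; the list is far too small for most real programs, and production profiles usually also declare an `architectures` field.

```python
import json


def build_seccomp_profile(allowed_syscalls: list[str]) -> dict:
    """Build an allowlist profile: every syscall fails unless it is named here."""
    return {
        # Syscalls not matched by any rule below return an error to the caller.
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


# Illustrative only: a real workload needs a list derived from its actual behaviour.
EXAMPLE_ALLOWLIST = [
    "read", "write", "exit", "exit_group",
    "brk", "mmap", "munmap", "rt_sigreturn",
]

# Serialised form that could be written to a file and passed to the runtime.
PROFILE_JSON = json.dumps(build_seccomp_profile(EXAMPLE_ALLOWLIST), indent=2)
```

An allowlist inverts the default posture: instead of blocking known-dangerous syscalls, it exposes only the kernel entry points the workload is known to need, which narrows the kernel-vulnerability axis described earlier.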

Defence in depth: don't rely on container-escape resistance alone

Even if the container boundary holds, a compromised workload inside can still:

  • Make outbound network calls (exfiltration), unless the container's network namespace is locked down and egress filtering is in place.
  • Reach other services using the container's credentials.
  • Persist state for the container's lifetime.

So container integrity is necessary but not sufficient. This has the same shape as concepts/vm-escape's "even if the hypervisor holds" framing. The containment practices Figma recommends:

  • Place the container in its own isolated network: orchestration passes input in and reads output over controlled channels, and the container has no other reach.
  • No mounted credentials: no cloud-provider instance-profile access, no mounted secrets, no host paths.

Both are application-layer design decisions on top of the container runtime's own isolation.
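The two practices can be sketched as a launcher that builds a `docker run` command line. This is a minimal illustration under stated assumptions, not Figma's configuration: it uses `--network none` (stricter than a dedicated isolated network), passes input on stdin and reads output from stdout, and mounts nothing. The names `sandbox_argv` and `run_sandboxed` are hypothetical, and the optional `runtime` argument (for example `runsc` for gVisor) only works if that runtime is installed and registered with the Docker daemon.

```python
import subprocess
from typing import Optional


def sandbox_argv(image: str, command: list[str], runtime: Optional[str] = None) -> list[str]:
    """Build a `docker run` argv with no network, no mounts, and no extra privileges."""
    argv = [
        "docker", "run",
        "--rm",                                  # remove the container when it exits
        "-i",                                    # keep stdin open: the only input channel
        "--network", "none",                     # no interfaces other than loopback
        "--cap-drop", "ALL",                     # start from an empty capability set
        "--security-opt", "no-new-privileges",   # block privilege gain via setuid binaries
        "--read-only",                           # read-only root filesystem
    ]
    if runtime:
        argv += ["--runtime", runtime]
    # Deliberately no volume, mount, or env flags: nothing from the host is passed in.
    return argv + [image, *command]


def run_sandboxed(image: str, command: list[str], payload: bytes,
                  timeout_s: float = 30.0) -> bytes:
    """Send payload on stdin and return stdout; raises on non-zero exit or timeout."""
    result = subprocess.run(
        sandbox_argv(image, command),
        input=payload,
        stdout=subprocess.PIPE,
        timeout=timeout_s,
        check=True,
    )
    return result.stdout
```

Keeping argv construction separate from execution makes the isolation settings easy to review and to assert on in tests without a running Docker daemon.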

Seen in
