CONCEPT Cited by 1 source

Container escape¶

A container escape occurs when code inside a container breaks through the isolation boundary and obtains execution or data access on the host (or on sibling containers on the same host). It parallels concepts/vm-escape: the same idea at a different layer, where the boundary being crossed is enforced by the host kernel rather than by a hypervisor.

(Source: sources/2026-04-21-figma-server-side-sandboxing-containers-and-seccomp)

Three attack-surface axes¶

Figma decomposes the container-escape surface into three components:

"Three components — runtime implementation, the OS primitives and interface available to the runtime, and runtime configuration — generally make up the attack surface for a container escape. By default, containers are not automatically secure sandboxes because the level of isolation provided depends very much on these three factors. A kernel vulnerability, a bug in the runtime implementation, and/or a runtime misconfiguration might allow a malicious workload to modify files and execute code on its host."

Axis	Attacker-weaponised path	Example
Kernel vulnerability	Bug in kernel code reachable from the container, whether in the isolation primitives (namespaces / cgroups / seccomp / SELinux / AppArmor) or in unrelated subsystems	Dirty COW, Dirty Pipe
Runtime implementation bug	Bug in systems/runc / systems/docker / containerd	CVE-2019-5736 (runC host-binary-replacement)
Runtime misconfiguration	Wide-open capability set, privileged container, host-mounted sockets, `--net=host`	Operator choice

Unlike concepts/vm-escape, where the hypervisor attack surface is mostly unmodifiable by users, the container-escape surface includes a configuration axis that the operator owns. Most recorded container escapes that aren't kernel-CVE driven stem from misconfiguration.

Why the kernel-vulnerability axis is bigger than the hypervisor one¶

Figma's framing is explicit:

"the attack surface of a hypervisor is usually smaller than for an OS kernel, or discuss the number of kernel exploits in recent years that would have allowed a container escape."

The Linux kernel is large, includes many subsystems, and ships many features a given workload never uses. A hypervisor (especially a minimal VMM like Firecracker) is a small piece of code with a tightly scoped role. Bugs accumulate in both over time; the rate is higher in the kernel simply because its surface is larger.

gVisor attacks this asymmetry by interposing a reimplemented user-space kernel between the container and the host kernel. An escape now has to defeat both gVisor and the host kernel; the trade-off is reduced performance and compatibility.

Configuration axis: the most common escape cause¶

Misconfiguration shapes that have produced real escapes:

Privileged containers (--privileged): drop almost all isolation by granting every capability, exposing host devices, and disabling the default seccomp and AppArmor confinement.
Host network namespace (--net=host): the container sees the host's network stack as its own.
Docker socket mounted into a container: the container can create sibling containers with arbitrary config, trivially achieving host takeover.
Wide capability set: keeping CAP_SYS_ADMIN when the workload doesn't need it.
Default seccomp profile disabled: loses the syscall-filter layer entirely.

Modern Docker / Kubernetes defaults are more secure than they were, but the operator is still responsible for verifying the configuration that is actually in effect. Figma's framing: "Unlike commodity VM solutions, containers place a much greater responsibility on the user to correctly configure the desired level of isolation. More control over security configuration also means more room to make mistakes."
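The misconfiguration shapes listed above can be checked mechanically. The following is a minimal sketch, not Figma's tooling: it assumes the input is one element of the JSON array printed by `docker inspect`, and it reads the `HostConfig.Privileged`, `HostConfig.NetworkMode`, `HostConfig.Binds`, `HostConfig.CapAdd`, `HostConfig.SecurityOpt` and `Mounts` fields. The function name `audit_container` and the `RISKY_CAPS` set are hypothetical, and the set of capabilities treated as "wide" is an illustrative choice rather than an authoritative list.

```python
# Capabilities flagged as "wide" in this sketch (illustrative assumption).
RISKY_CAPS = {"SYS_ADMIN", "SYS_PTRACE", "SYS_MODULE", "ALL"}


def audit_container(inspect_entry: dict) -> list[str]:
    """Return findings for one element of `docker inspect` JSON output."""
    host_config = inspect_entry.get("HostConfig") or {}
    findings = []

    if host_config.get("Privileged"):
        findings.append("privileged container")

    if host_config.get("NetworkMode") == "host":
        findings.append("host network namespace")

    # Binds entries look like "src:dst[:options]"; Mounts entries carry "Source".
    sources = [b.split(":", 1)[0] for b in host_config.get("Binds") or []]
    sources += [m.get("Source", "") for m in inspect_entry.get("Mounts") or []]
    if any(s.endswith("docker.sock") for s in sources):
        findings.append("Docker socket mounted")

    # Normalise "CAP_SYS_ADMIN" and "SYS_ADMIN" to the same spelling.
    caps = {c.upper().removeprefix("CAP_") for c in host_config.get("CapAdd") or []}
    for cap in sorted(caps & RISKY_CAPS):
        findings.append(f"wide capability: {cap}")

    # Match both "seccomp=unconfined" and the older "seccomp:unconfined".
    for opt in host_config.get("SecurityOpt") or []:
        if opt.replace(":", "=") == "seccomp=unconfined":
            findings.append("seccomp profile disabled")

    return findings
```

An empty result only means none of these five shapes was detected; it says nothing about the kernel or runtime-implementation axes. The sketch requires Python 3.9 or later for `str.removeprefix`.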

Defence in depth: don't rely on container-escape resistance alone¶

Even if the container boundary holds, a compromised workload inside can still:

Make outbound network calls (exfiltration), unless the container's network namespace and egress filtering block outbound traffic.
Reach other services using the container's credentials.
Persist state for the container's lifetime.

So container integrity is necessary but not sufficient. This has the same shape as concepts/vm-escape's "even if the hypervisor holds" framing. The containment practices Figma recommends:

Place the container in its own isolated network: orchestration passes input in and reads output on controlled channels, and the container has no other reach.
No mounted credentials: no cloud-provider instance-profile access, no mounted secrets, no host paths.

Both are application-layer design decisions on top of the container runtime's own isolation.
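As a rough illustration of those two practices, the sketch below builds a `docker run` command line for an untrusted job. It is an assumption-laden example, not Figma's configuration: the function names `sandbox_argv` and `run_job` are hypothetical, `--network none` is a stricter stand-in for "its own isolated network", and stdin/stdout are assumed to be the only controlled channels. No volume, mount, or environment flags are emitted, so nothing from the host (credentials included) is passed in.

```python
import subprocess


def sandbox_argv(image: str, command: list[str]) -> list[str]:
    """Build a `docker run` argument vector for an untrusted job."""
    return [
        "docker", "run",
        "--rm",                                  # remove the container on exit
        "-i",                                    # keep stdin open for job input
        "--network", "none",                     # loopback only, no egress
        "--cap-drop", "ALL",                     # drop every Linux capability
        "--security-opt", "no-new-privileges",   # block privilege gain via setuid
        "--read-only",                           # read-only root filesystem
        image, *command,
    ]


def run_job(image: str, command: list[str], payload: bytes, timeout_s: float) -> bytes:
    """Send payload on stdin and return stdout; raises on failure or timeout."""
    result = subprocess.run(
        sandbox_argv(image, command),
        input=payload,
        capture_output=True,
        timeout=timeout_s,
        check=True,
    )
    return result.stdout
```

Keeping the argument list in one function makes it easy to review in one place that no host path, socket, or secret is ever added to the command line.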

Seen in¶

sources/2026-04-21-figma-server-side-sandboxing-containers-and-seccomp — canonical decomposition of the three attack-surface axes, the hypervisor-vs-kernel surface comparison, and the defence-in-depth framing.