CONCEPT Cited by 1 source
Container escape¶
A container escape occurs when code inside a container breaks through the isolation boundary and obtains execution or data access on the host (or on sibling containers on the same host). It parallels concepts/vm-escape: the same concept one layer in, where the boundary being crossed is the kernel's rather than the hypervisor's.
(Source: sources/2026-04-21-figma-server-side-sandboxing-containers-and-seccomp)
Three attack-surface axes¶
Figma decomposes the container-escape surface into exactly three components:
"Three components — runtime implementation, the OS primitives and interface available to the runtime, and runtime configuration — generally make up the attack surface for a container escape. By default, containers are not automatically secure sandboxes because the level of isolation provided depends very much on these three factors. A kernel vulnerability, a bug in the runtime implementation, and/or a runtime misconfiguration might allow a malicious workload to modify files and execute code on its host."
| Axis | Attacker-weaponised path | Example |
|---|---|---|
| Kernel vulnerability | Bug in the kernel code that implements namespaces / cgroups / seccomp / SELinux / AppArmor | Dirty COW, Dirty Pipe |
| Runtime implementation bug | Bug in systems/runc / systems/docker / containerd | CVE-2019-5736 (runC host-binary-replacement) |
| Runtime misconfiguration | Wide-open capability set, privileged container, host-mounted sockets, --net=host | Operator choice |
Unlike concepts/vm-escape, where the hypervisor attack surface is mostly unmodifiable by users, the container-escape surface includes a configuration axis the operator owns. Most recorded container escapes that aren't kernel-CVE driven are misconfigurations.
Why the kernel-vulnerability axis is bigger than the hypervisor one¶
Figma's framing is explicit:
"the attack surface of a hypervisor is usually smaller than for an OS kernel, or discuss the number of kernel exploits in recent years that would have allowed a container escape."
The Linux kernel is large, includes many subsystems, and ships many features a given workload never uses. The hypervisor (especially a minimal VMM like Firecracker) is a small piece of code with a tightly scoped role. Bugs accumulate in both over time; the rate is higher in the kernel simply because its surface is larger.
gVisor attacks this asymmetry by interposing a reimplemented user-space kernel between the container and the host kernel — container escapes now have to defeat gVisor and the host kernel, at the cost of performance and compatibility.
Configuration axis: the most common escape cause¶
Misconfiguration shapes that have produced real escapes:
- Privileged containers (`--privileged`): drop almost all isolation; mount tree includes host `/proc`, `/sys`.
- Host network namespace (`--net=host`): container sees the host's network stack as its own.
- Docker socket mounted into a container: container can create sibling containers with arbitrary config, trivially achieving host takeover.
- Wide capability set: keeping `CAP_SYS_ADMIN` when the workload doesn't need it.
- Default seccomp profile disabled: loses the syscall-filter layer entirely.
Modern Docker / Kubernetes defaults are more secure than they were; the operator is still responsible for checking. Figma's framing: "Unlike commodity VM solutions, containers place a much greater responsibility on the user to correctly configure the desired level of isolation. More control over security configuration also means more room to make mistakes."
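The misconfiguration shapes listed above can be checked mechanically. The following is a minimal sketch, not Figma's tooling: it assumes the input is one entry of `docker inspect` JSON output and reads the `HostConfig.Privileged`, `HostConfig.NetworkMode`, `HostConfig.CapAdd`, `HostConfig.SecurityOpt`, and `Mounts` fields. The function name `audit_container` and the finding strings are hypothetical, and the socket check assumes the default path `/var/run/docker.sock`.

```python
from typing import Any

# Assumption: the Docker socket lives at its default path.
DOCKER_SOCKET = "/var/run/docker.sock"


def audit_container(inspect: dict[str, Any]) -> list[str]:
    """Return findings for one `docker inspect` entry (hypothetical helper)."""
    host = inspect.get("HostConfig") or {}
    findings: list[str] = []

    if host.get("Privileged"):
        findings.append("privileged container")

    if host.get("NetworkMode") == "host":
        findings.append("host network namespace")

    # CapAdd may be null, and names may or may not carry the CAP_ prefix.
    caps = {c.upper().removeprefix("CAP_") for c in (host.get("CapAdd") or [])}
    if caps & {"SYS_ADMIN", "ALL"}:
        findings.append("wide capability set")

    # Accept both "seccomp=unconfined" and the older "seccomp:unconfined".
    opts = [o.replace(":", "=", 1) for o in (host.get("SecurityOpt") or [])]
    if "seccomp=unconfined" in opts:
        findings.append("seccomp profile disabled")

    # Only an exact source-path match is detected; symlinked paths are not.
    sources = [m.get("Source") for m in (inspect.get("Mounts") or [])]
    if DOCKER_SOCKET in sources:
        findings.append("Docker socket mounted")

    return findings


if __name__ == "__main__":
    example = {
        "HostConfig": {
            "Privileged": False,
            "NetworkMode": "host",
            "CapAdd": ["SYS_ADMIN"],
            "SecurityOpt": ["seccomp=unconfined"],
        },
        "Mounts": [{"Source": DOCKER_SOCKET, "Destination": DOCKER_SOCKET}],
    }
    for finding in audit_container(example):
        print(finding)
```

An empty result only means none of these five shapes matched. It says nothing about the kernel-vulnerability or runtime-implementation axes.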
Defence in depth: don't rely on container-escape resistance alone¶
Even if the container boundary holds, a compromised workload inside can still:
- Make outbound network calls (exfiltration), unless the container's network namespace blocks egress and egress filters are in place.
- Reach other services using the container's credentials.
- Persist state for the container's lifetime.
So container integrity is necessary but not sufficient. Same shape as concepts/vm-escape's "even if the hypervisor holds" framing. The containment practices Figma recommends:
- Place the container in its own isolated network: orchestration passes input in and reads output on controlled channels, and the container has no other reach.
- No mounted credentials: no cloud-provider instance-profile access, no mounted secrets, no host paths.
Both are application-layer design decisions on top of the container runtime's own isolation.
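As a rough illustration of the two practices, the sketch below builds a `docker run` command line with no network, no volume mounts, and no environment variables, and uses stdin and stdout as the only controlled channels. It is an assumption-laden stand-in rather than Figma's setup: `--network none` is stricter than "its own isolated network", the helper names `sandbox_argv` and `run_sandboxed` are hypothetical, and the image and command are placeholders supplied by the caller.

```python
import subprocess


def sandbox_argv(image: str, command: list[str]) -> list[str]:
    """Build a `docker run` command line with no network and no mounts."""
    return [
        "docker", "run",
        "--rm",                                 # discard container state on exit
        "-i",                                   # keep stdin open as the input channel
        "--network", "none",                    # loopback only, no egress
        "--cap-drop", "ALL",                    # drop every capability
        "--security-opt", "no-new-privileges",  # block privilege gain via setuid binaries
        "--read-only",                          # read-only root filesystem
        # Deliberately no -v / --mount / -e flags: nothing from the host is shared.
        image, *command,
    ]


def run_sandboxed(image: str, command: list[str], payload: bytes,
                  timeout_s: float = 30.0) -> bytes:
    """Send payload on stdin and return whatever the workload writes to stdout."""
    result = subprocess.run(
        sandbox_argv(image, command),
        input=payload,
        capture_output=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

With no network interface beyond loopback, the workload also cannot reach a cloud provider's instance metadata endpoint, which covers the instance-profile case without any extra filtering.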
Seen in¶
- sources/2026-04-21-figma-server-side-sandboxing-containers-and-seccomp — canonical decomposition of the three attack-surface axes, the hypervisor-vs-kernel surface comparison, and the defence-in-depth framing.