nsjail¶
nsjail (github.com/google/nsjail) is a Google-originated open-source command-line tool that stacks five Linux isolation primitives — namespaces, capabilities, filesystem restrictions, cgroups / resource limits, and seccomp — into a single process-launcher. It is the canonical example of layered composition of containerisation primitives with seccomp-bpf syscall filtering: "seccomp can be combined with containerization to provide robust, multilayered sandbox-focused systems." (Source: sources/2026-04-21-figma-server-side-sandboxing-containers-and-seccomp)
Use case: Figma RenderServer¶
At Figma, nsjail is the production sandbox for RenderServer — a C++ server version of the Figma editor used to produce thumbnails / convert Figma files to images / SVGs. "At Figma, we now use nsjail for use cases where container-level security isolation is appropriate."
Figma explicitly chose nsjail as a drop-in solution rather than building on Docker, which would have required far more surrounding work:
"we would need to create a new service that sandboxes the RenderServer binary inside a secure Docker configuration, create an orchestration system to manage the service, and re-architect various services to make a network call to the RenderServer service instead of invoking the binary directly. A separate service might be a reasonable long-term investment but didn't allow us the flexibility to explore different options based on evolving needs. Instead, we adopted nsjail as a drop-in solution so that we could focus our efforts on securely configuring it for our needs."
For each user request, nsjail starts a new RenderServer process with:
- New user, pid, mount, and network namespaces (concepts/linux-namespaces) — no host-visible identity, no visible processes, no host filesystem, no network.
- No network access: a stronger guarantee than "restricted egress", because the new network namespace contains no interface that can reach the host or the internet.
- Specific filesystem mount points only — input file, libraries, output folder.
- seccomp-bpf enforcing a strict syscall allowlist (concepts/seccomp).
This composition means the RenderServer process has no path to the outside world short of a kernel bug in one of the five primitives — defence-in-depth at the kernel-primitive layer.
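The composition above can be sketched as a small Python helper that assembles an nsjail command line. This is a minimal sketch, not Figma's configuration: the `JailSpec` type, the `build_nsjail_argv` helper, and every path in the usage note are hypothetical. It assumes nsjail's long options `--mode`, `--bindmount_ro`, `--bindmount`, `--rlimit_fsize`, `--time_limit`, and `--seccomp_policy`, and it assumes nsjail creates new user, pid, mount, and network namespaces by default unless a `--disable_clone_new*` flag is passed. Check both assumptions against `nsjail --help` for the installed version.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JailSpec:
    """Hypothetical description of one sandboxed invocation."""
    binary: str
    args: List[str] = field(default_factory=list)
    readonly_mounts: List[str] = field(default_factory=list)  # input file, libraries
    writable_mounts: List[str] = field(default_factory=list)  # output folder
    seccomp_policy_file: Optional[str] = None                 # syscall allowlist policy
    rlimit_fsize_mb: int = 1                                  # nsjail documents this value in MB
    time_limit_s: int = 30


def build_nsjail_argv(spec: JailSpec, nsjail: str = "nsjail") -> List[str]:
    """Return an argv list that launches spec.binary once inside nsjail."""
    # "o" is the run-once mode: one jailed process per invocation.
    argv = [nsjail, "--mode", "o"]
    # No --disable_clone_new* flags are added, so the namespaces nsjail
    # enables by default (assumed: user, pid, mount, net) stay enabled.
    for path in spec.readonly_mounts:
        argv += ["--bindmount_ro", path]
    for path in spec.writable_mounts:
        argv += ["--bindmount", path]
    argv += ["--rlimit_fsize", str(spec.rlimit_fsize_mb)]
    argv += ["--time_limit", str(spec.time_limit_s)]
    if spec.seccomp_policy_file is not None:
        argv += ["--seccomp_policy", spec.seccomp_policy_file]
    # Everything after "--" is the command executed inside the jail.
    argv += ["--", spec.binary, *spec.args]
    return argv
```

A caller would pass the returned list to `subprocess.run` in place of the bare binary invocation, for example with a hypothetical `JailSpec(binary="/opt/render/render_server", readonly_mounts=["/usr/lib", "/work/in.fig"], writable_mounts=["/work/out"])`. Only the listed paths are mounted, which mirrors the "specific filesystem mount points only" item above.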
Startup cost¶
Disclosed latency profile (Figma, 2026):
- Typical: small fractions of a second — tens to low hundreds of milliseconds.
- Long tail: "there is, however, still a long tail of startup times."
- Language runtime init: can take "substantially longer" — the sandbox is fast but the workload it starts may not be.
This puts nsjail startup orders of magnitude below a full VM cold-start and orders of magnitude above a seccomp-only sandbox applied to an already-warm process.
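Because the profile above distinguishes a typical case from a long tail, it helps to look at percentiles rather than a mean when measuring startup cost. The following is a minimal sketch of such a measurement, assuming the sandboxed command can be run repeatedly with no side effects. The helper names `measure_startup`, `percentile`, and `summarize` are hypothetical, and the timing covers the whole child run (launcher plus workload), not nsjail's setup in isolation.

```python
import subprocess
import sys
import time
from typing import Dict, List


def measure_startup(argv: List[str], runs: int = 20) -> List[float]:
    """Run argv to completion `runs` times; return wall-clock seconds per run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def percentile(samples: List[float], pct: int) -> float:
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def summarize(samples: List[float]) -> Dict[str, float]:
    """Report the typical case (p50) alongside the tail (p99, max)."""
    return {
        "p50": percentile(samples, 50),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Stand-in command; replace with the real sandboxed invocation.
    print(summarize(measure_startup([sys.executable, "-c", "pass"], runs=10)))
```

Comparing the summary for the bare command against the same command wrapped in the sandbox launcher gives a rough view of how much of the tail comes from the sandbox and how much from the workload's own initialisation.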
Configuration foot-guns encountered in production¶
Two documented Figma-rollout surprises:
- rlimit_fsize default is 1 MB. On initial deployment, large-image inputs produced "output files that were exactly 1 MB in size. Very suspicious!" Root cause: nsjail's default file-size resource limit; the fix was a one-line config change after "we had not read the nsjail documentation carefully enough."
- Seccomp allowlist needed several iterations during rollout. "We hit very rare codepaths in the complex RenderServer codebase — which serves user traffic at large scale — that we didn't encounter during testing or internal use." Seccomp violations kill the process with minimal context (the kernel log names the failing syscall, nothing more), so each failure required a round of investigation.
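The "exactly 1 MB" symptom can be reproduced without nsjail, since it comes from the kernel's RLIMIT_FSIZE resource limit. The sketch below is an illustration under stated assumptions, not Figma's code: it assumes a Linux or other POSIX host, uses Python's standard `resource` module in a child process, and the `write_under_limit` helper and `CHILD` script are hypothetical names. The child tries to write 2 MiB under a 1 MiB limit; Python ignores SIGXFSZ at startup, so the write past the limit fails with an error and the process is not killed.

```python
import os
import subprocess
import sys
import tempfile
from typing import Tuple

# Script run in the child: lower the soft file-size limit, then try to
# write 32 chunks of 64 KiB (2 MiB in total) to the given path.
CHILD = r"""
import os, resource, sys
path, limit = sys.argv[1], int(sys.argv[2])
_, hard = resource.getrlimit(resource.RLIMIT_FSIZE)
resource.setrlimit(resource.RLIMIT_FSIZE, (limit, hard))
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
try:
    for _ in range(32):
        os.write(fd, b"x" * 65536)
except OSError as exc:
    print(exc.errno)  # the write that would exceed the limit fails
finally:
    os.close(fd)
"""


def write_under_limit(limit_bytes: int) -> Tuple[int, str]:
    """Return (final file size, child stdout) for a 2 MiB write attempt."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "out.bin")
        proc = subprocess.run(
            [sys.executable, "-c", CHILD, path, str(limit_bytes)],
            capture_output=True, text=True, check=False,
        )
        return os.path.getsize(path), proc.stdout.strip()


if __name__ == "__main__":
    size, child_output = write_under_limit(1 << 20)
    print(size, child_output)
```

The output file stops at the limit instead of reaching the intended 2 MiB, which is the same shape of failure as the truncated render outputs. A program that ignores write errors would leave a truncated file with no other trace, so a size that matches a round limit is worth checking against the sandbox's resource settings first.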
Positioning¶
- vs Docker: same underlying primitives (namespaces / cgroups / seccomp / MAC), but nsjail is a per-invocation launcher rather than an image-based container-orchestration platform. There is no image, no daemon, and no service re-architecture: the caller launches nsjail as a child process in place of the bare binary and gets the isolation.
- vs firejail: same layered-composition design; firejail is SUID-based (a setuid-root binary invoked from a user context), whereas nsjail is a plain command-line launcher and is more common in server-side sandboxing.
- vs gVisor: gVisor interposes an entire userspace-reimplemented kernel between the container and the host kernel to shrink attack surface further. nsjail does not — it relies on the Linux kernel's own correctness.
- vs seccomp-only (concepts/syscall-allowlist): nsjail gives you defence in depth (five primitives); seccomp-only is lighter and faster, but you must either (a) accept coarser allowlists, or (b) rewrite your program so the seccomp filter can land at a sharper point in the execution.
Seen in¶
- sources/2026-04-21-figma-server-side-sandboxing-containers-and-seccomp — Figma's production sandbox for RenderServer; drop-in alternative to Docker; canonical layered-composition example.