

nsjail

nsjail (github.com/google/nsjail) is a Google-originated open-source command-line tool that stacks five Linux isolation primitives — namespaces, capabilities, filesystem restrictions, cgroups / resource limits, and seccomp — into a single process-launcher. It is the canonical example of layered composition of containerisation primitives with seccomp-bpf syscall filtering: "seccomp can be combined with containerization to provide robust, multilayered sandbox-focused systems." (Source: sources/2026-04-21-figma-server-side-sandboxing-containers-and-seccomp)

Use case: Figma RenderServer

At Figma, nsjail is the production sandbox for RenderServer, a C++ server version of the Figma editor used to produce thumbnails and to convert Figma files to images and SVGs. "At Figma, we now use nsjail for use cases where container-level security isolation is appropriate."

Figma explicitly chose nsjail over Docker because nsjail could be adopted as a drop-in solution:

"we would need to create a new service that sandboxes the RenderServer binary inside a secure Docker configuration, create an orchestration system to manage the service, and re-architect various services to make a network call to the RenderServer service instead of invoking the binary directly. A separate service might be a reasonable long-term investment but didn't allow us the flexibility to explore different options based on evolving needs. Instead, we adopted nsjail as a drop-in solution so that we could focus our efforts on securely configuring it for our needs."

For each user request, nsjail starts a new RenderServer process in:

  • New user, pid, mount, and network namespaces (concepts/linux-namespaces) — no host-visible identity, no visible processes, no host filesystem, no network.
  • No network access — a stronger statement than "restricted egress"; the process cannot open a socket.
  • Specific filesystem mount points only — input file, libraries, output folder.
  • seccomp-bpf enforcing a strict syscall allowlist (concepts/seccomp).

This composition means the RenderServer process has no path to the outside world short of a kernel bug in one of the five primitives — defence-in-depth at the kernel-primitive layer.
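The per-request launch described above can be sketched as a small helper that assembles an nsjail command line. This is a minimal illustration, not Figma's configuration: the function name, every path, and the choice of which limits to set are hypothetical, and only flags listed in nsjail's own help output are used (`--mode`, `--bindmount_ro`, `--bindmount`, `--seccomp_policy`, `--rlimit_fsize`). Check the nsjail documentation for the defaults that apply to your version, including which namespaces are created when no `--disable_clone_new*` flag is given.

```python
from typing import List, Optional


def build_nsjail_argv(
    binary: str,
    binary_args: List[str],
    input_file: str,
    output_dir: str,
    readonly_mounts: List[str],
    seccomp_policy_file: Optional[str] = None,
    rlimit_fsize_mb: Optional[int] = None,
) -> List[str]:
    """Assemble an nsjail command line for one sandboxed invocation.

    All paths are supplied by the caller; nothing here reflects a real
    production layout.
    """
    # "--mode o" asks nsjail to run the program once and then exit.
    argv = ["nsjail", "--mode", "o"]

    # Libraries and the input file are exposed read-only.
    for path in readonly_mounts + [input_file]:
        argv += ["--bindmount_ro", path]

    # The output folder is the only writable mount.
    argv += ["--bindmount", output_dir]

    # Optional seccomp-bpf policy file (nsjail reads Kafel syntax).
    if seccomp_policy_file is not None:
        argv += ["--seccomp_policy", seccomp_policy_file]

    # Optional file-size limit; nsjail expresses this value in MB.
    if rlimit_fsize_mb is not None:
        argv += ["--rlimit_fsize", str(rlimit_fsize_mb)]

    # Everything after "--" is the sandboxed program and its arguments.
    argv += ["--", binary] + binary_args
    return argv
```

A caller would pass the resulting list to `subprocess.run(argv)`, so the sandbox lives and dies with a single child process rather than a long-running service.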

Startup cost

Disclosed latency profile (Figma, 2026):

  • Typical: small fractions of a second — tens to low hundreds of milliseconds.
  • Long tail: "there is, however, still a long tail of startup times."
  • Language runtime init: can take "substantially longer" — the sandbox is fast but the workload it starts may not be.

This puts nsjail orders of magnitude below a full VM cold start and orders of magnitude above a seccomp-only sandbox applied to an already-warm process.
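Because the profile has both a typical case and a long tail, a single average hides the interesting part. One way to examine it is to time repeated launches and report the median alongside a high percentile. The sketch below is an assumption about how one might measure this, not how Figma did. The function names are hypothetical, and the command being timed is whatever the caller supplies.

```python
import math
import statistics
import subprocess
import time
from typing import Dict, List


def measure_startup(cmd: List[str], runs: int = 20) -> List[float]:
    """Run cmd repeatedly and return wall-clock seconds per run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples: List[float]) -> Dict[str, float]:
    """Report the median, the nearest-rank 99th percentile, and the maximum."""
    ordered = sorted(samples)
    # Nearest-rank percentile: the smallest value with at least 99%
    # of the samples at or below it.
    rank = max(1, math.ceil(99 * len(ordered) / 100))
    return {
        "p50": statistics.median(ordered),
        "p99": ordered[rank - 1],
        "max": ordered[-1],
    }
```

Timing a command that exits immediately isolates the launcher's own overhead; timing the real workload adds the language-runtime initialisation that the source warns can take substantially longer.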

Configuration foot-guns encountered in production

Two documented surprises from the Figma rollout:

  • rlimit_fsize default is 1 MB. On initial deployment, large-image inputs produced "output files that were exactly 1 MB in size. Very suspicious!" Root cause: nsjail's default file-size resource limit; fix was a one-line config change after "we had not read the nsjail documentation carefully enough."

  • Seccomp allowlist needed several iterations during rollout. "We hit very rare codepaths in the complex RenderServer codebase — which serves user traffic at large scale — that we didn't encounter during testing or internal use." Seccomp violations kill the process with minimal context (kernel log names the failing syscall, nothing more), so each failure required a round of investigation.
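The "exactly 1 MB" symptom in the rlimit_fsize item follows from how the kernel enforces RLIMIT_FSIZE: writes succeed up to the limit and fail once the file reaches it. The sketch below reproduces that behaviour on Linux without nsjail, by setting the limit on a child Python process directly. The 1 MiB value, the helper name, and the child script are assumptions chosen for illustration.

```python
import os
import resource
import subprocess
import sys

ONE_MIB = 1024 * 1024

# Child program: write the requested number of bytes in 64 KiB chunks.
CHILD_SOURCE = """
import os, sys
fd = os.open(sys.argv[1], os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
remaining = int(sys.argv[2])
chunk = b"x" * 65536
try:
    while remaining > 0:
        remaining -= os.write(fd, chunk[:remaining])
except OSError:
    sys.exit(1)  # the write is refused once the file is at the limit
"""


def write_under_fsize_limit(path, nbytes, limit=ONE_MIB):
    """Try to write nbytes to path in a child capped by RLIMIT_FSIZE.

    Returns (resulting file size in bytes, child exit status).
    """
    def apply_limit():
        # Runs in the child between fork and exec, so the parent's
        # own limits are left untouched.
        resource.setrlimit(resource.RLIMIT_FSIZE, (limit, limit))

    proc = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE, path, str(nbytes)],
        preexec_fn=apply_limit,
    )
    return os.path.getsize(path), proc.returncode
```

Asking for a 3 MiB file under a 1 MiB limit should leave a file that stops at exactly the limit, with the child exiting non-zero. A suspiciously round output size is therefore a useful signal to check resource limits before suspecting the renderer.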

Positioning

  • vs Docker: same underlying primitives (namespaces / cgroups / seccomp / MAC), but nsjail is a per-invocation launcher rather than an image-based container-orchestration platform. No image, no daemon, no service-reorchestration tax: the caller launches nsjail as a child process and gets the isolation.
  • vs firejail: same layered-composition design. firejail is SUID-based (runs as root-setuid from user context), whereas nsjail is command-line-driven and more common in server-side sandboxing.
  • vs gVisor: gVisor interposes an entire userspace-reimplemented kernel between the container and the host kernel to shrink attack surface further. nsjail does not — it relies on the Linux kernel's own correctness.
  • vs seccomp-only (concepts/syscall-allowlist): nsjail gives you defence in depth (five primitives); seccomp-only is lighter and faster, but you must either (a) accept coarser allowlists, or (b) rewrite your program so the seccomp filter can land at a sharper point in the execution.
