PATTERN Cited by 1 source
Sandboxed domain-specific expression language
Summary
When users need to inject logic into a shared process (workflow orchestrator, admission controller, policy engine, configuration evaluator), build a domain-specific subset of a familiar language, bound its runtime behaviour with structural limits (loop iterations, array size, memory), and run evaluation inside a platform sandbox that denies dangerous capabilities.
Canonical wiki instance: Netflix SEL, used inside Maestro for user-injected expressions in parameterized workflows (Source: sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator).
Problem
Multi-tenant control-plane services often need to evaluate tenant-supplied logic inline:
- A workflow orchestrator evaluates conditional-branch conditions, foreach ranges, signal-matcher predicates.
- An admission controller evaluates per-resource validation rules.
- A policy engine evaluates authorisation decisions against policy documents.
- A configuration system evaluates template expressions to compute values at deploy time.
The obvious approach — embed a general-purpose interpreter (Groovy, Python, JavaScript) — fails three ways:
- Availability — an infinite loop in one tenant's expression can stall the shared process. An unbounded array allocation OOMs the whole server.
- Security — general interpreters expose reflection, filesystem access, class loading, arbitrary syscalls. Each is a potential escape to the host.
- Reasoning — general languages are hard to reason about statically; you can't tell from reading an expression whether it'll terminate, what resources it'll consume, or what state it'll touch.
Maestro's explicit enumeration of the threat:
"Users might unintentionally write an infinite loop that creates an array and appends items to it, eventually crashing the server with out-of-memory (OOM) issues." (Source: sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator)
Solution
Three-layer defence:
Layer 1: Language subset
Pick a familiar host language (Java, Go, Python) and define a subset that excludes dangerous constructs (unbounded recursion, reflection, unrestricted loops) while retaining the expressive power the domain needs. SEL supports a subset of the Java Language Specification (JLS), focused on Maestro parameter types, datetime handling, and predefined utility methods.
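One way to picture the subset check is an allowlist applied to the parsed syntax tree. The sketch below is a minimal illustration in Python, not SEL's implementation: it borrows Python's `ast` module as a stand-in parser, and `ALLOWED_NODES`, `ALLOWED_FUNCTIONS`, and `validate_expression` are hypothetical names chosen for this example.

```python
import ast

# Hypothetical allowlist: the only syntax-tree node types an expression may contain.
ALLOWED_NODES = (
    ast.Expression, ast.Constant, ast.Name, ast.Load,
    ast.BinOp, ast.Add, ast.Sub, ast.Mult,
    ast.Compare, ast.Eq, ast.Lt, ast.Gt,
    ast.BoolOp, ast.And, ast.Or,
    ast.Call,
)
# Hypothetical allowlist of callable names (stand-ins for predefined utility methods).
ALLOWED_FUNCTIONS = {"min", "max", "len"}


def validate_expression(source: str) -> ast.Expression:
    """Parse `source` as a single expression and reject anything outside the subset."""
    tree = ast.parse(source, mode="eval")  # statements such as `while` fail to parse here
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"disallowed construct: {type(node).__name__}")
        if isinstance(node, ast.Call):
            # Only direct calls to allowlisted names; attribute calls never get this far
            # because ast.Attribute is not in ALLOWED_NODES.
            if not (isinstance(node.func, ast.Name) and node.func.id in ALLOWED_FUNCTIONS):
                raise ValueError("call to a function outside the allowlist")
    return tree
```

Under these assumptions, `validate_expression("max(a, b) + 1 > 3")` is accepted, while `().__class__`, a list comprehension, or `open('f')` raises `ValueError`. An allowlist is used rather than a denylist so that constructs added to the host language later are rejected by default.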
Layer 2: Runtime limits
Enforce bounds in the language runtime itself, not via caller checks:
- Loop-iteration limit — caps the total iterations any single expression can execute.
- Array-size limit — caps collection growth.
- Object memory limit — caps total evaluation memory.
SEL quote: "additional runtime checks, such as loop iteration limits, array size checks, object memory size limits and so on, to enhance security and reliability."
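The limits above can be sketched as counters owned by the interpreter. This is an assumed design for illustration only, not SEL's code: `EvaluationBudget`, `BoundedList`, `run_while`, and the default limit values are all hypothetical, and the object memory limit is left out for brevity.

```python
class LimitExceeded(RuntimeError):
    """Raised when an expression exceeds a runtime limit."""


class EvaluationBudget:
    """Per-evaluation limits, owned by the interpreter rather than the caller."""

    def __init__(self, max_iterations: int = 10_000, max_array_size: int = 1_000):
        self.max_iterations = max_iterations
        self.max_array_size = max_array_size
        self.iterations = 0

    def tick(self) -> None:
        """Count one loop iteration and fail once the cap is passed."""
        self.iterations += 1
        if self.iterations > self.max_iterations:
            raise LimitExceeded(f"loop iteration limit ({self.max_iterations}) exceeded")


class BoundedList:
    """List wrapper handed to expressions in place of a raw list."""

    def __init__(self, budget: EvaluationBudget):
        self._budget = budget
        self._items = []

    def append(self, item) -> None:
        if len(self._items) >= self._budget.max_array_size:
            raise LimitExceeded(f"array size limit ({self._budget.max_array_size}) exceeded")
        self._items.append(item)

    def __len__(self) -> int:
        return len(self._items)


def run_while(budget: EvaluationBudget, condition, body) -> None:
    """How an interpreter might execute a `while` node: tick before every pass."""
    while condition():
        budget.tick()
        body()


# The failure mode from the Maestro quote: an infinite loop that keeps appending.
budget = EvaluationBudget(max_iterations=10_000, max_array_size=1_000)
items = BoundedList(budget)
try:
    run_while(budget, lambda: True, lambda: items.append(0))
except LimitExceeded as exc:
    print(f"evaluation aborted: {exc}")
```

Because the checks live in `tick` and `append`, every expression is covered without callers having to remember to validate anything. In this run the array-size cap trips first, after 1,000 successful appends.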
Layer 3: Platform sandbox
Even a structurally safe language running on a general-purpose VM can escape through platform capabilities (reflection, class loading, filesystem, network). Run evaluation inside a capability-restricted sandbox:
- Java Security Manager (SEL's choice; note that it has been deprecated for removal since Java 17 under JEP 411).
- Go runtime without `unsafe` or filesystem access (Rego, CEL).
- V8 isolate without Node APIs (serverless JS runtimes).
"It leverages the Java Security Manager to restrict access, ensuring a secure and controlled environment for code execution." (Source: sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator)
Tradeoffs
| Axis | Gain | Cost |
|---|---|---|
| Safety | Tenant expressions cannot crash the process | Users can't use host-language features outside the subset |
| Auditability | Expressions are statically analysable | One-time language + interpreter build cost |
| Performance | Bounded evaluation = predictable latency | Harder to optimise than native code |
| Maintenance | Subset is stable; no churn | Maintainers must keep subset current with host-language evolution |
| Learning | Familiar syntax; minimal friction | Users occasionally hit the subset boundary and must refactor |
Structural variants
| Variant | Host language | Example |
|---|---|---|
| Orchestrator expressions | JLS subset | SEL |
| Admission controller rules | C-like expression syntax with protobuf-based types | Kubernetes CEL |
| Authorisation policies | Datalog-inspired | Open Policy Agent Rego |
| Template expressions | JSON/YAML functions; Python-like templates | AWS CloudFormation intrinsic functions, Jinja sandboxed environment |
| Serverless plugins | JavaScript in a V8 isolate; WebAssembly | Cloudflare Workers (V8 isolates), Fastly Compute (WebAssembly) |
When not to use this pattern
- Full tenant containers — if tenant logic runs in its own isolated container / VM, you don't need a safe DSL; the container IS the sandbox. Temporal's approach: tenant workflow code runs in the tenant's own worker process.
- Compile-ahead static configuration — if tenant input can be compiled ahead of time into data (not code), you avoid dynamic evaluation entirely.
- Small expression surface — if users only need arithmetic / boolean / comparison ops, a tiny custom grammar is simpler than language-subset engineering.
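For the small-expression-surface case, the sketch below shows roughly how little code a tiny custom grammar needs. It is an illustrative recursive-descent evaluator for arithmetic, comparison, and boolean operators over named variables. The grammar, token rules, and the `evaluate` function are assumptions made for this example and are not taken from any of the systems named on this page.

```python
import operator
import re

# One token per match: a number, a name (variable or keyword), or an operator.
TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([A-Za-z_]\w*)|(<=|>=|==|!=|[-+*/()<>]))")
COMPARE = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
           ">=": operator.ge, "==": operator.eq, "!=": operator.ne}
ARITH = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}
KEYWORDS = {"and", "or", "not"}


def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at offset {pos}")
        number, name, op = match.groups()
        if number is not None:
            tokens.append(("num", float(number)))
        elif name is not None:
            tokens.append(("name", name))
        else:
            tokens.append(("op", op))
        pos = match.end()
    return tokens


class Evaluator:
    """Evaluates while parsing; precedence: or < and < not < comparison < +,- < *,/."""

    def __init__(self, text, variables):
        self.tokens, self.pos, self.variables = tokenize(text), 0, variables

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("end", None)

    def accept(self, kind, *values):
        token = self.peek()
        if token[0] == kind and token[1] in values:
            self.pos += 1
            return token[1]
        return None

    def parse(self):
        value = self.or_expr()
        if self.peek()[0] != "end":
            raise SyntaxError("unexpected trailing input")
        return value

    def or_expr(self):
        value = self.and_expr()
        while self.accept("name", "or"):
            right = self.and_expr()  # both sides are always evaluated (no short-circuit)
            value = bool(value) or bool(right)
        return value

    def and_expr(self):
        value = self.not_expr()
        while self.accept("name", "and"):
            right = self.not_expr()
            value = bool(value) and bool(right)
        return value

    def not_expr(self):
        if self.accept("name", "not"):
            return not self.not_expr()
        return self.comparison()

    def comparison(self):
        value = self.sum()
        op = self.accept("op", *COMPARE)
        return COMPARE[op](value, self.sum()) if op else value

    def sum(self):
        value = self.term()
        while (op := self.accept("op", "+", "-")):
            value = ARITH[op](value, self.term())
        return value

    def term(self):
        value = self.atom()
        while (op := self.accept("op", "*", "/")):
            value = ARITH[op](value, self.atom())
        return value

    def atom(self):
        kind, value = self.peek()
        self.pos += 1
        if kind == "num":
            return value
        if kind == "name" and value not in KEYWORDS:
            return self.variables[value]  # unknown names raise KeyError
        if kind == "op" and value == "(":
            inner = self.or_expr()
            if not self.accept("op", ")"):
                raise SyntaxError("expected ')'")
            return inner
        if kind == "op" and value == "-":
            return -self.atom()
        raise SyntaxError(f"unexpected token {value!r}")


def evaluate(text, variables):
    return Evaluator(text, variables).parse()
```

For example, `evaluate("retries < 3 and (size + 1) * 2 >= 10", {"retries": 2, "size": 4})` returns `True`. The grammar has no loops, calls, attribute access, or string literals, so the runtime limits and most sandbox concerns of the full pattern do not arise.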
Example (SEL)
User-supplied parameter expression inside a Maestro workflow:
// Compute next partition to backfill from a signal's timestamp
partition = dateAdd(signal.processed_date, days=-1).format("yyyy-MM-dd")
SEL parses this, validates the syntax tree against the JLS subset (no reflection, bounded iteration), runs it inside the Java Security Manager sandbox, and returns `partition` as a Maestro parameter for downstream steps to consume.
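As a worked check of what the expression computes, the Python sketch below reproduces the same date arithmetic. It assumes that `dateAdd` with `days=-1` subtracts one calendar day, that `signal.processed_date` arrives as an ISO `yyyy-MM-dd` string, and that the Java-style pattern `yyyy-MM-dd` corresponds to `%Y-%m-%d`. The function name `previous_partition` is hypothetical.

```python
from datetime import date, timedelta


def previous_partition(processed_date: str) -> str:
    """Return the day before `processed_date`, formatted as yyyy-MM-dd."""
    day = date.fromisoformat(processed_date)
    return (day - timedelta(days=1)).strftime("%Y-%m-%d")


print(previous_partition("2024-07-22"))  # 2024-07-21
print(previous_partition("2024-03-01"))  # 2024-02-29 (2024 is a leap year)
```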
Seen in
- sources/2024-07-22-netflix-maestro-netflixs-workflow-orchestrator — SEL as canonical instance