PATTERN Cited by 1 source

Rule engine with continuous policy deploy

Intent

Run a rule engine whose rules (policies) change so often, driven by new threats, new feature flags, and new compliance constraints, that the git repository of policies is the source of truth for what is running on the fleet right now. "At all times, the source code in the repository is the code running in Sigma" (sources/2015-06-26-meta-fighting-spam-with-haskell).

This is a step beyond "continuous delivery": the baseline for policy-change → production is minutes, not hours, and nothing — no staging soak, no release train — is allowed to slow that down.

Context

  • Adversarial domains. Anti-abuse (spam, phishing, malware), fraud detection, bot management — the population you're ruling against is actively responding to your rules, so rules age fast.
  • High-volume rule sets. Many policies (dozens to thousands), authored by many engineers, needing to co-exist safely in one engine.
  • Rule authors are domain experts, not infra engineers. They care about spam patterns, not about the runtime.

Mechanism

  1. Language with safety gates at repo ingress. The policy language is expressive enough to state real-world rules and constrained enough to prevent one policy from breaking another. Canonical wiki realisation: a purely functional strongly typed language with a type-correct-or-rejected gate at commit: "we don't allow code to be checked into the repository unless it is type-correct."
  2. Hot-code swap of compiled policies. The engine loads new compiled code into the running process, routes new requests to it, lets in-flight requests finish on the old code, and unloads the old code when no request references it. See concepts/hot-code-swapping.
  3. Performance comparable to baseline C++. Any noticeable latency regression from language choice forces perf-critical rules into a lower-layer (often C++) codebase, which defeats the pattern — low-layer code changes follow the slow deploy cadence of the surrounding infra, not the fast cadence of policies. Meta made this requirement explicit in the Sigma rewrite criteria.
  4. Interactive development against production data. Rule authors iterate on real inputs before committing. A customised REPL (GHCi for Sigma) that loads in seconds and has access to production data sources is load-bearing; without it, rules have to be "tested in production" which defeats the safety posture.
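The hot-swap step (item 2) can be sketched in a few lines. This is a minimal illustration in Python, not Sigma's implementation: Sigma swaps compiled Haskell code and relies on GC-assisted unloading, whereas the sketch below uses an explicit in-flight counter. The names `RuleSet`, `Engine`, `begin`, `end`, and `deploy` are hypothetical.

```python
import threading
from dataclasses import dataclass
from typing import Callable, Dict, List

Rule = Callable[[dict], bool]  # a policy: request -> "does this rule fire?"


@dataclass
class RuleSet:
    """One deployed version of the policies (hypothetical structure)."""
    version: str
    rules: Dict[str, Rule]
    in_flight: int = 0  # requests currently pinned to this version


class Engine:
    def __init__(self, initial: RuleSet) -> None:
        self._lock = threading.Lock()
        self._current = initial
        self.unloaded: List[str] = []  # retired versions, kept for illustration

    def deploy(self, new: RuleSet) -> None:
        """Route all *new* requests to `new`; the old version stays loaded
        until the last request pinned to it has finished."""
        with self._lock:
            old, self._current = self._current, new
            if old.in_flight == 0:
                self.unloaded.append(old.version)

    def begin(self) -> RuleSet:
        """Pin a request to whichever version is current right now."""
        with self._lock:
            self._current.in_flight += 1
            return self._current

    def end(self, rs: RuleSet) -> None:
        """Release a request; unload its version if it is stale and idle."""
        with self._lock:
            rs.in_flight -= 1
            if rs.in_flight == 0 and rs is not self._current:
                self.unloaded.append(rs.version)

    def handle(self, request: dict) -> List[str]:
        """Evaluate every rule of one version and return the names that fired."""
        rs = self.begin()
        try:
            return [name for name, rule in rs.rules.items() if rule(request)]
        finally:
            self.end(rs)
```

A request that called `begin()` before a `deploy()` keeps evaluating against the version it was pinned to, and that version is only recorded as unloaded once its `in_flight` count returns to zero. Short-lived requests keep that window small.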

Canonical wiki instance: Meta's Sigma

Meta's Sigma anti-abuse rule engine is the canonical wiki implementation of this pattern. Pre-2015 Sigma ran the in-house DSL FXL; the rewrite to Haskell plus Meta-built Haxl plus Meta-contributed GHC extensions (GC-assisted code unload, per-thread allocation limits, Applicative do-notation) took the operational posture from "reasonable" to "1M+ rps with minutes-to-deploy".

Post-rewrite numbers:

  • >1M rps production throughput.
  • 20–30% overall throughput improvement over FXL; up to 3× on individual request types.
  • "The source code in the repository is the code running in Sigma."

Preconditions

  • Short-lived requests (makes hot-swap tractable — no in-flight-migration problem).
  • Persistent-state code is rarely if ever changed in a policy push (so hot-swap only replaces stateless rule code).
  • The engine process is the correct blast radius unit: one broken policy should not affect peers in the same process; see concepts/allocation-limit for the Meta-GHC realisation, or per-policy sandbox / process-isolation variants in other stacks.
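The isolation precondition can be illustrated with a cooperative stand-in. In GHC the allocation limit is enforced by the runtime; the Python sketch below instead assumes each policy voluntarily calls a hypothetical `Budget.spend`, so it shows only the containment shape (a failing or runaway policy is recorded and its peers still run), not a real enforcement mechanism. `Budget`, `BudgetExceeded`, and `evaluate_all` are illustrative names.

```python
from typing import Callable, Dict, Tuple


class BudgetExceeded(Exception):
    """Raised when a policy uses more work units than it was granted."""


class Budget:
    """Cooperative work budget; a stand-in for a runtime-enforced limit."""

    def __init__(self, limit: int) -> None:
        self.remaining = limit

    def spend(self, units: int = 1) -> None:
        self.remaining -= units
        if self.remaining < 0:
            raise BudgetExceeded("policy exceeded its work budget")


Policy = Callable[[dict, Budget], bool]


def evaluate_all(
    policies: Dict[str, Policy], request: dict, limit: int
) -> Tuple[Dict[str, bool], Dict[str, str]]:
    """Run every policy with its own fresh budget.

    Returns (verdicts, failures). A policy that raises, including by
    exhausting its budget, lands in `failures` and does not stop the rest.
    """
    verdicts: Dict[str, bool] = {}
    failures: Dict[str, str] = {}
    for name, policy in policies.items():
        try:
            verdicts[name] = policy(request, Budget(limit))
        except Exception as exc:  # contain the failure to this one policy
            failures[name] = type(exc).__name__
    return verdicts, failures
```

With one well-behaved policy, one that loops forever while spending budget, and one that raises on a missing field, only the first produces a verdict; the other two appear in `failures` and the request still completes.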

Variations

  • Dedicated DSL with hot-reload. Cloudflare Rulesets is a conceptual cousin at a different layer: the language is a smaller DSL (not a general-purpose language) and the hot-reload mechanism is config distribution, not compiled-code swap. Trade-off: DSLs cap expressivity; full languages require compiler investment.
  • Feature flags (concepts/feature-flag) are a degenerate form of this pattern where the "rule" is a boolean per flag; continuous deploy is flag-config push rather than compiled-code swap. The operational posture is the same — the repo is the source of truth — but the language ceiling is much lower.
  • OPA / Cedar / policy-as-data systems (concepts/policy-as-data) sit between flags and full Sigma: richer language than flags, constrained / analysable language unlike general-purpose Haskell, policy-as-data rather than compiled-code distribution.
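The spectrum from flags to policy-as-data can be made concrete with a toy evaluator. This is a sketch under stated assumptions: the expression format (`op`, `field`, `value`, `of`) is invented for illustration and is not the syntax of OPA, Cedar, or Cloudflare Rulesets. It shows that a feature flag is the constant case of a rule, and that a data-encoded policy is interpreted rather than compiled and swapped.

```python
from typing import Any, Dict


def evaluate(expr: Any, request: Dict[str, Any]) -> bool:
    """Interpret a policy expressed as plain data (hypothetical format)."""
    if isinstance(expr, bool):
        return expr  # a feature flag: the rule is just a constant
    op = expr["op"]
    if op == "eq":
        return request.get(expr["field"]) == expr["value"]
    if op == "gt":
        return request.get(expr["field"], 0) > expr["value"]
    if op == "all":
        return all(evaluate(e, request) for e in expr["of"])
    if op == "any":
        return any(evaluate(e, request) for e in expr["of"])
    if op == "not":
        return not evaluate(expr["of"], request)
    raise ValueError(f"unknown operator: {op!r}")


# A flag and a richer policy share one evaluator and one push mechanism.
new_checkout_enabled = True
suspicious_signup = {
    "op": "all",
    "of": [
        {"op": "eq", "field": "country_mismatch", "value": True},
        {"op": "gt", "field": "signups_from_ip", "value": 5},
    ],
}
```

Because the policy is data, deploying a change means distributing a new document, and the small fixed operator set is what keeps it analysable. The cost is the language ceiling: anything the operators cannot express needs an engine change.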

Anti-patterns it rules out

  • Stage-and-promote for every policy change. If every spam rule has to wait for a weekly release train, the spammer wins. This pattern explicitly gives up some of the blast-radius control that staged rollout provides in exchange for deploy velocity, and relies on language and runtime safety to make up the difference.
  • Perf-critical rule logic in a lower layer. If some rules must be written in C++ because the policy language is too slow, those rules are trapped at the deploy cadence of C++ — which is the fragmentation FXL's performance cap created, and the reason Meta rewrote.

Seen in
