PATTERN Cited by 1 source

Embedded functional runtime in a C++ service

Intent

Get the safety + concurrency model of a managed functional runtime (type safety, purity-derived isolation, implicit concurrent data fetching, hot code swapping) exactly where it's needed, in the request-evaluation layer, without rewriting the surrounding service ecosystem (transport, client libraries, observability).

Architecture

A three-layer sandwich:

  ┌──────────────────────────────────────────┐
  │  C++ Thrift (or equivalent) server       │  mature, performant
  ├──────────────────────────────────────────┤
  │  Functional runtime (Haskell / OCaml /   │  policies / rules
  │  Erlang / Haxl on GHC / ...)             │  live here
  ├──────────────────────────────────────────┤
  │  Existing C++ client libraries for       │  no rewrite; wrapped
  │  internal services, wrapped as data      │  as functional
  │  sources via FFI                         │  data sources
  └──────────────────────────────────────────┘

The outer C++ layers are kept as-is. The middle layer is where the language property you actually want (purity, type safety, implicit-batching concurrency, hot-swap) earns its keep.
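
As a rough sketch of the top seam only: assume the runtime exposes its evaluator through a plain C ABI. With GHC this would typically be a function exposed via `foreign export ccall` and called after the runtime has been initialised with `hs_init`. The C++ request handler then reduces to a thin adapter. Every name here (`runtime_eval_rule`, `RuleRequest`, `handle_request`) is hypothetical, and the evaluator body is a C++ stand-in so the example compiles on its own.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical C ABI entry point exported by the embedded runtime.
// In a real build this symbol would come from the runtime's object code;
// here it is a C++ stand-in so the sketch is self-contained.
extern "C" int32_t runtime_eval_rule(const char* action, uint64_t user_id) {
    // Stand-in policy: block "post" actions from user 0, allow everything else.
    return (std::string(action) == "post" && user_id == 0) ? 1 : 0;
}

// Request type as the C++ server layer sees it (hypothetical).
struct RuleRequest {
    std::string action;
    uint64_t user_id;
};

enum class Verdict { Allow, Block };

// Server-side handler: unpack the request, cross the boundary once,
// and translate the integer result back into a C++ type.
Verdict handle_request(const RuleRequest& req) {
    int32_t rc = runtime_eval_rule(req.action.c_str(), req.user_id);
    assert(rc == 0 || rc == 1);  // contract assumed for this sketch
    return rc == 0 ? Verdict::Allow : Verdict::Block;
}
```

The server keeps ownership of transport, threading, and serialization. The only thing the runtime sees is a handful of C-compatible arguments.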

Canonical wiki instance: Meta Sigma

Meta's Sigma anti-abuse rule engine realises this pattern with Haskell in the middle (sources/2015-06-26-meta-fighting-spam-with-haskell):

Haskell is sandwiched between two layers of C++ in Sigma. At the top, we use the C++ thrift server. In principle, Haskell can act as a thrift server, but the C++ thrift server is more mature and performant. It also supports more features. Furthermore, it can work seamlessly with the Haskell layers below because we can call into Haskell from C++. For these reasons, it made sense to use C++ for the server layer.

At the lowest layer, we have existing C++ client code for talking to other internal services. Rather than rewrite this code in Haskell, which would duplicate the functionality and create an additional maintenance burden, we wrapped each C++ client in a Haxl data source using Haskell's Foreign Function Interface (FFI) so we could use it from Haskell.

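Concrete FFI detail: Haskell's FFI is designed to call C, not C++, so a C++ call normally needs an intermediate C shim. Meta avoided the shim for most calls by using a compile-time tool that demangles C++ function names so they can be called directly from Haskell.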
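
Where a C shim is needed, the conventional form is a small set of `extern "C"` functions around the C++ class, with the object passed as an opaque pointer. The sketch below uses a hypothetical `UserClient` class and hypothetical function names. It illustrates the generic technique, not Meta's code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical existing C++ client for some internal service.
class UserClient {
public:
    explicit UserClient(std::string endpoint) : endpoint_(std::move(endpoint)) {
        // Stand-in data; a real client would issue an RPC instead.
        friend_counts_[1] = 10;
        friend_counts_[2] = 250;
    }
    // Returns -1 when the user is unknown.
    int32_t friendCount(uint64_t user_id) const {
        auto it = friend_counts_.find(user_id);
        return it == friend_counts_.end() ? -1 : it->second;
    }
private:
    std::string endpoint_;
    std::unordered_map<uint64_t, int32_t> friend_counts_;
};

// C ABI shim: unmangled names, opaque handle, only C-compatible types.
extern "C" {

void* user_client_new(const char* endpoint) {
    assert(endpoint != nullptr);
    return new UserClient(endpoint);
}

int32_t user_client_friend_count(const void* handle, uint64_t user_id) {
    assert(handle != nullptr);
    return static_cast<const UserClient*>(handle)->friendCount(user_id);
}

void user_client_free(void* handle) {
    delete static_cast<UserClient*>(handle);
}

}  // extern "C"
```

On the Haskell side, each of these functions would be bound with a `foreign import ccall` declaration. A production shim would also catch C++ exceptions before they cross the boundary, which this sketch omits.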

Why sandwich rather than rewrite

  • Transport layer (Thrift server): already mature, feature-rich, operationally understood. Rewriting in Haskell risks losing features + invites operational regressions in exchange for no business-logic win.
  • Client libraries: already exist for every backend service the engine consumes; rewriting in Haskell duplicates functionality and creates an additional maintenance burden.
  • Middle layer (policies): this is where you actually need the language property — safety, implicit concurrency, hot-swap, fast iteration. Rewriting everything else to chase symmetry would defeat the point.

When to use

  • You have a large C++ (or Go, or Java) service ecosystem and you want to introduce a managed-runtime language for a specific high-value layer (rule engine, policy evaluator, feature extractor, expression interpreter).
  • The managed-runtime language has an FFI credible enough to wrap existing client code.
  • The surrounding transport / client infrastructure is mature and expensive to rewrite.

When not to use

  • Greenfield service with no existing C++ ecosystem — a full-stack single-language implementation has fewer moving parts.
  • FFI cost exceeds the language benefit (frequent tiny calls across the FFI boundary; marshaling-heavy interactions; small code size where the extra runtime complexity doesn't pay).
  • The embedded runtime has no credible story for concurrent data fetching (Haxl-style): the C++ client calls would then run sequentially through the FFI, forfeiting the implicit-concurrency benefit that motivates the pattern.
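
The FFI-cost bullet above mostly comes down to how the boundary functions are shaped. A minimal sketch, assuming a hypothetical `Profile` record and a stand-in multi-key client call: one entry point takes a whole batch of keys, so N lookups cost one crossing, and writes back only the field the caller needs instead of marshaling the full record. A counter is included purely to make the number of crossings observable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical rich record returned by a C++ client library.
struct Profile {
    uint64_t id;
    std::string display_name;
    std::vector<uint64_t> friend_ids;
    int32_t account_age_days;
};

// Stand-in for a client call that already supports multi-key lookups.
static std::vector<Profile> fetch_profiles(const uint64_t* ids, size_t n) {
    std::vector<Profile> out;
    out.reserve(n);
    for (size_t i = 0; i < n; ++i) {
        out.push_back(Profile{ids[i], "user", {},
                              static_cast<int32_t>(ids[i] % 1000)});
    }
    return out;
}

// Counts boundary crossings so the effect of batching is visible.
static int g_crossings = 0;

extern "C" {

// One crossing for n keys. Only account_age_days is copied into the
// caller-owned flat buffer; names and friend lists never cross.
void profile_account_ages(const uint64_t* ids, size_t n, int32_t* out_ages) {
    assert(n == 0 || (ids != nullptr && out_ages != nullptr));
    ++g_crossings;
    std::vector<Profile> profiles = fetch_profiles(ids, n);
    for (size_t i = 0; i < n; ++i) {
        out_ages[i] = profiles[i].account_age_days;
    }
}

int profile_crossings() { return g_crossings; }

}  // extern "C"
```

A Haxl-style data source receives the pending requests for a round together, so it is a natural caller for an entry point of this shape. A per-key entry point would instead pay the crossing cost once per request.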

Operational notes from Sigma

  • Interactive development requires build-system work. Meta had to link all the C++ dependencies into a shared library GHCi could load so rule authors could test against production data sources. Sandwich architectures shift complexity into the build system.
  • Marshaling discipline matters. "If the whole data structure isn't required, it is better to marshal only the pieces needed." FFI marshaling cost is a first-class performance concern — selective marshaling is a recurring perf lever.

Related patterns

  • Embedded Lua in nginx (ngx_lua) — similar sandwich architecture with Lua in the middle of an nginx C server. Simpler runtime; less type safety; no Haxl-equivalent concurrency batching. See systems/ngx-lua.
  • Embedded WASM — modern variant where the middle layer is WebAssembly, giving process-isolation-like safety without an additional process. See systems/workerd.
  • Sidecar process — the alternative to sandwiching within a process: run the middle-layer logic in a separate process and talk over localhost. More blast-radius isolation, more overhead.

Seen in
