PATTERN
Embedded OPA in proxy¶
Problem¶
You want policy-based authorization enforcement on every HTTP request, at scale. The canonical OPA deployment is either "OPA-as-sidecar" (one OPA container next to every app pod) or "OPA-as-service" (a central OPA cluster behind a gRPC/HTTP ext-authz call from the proxy). Both shapes add a network hop, inflate the deployment count by N (one OPA per application), and force every app team to inherit operational responsibility for OPA.
Solution¶
Embed OPA as a library inside the ingress proxy process itself. The policy engine, bundle loader, decision cache, and status / decision-log streams all share the proxy's address space. No sidecars. No extra deployments. No network hop on the hot path. Authorization becomes a filter in the proxy's filter chain.
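To make the "library call on the hot path" concrete, here is a minimal Go sketch of in-process evaluation using OPA's `rego` package inside an ordinary HTTP filter. This is not Skipper's actual filter implementation (which also wires bundles, virtual instances, and decision logging); the policy, filter name, and port are illustrative.

```go
// Minimal sketch: authorization as an in-process filter, no ext-authz hop.
package main

import (
	"context"
	"log"
	"net/http"

	"github.com/open-policy-agent/opa/rego"
)

// Illustrative policy; in the real pattern it arrives via a bundle.
const policy = `
package authz

import rego.v1

default allow := false

# Allow unauthenticated health checks; everything else is denied.
allow if {
	input.method == "GET"
	input.path == "/health"
}
`

func newAuthzFilter(next http.Handler) (http.Handler, error) {
	// Compile once at startup; per-request evaluation is then a cheap
	// in-process call rather than a network round-trip.
	query, err := rego.New(
		rego.Query("data.authz.allow"),
		rego.Module("authz.rego", policy),
	).PrepareForEval(context.Background())
	if err != nil {
		return nil, err
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		input := map[string]interface{}{
			"method": r.Method,
			"path":   r.URL.Path,
		}
		rs, err := query.Eval(r.Context(), rego.EvalInput(input))
		if err != nil || !rs.Allowed() {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	backend := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})
	filter, err := newAuthzFilter(backend)
	if err != nil {
		log.Fatal(err)
	}
	log.Fatal(http.ListenAndServe(":8080", filter))
}
```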
Shape¶
- One process per proxy replica hosts OPA as a Go (or equivalent) library.
- The proxy exposes a filter primitive (`opaAuthorizeRequest(…)` in Skipper terms; an ext-authz-aligned filter elsewhere) that routes may reference.
- The filter takes an application / bundle identifier; the proxy spawns one virtual OPA instance per unique identifier seen across its routes (see the sketch after this list).
- Per-instance state: policy bundle, labels, decision cache, per-instance status / decision-log streams.
- Bundle source is external (object storage, bundle server); the embedded OPA polls on interval.
- Observability: each decision emits a span; control-plane round-trips (bundle fetch, status, decision log) emit their own spans.
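A rough sketch of the multiplexing idea from the list above: a registry keyed by application / bundle identifier that lazily creates one embedded OPA (here via OPA's Go SDK) per identifier, each polling its own bundle from an external bundle service. The service URL, bundle resource path, polling interval, and decision path are placeholders, not Skipper's actual configuration.

```go
// Sketch: one virtual OPA instance per application identifier, all inside
// a single proxy process.
package proxyopa

import (
	"bytes"
	"context"
	"fmt"
	"sync"

	"github.com/open-policy-agent/opa/sdk"
)

type instanceRegistry struct {
	mu        sync.Mutex
	instances map[string]*sdk.OPA // keyed by application / bundle identifier
	bundleURL string              // external bundle source, e.g. an S3-backed bundle server
}

// instanceFor returns the virtual OPA instance for an application identifier,
// creating it on first use. Each instance polls its own bundle on an interval.
func (r *instanceRegistry) instanceFor(ctx context.Context, appID string) (*sdk.OPA, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.instances == nil {
		r.instances = map[string]*sdk.OPA{}
	}
	if inst, ok := r.instances[appID]; ok {
		return inst, nil
	}
	config := []byte(fmt.Sprintf(`{
		"services": {"bundles": {"url": %q}},
		"bundles": {
			%q: {
				"service": "bundles",
				"resource": "/%s/bundle.tar.gz",
				"polling": {"min_delay_seconds": 30, "max_delay_seconds": 60}
			}
		}
	}`, r.bundleURL, appID, appID))
	// sdk.New blocks until the instance's first bundle is activated.
	inst, err := sdk.New(ctx, sdk.Options{ID: appID, Config: bytes.NewReader(config)})
	if err != nil {
		return nil, err
	}
	r.instances[appID] = inst
	return inst, nil
}

// authorize is roughly what a filter like opaAuthorizeRequest(appID) does per request.
func (r *instanceRegistry) authorize(ctx context.Context, appID string, input map[string]interface{}) (bool, error) {
	inst, err := r.instanceFor(ctx, appID)
	if err != nil {
		return false, err
	}
	res, err := inst.Decision(ctx, sdk.DecisionOptions{Path: "/authz/allow", Input: input})
	if err != nil {
		return false, err
	}
	allowed, _ := res.Result.(bool)
	return allowed, nil
}
```

Keeping instance creation lazy means a proxy replica only pays for the applications actually referenced by its routes.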
Why this shape works¶
- Zero-hop latency. The ingress already parses the request; a library call into the in-process engine is cheaper than an ext-authz gRPC.
- No per-app deployment tax. Previously, N apps implied N OPA sidecars / deployments. Embedding collapses this to one OPA runtime per proxy replica, with virtual instances multiplexing tenants.
- Shared lifecycle. The proxy's hot-reload, graceful drain, and fleet-wide config propagation all automatically cover OPA.
Trade-offs¶
- OOM + CPU fate shared with the proxy. Any OPA OOM is a proxy OOM. Mitigated by patterns/bounded-telemetry-data-structures-for-policy-engine: bound bundle size, bound request-body parsing, bound decision / status buffers (see the sketch after this list).
- Blast radius coupled. A policy-evaluation bug that escalates is a data-plane incident, not a sidecar incident. Recovery is via proxy rollback.
- Less hardware-level isolation between tenants. The same proxy process hosts all virtual instances; a memory-exhaustion policy bug in one application can pressure the shared process. Policy-size caps partially address this.
- Language coupling. The proxy and OPA must share a language / runtime that admits embedding (Go for Skipper + OPA — natural fit).
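As a sketch of the bounded-buffer mitigation referenced above (an assumed shape, not OPA's or Skipper's actual decision-log plugin): a fixed-capacity decision-log buffer that drops and counts the oldest events instead of growing without bound.

```go
// Hypothetical bounded decision-log buffer: bounded memory beats OOM-ing the
// shared proxy process. A real implementation would likely use a ring buffer.
package boundedlog

import "sync"

type decisionEvent struct {
	AppID   string
	Path    string
	Allowed bool
}

type boundedDecisionLog struct {
	mu      sync.Mutex
	events  []decisionEvent
	max     int
	dropped uint64 // surfaced as a metric so drops stay visible
}

func newBoundedDecisionLog(max int) *boundedDecisionLog {
	return &boundedDecisionLog{events: make([]decisionEvent, 0, max), max: max}
}

// Append records a decision; when the buffer is full, the oldest event is dropped.
func (b *boundedDecisionLog) Append(ev decisionEvent) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if len(b.events) == b.max {
		b.events = b.events[1:]
		b.dropped++
	}
	b.events = append(b.events, ev)
}

// Drain hands the buffered events to the uploader and resets the buffer.
func (b *boundedDecisionLog) Drain() []decisionEvent {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := b.events
	b.events = make([]decisionEvent, 0, b.max)
	return out
}
```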
Seen in¶
- sources/2024-12-05-zalando-open-policy-agent-in-skipper-ingress — Zalando's canonical embedding inside Skipper: "Embedding OPA directly within Skipper as a library ensures minimal latency in policy enforcement by keeping policy decisions local to the ingress data plane. It also is cost efficient compared to running an OPA deployment per application or as sidecars." Paired with patterns/virtual-policy-instance-per-application for multi-tenancy, patterns/s3-as-policy-bundle-source-for-availability for control-plane failure tolerance, and patterns/bounded-telemetry-data-structures-for-policy-engine for OOM control.