PATTERN Cited by 1 source

Async middleware inspection¶

Shape¶

An application-server middleware (e.g., a framework-provided after filter) intercepts each outbound response, performs a minimal synchronous step — sampling decision + candidate extraction — then hands expensive work (permission checks, policy evaluation, warehouse writes) to an asynchronous background job. The request completes on its normal latency budget; if the async inspection fails, the user-facing path is unaffected.

This is the shape that makes continuous detection on every request affordable without blowing the latency SLO.

Why split sync vs async¶

Inspection work splits naturally into two phases:

Cheap, on-path (sync): decide whether to sample this request; if sampled, parse the response body, extract candidate identifiers, record request-local metadata. Tens of microseconds.
Expensive, off-path (async): call the authorization engine per identifier, materialize permission decisions, query the sensitivity catalog, write findings to the warehouse, aggregate for dashboards. Tens to hundreds of milliseconds.

Running the expensive phase synchronously would push p99 well past budget. Running the cheap phase asynchronously would lose the request-local context (authenticated user, response body being shipped now) that makes the inspection possible.
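
The split can be sketched in plain Ruby as follows. This is a minimal illustration under stated assumptions, not Figma's code: `ResponseSampler`, `start_worker`, the `file_keys` field, and the in-process `Queue` standing in for a real job system are all hypothetical.

```ruby
require "json"
require "logger"

# Hypothetical names throughout; a real deployment would enqueue to a job
# system rather than an in-process Queue.
class ResponseSampler
  def initialize(sample_rate:, queue:, logger: Logger.new($stderr), rng: Random.new)
    @sample_rate = sample_rate
    @queue = queue
    @logger = logger
    @rng = rng
  end

  # Sync phase: sample, parse, extract, enqueue. Returns true if a job was
  # enqueued. Any error is logged and swallowed so the request is unaffected.
  def after_response(user_id:, body:)
    return false unless @rng.rand < @sample_rate

    candidates = extract_candidates(JSON.parse(body))
    return false if candidates.empty?

    # Copy captured values into the job payload; no shared mutable state.
    @queue << { user_id: user_id, candidates: candidates.dup.freeze }
    true
  rescue StandardError => e
    @logger.error("sampling failed: #{e.class}: #{e.message}")
    false
  end

  private

  # Placeholder extraction: assumes a top-level "file_keys" array of strings.
  def extract_candidates(doc)
    return [] unless doc.is_a?(Hash)

    Array(doc["file_keys"]).grep(String)
  end
end

# Async phase: a background thread drains the queue. The block stands in for
# the expensive authorization call. A nil job is the shutdown signal.
def start_worker(queue, logger:, &check)
  Thread.new do
    while (job = queue.pop)
      begin
        job[:candidates].each { |id| check.call(job[:user_id], id) }
      rescue StandardError => e
        # A failed inspection is logged; the worker keeps draining.
        logger.error("inspection failed: #{e.class}: #{e.message}")
      end
    end
  end
end
```

The sync method only touches data already in memory; everything that could block on another service sits behind the queue.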

Reference implementation¶

From Figma's Response Sampling (Source: sources/2026-04-21-figma-visibility-at-scale-sensitive-data-exposure):

Hook: Ruby after filter — framework-provided, runs after handler completes, before response flushes to client.
Sync phase (in the filter):
Check sampling rate → decide inspect or skip.
If sampled: parse JSON body; extract file identifiers (Phase 1) or read request-local storage of banned_from_clients-tagged values and compare against the serialized JSON (Phase 2).
Async phase (enqueued): permission re-verification via PermissionsV2 for Phase 1; finding logging for Phase 2.
Fault tolerance: "If sampling or verification fail, the request still completes normally, and errors are logged for monitoring." The inspection is strictly observational from the request's perspective.
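
For the Phase 1 extraction step, one possible shape is a walk over the parsed body that collects identifier values wherever they appear. The source does not say how Figma's payloads name file identifiers, so the `file_key` field and the `extract_file_ids` helper below are assumptions made purely for illustration.

```ruby
require "json"
require "set"

# Hypothetical: identifiers are string values stored under a "file_key"
# field at any depth of the payload.
ID_FIELD = "file_key"

# Walk the parsed JSON with an explicit stack and collect every string
# value found under ID_FIELD. Returns a Set, so duplicates collapse.
def extract_file_ids(body)
  found = Set.new
  stack = [JSON.parse(body)]
  until stack.empty?
    node = stack.pop
    case node
    when Hash
      node.each do |key, value|
        found << value if key == ID_FIELD && value.is_a?(String)
        stack << value
      end
    when Array
      stack.concat(node)
    end
  end
  found
end
```

Deduplicating at extraction time keeps the async phase from re-checking the same identifier several times for one response.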

The request-local-storage trick (Phase 2)¶

The after filter alone can't tell which sensitive values the handler actually loaded — the JSON response contains strings that might or might not have come from sensitive columns. Figma solves this with an ActiveRecord callback:

On sampled requests, when a record with a banned_from_clients column loads, the callback records the column's value into a request-local dictionary.
By the time the after filter runs, the dictionary contains the precise set of sensitive values this request touched.
The filter compares those values against the serialized JSON — any appearance is a finding.

This is a clever instance of two-phase instrumentation: record on the sync path, inspect on the way out. It avoids both coincidental-match false positives and global overhead on unsampled requests.
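
A minimal sketch of the record-then-compare flow, without Rails: `SensitiveTracker` and `UserRecord` are hypothetical stand-ins for the request-local dictionary and an ActiveRecord model with a tagged column. It assumes plain substring matching against the serialized body and values that contain no characters JSON would escape; a production matcher would need to handle escaping.

```ruby
require "json"
require "set"

# Hypothetical request-local store. Thread.current[] is scoped to the
# current fiber; real frameworks provide their own per-request storage.
module SensitiveTracker
  KEY = :sensitive_values

  # Called at the start of a sampled request only.
  def self.start
    Thread.current[KEY] = Set.new
  end

  # Called from the model-load callback. A no-op on unsampled requests,
  # so unsampled traffic pays only a nil check.
  def self.track(value)
    store = Thread.current[KEY]
    return if store.nil? || value.nil?

    text = value.to_s
    store << text unless text.empty?
  end

  # Called from the after filter: which tracked values appear in the body?
  # Always clears the store so state never leaks into the next request.
  def self.findings_in(body)
    store = Thread.current[KEY]
    return [] if store.nil?

    store.select { |value| body.include?(value) }.to_a.sort
  ensure
    Thread.current[KEY] = nil
  end
end

# Stand-in for a model whose after-load callback reports tagged columns.
class UserRecord
  BANNED_FROM_CLIENTS = [:password_hash].freeze

  attr_reader :attrs

  def initialize(attrs)
    @attrs = attrs
    BANNED_FROM_CLIENTS.each { |col| SensitiveTracker.track(attrs[col]) }
  end
end
```

Because only values the request actually loaded are compared, an unrelated string that happens to look sensitive never reaches the matcher.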

Properties¶

Bounded sync overhead: the sampler's sync cost is a parse plus a pattern match, not an RPC.
Fault-isolated async: async worker failures don't cascade into user-facing errors.
Retry-safe async: permission checks are idempotent; finding writes are idempotent by (request-id, identifier) key.
Rate-limited: the async queue has its own rate limit so sampling surges don't overwhelm downstream systems.
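
The retry-safe and rate-limited properties can be sketched as below. Both classes are hypothetical: `FindingStore` is an in-memory stand-in for the warehouse table (a real store would more likely rely on a unique constraint or upsert), and `TokenBucket` is one common way to cap throughput, not necessarily the mechanism Figma uses.

```ruby
# Hypothetical in-memory findings table keyed by (request_id, identifier).
class FindingStore
  def initialize
    @rows = {}
    @lock = Mutex.new
  end

  # Insert-if-absent: returns true on the first write, false on a retry.
  def record(request_id:, identifier:, detail:)
    key = [request_id, identifier].freeze
    @lock.synchronize do
      return false if @rows.key?(key)

      @rows[key] = detail
      true
    end
  end

  def size
    @lock.synchronize { @rows.size }
  end
end

# Hypothetical token bucket guarding downstream systems. Not thread-safe
# as written; the clock is injectable so behaviour can be tested.
class TokenBucket
  def initialize(rate_per_sec:, burst:,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @rate = rate_per_sec.to_f
    @burst = burst.to_f
    @tokens = @burst
    @clock = clock
    @last = clock.call
  end

  # Refill based on elapsed time, then spend one token if available.
  def allow?
    now = @clock.call
    @tokens = [@burst, @tokens + (now - @last) * @rate].min
    @last = now
    return false if @tokens < 1.0

    @tokens -= 1.0
    true
  end
end
```

With an idempotent write, a job that is retried after a partial failure cannot double-count a finding.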

Where to find the `after` hook¶

Ruby on Rails / Sinatra: after_action / after block.
Express.js: the response finish event, or a middleware that wraps res.send / res.json to capture the body before it is written.
Spring / Java servlets: HandlerInterceptor.afterCompletion.
Go net/http: middleware that wraps a ResponseWriter and intercepts Write / WriteHeader, or a context-scoped on-finish hook.
gRPC: interceptor with a handler wrap that inspects the outbound message post-serialization.

The hook must fire after handler execution (so the body is materialized) but before the response is irreversibly handed off to the user (so async work can still be tied to the request).

Anti-patterns¶

Running expensive work synchronously because "it's only a few ms average" — the tail dominates; p99 users pay.
Capturing state for the async job via mutable globals: race conditions across concurrent requests. Use request-local storage and copy captured values into the job payload.
No timeout / no retry cap on the async job — a pathological record keeps a worker slot forever.
Forgetting to log async failures — silent failure mode; the detection pipeline quietly stops detecting.
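
The last two anti-patterns have a small common remedy: bound each attempt, cap the attempts, and log every failure. The sketch below assumes hypothetical limits (`MAX_ATTEMPTS`, `JOB_TIMEOUT_SEC`) and a hypothetical `run_job` helper. It uses Ruby's stdlib `Timeout`, which is a blunt tool; a real job framework's own timeout and retry settings would normally be preferable.

```ruby
require "logger"
require "timeout"

MAX_ATTEMPTS = 3      # hypothetical retry cap
JOB_TIMEOUT_SEC = 2   # hypothetical per-attempt budget

# Runs one inspection job (the block) with a per-attempt timeout and a hard
# retry cap. Returns :ok on success or :dead once attempts are exhausted.
# Every failure is logged so a stalled pipeline is visible.
def run_job(job, logger:, max_attempts: MAX_ATTEMPTS, timeout_sec: JOB_TIMEOUT_SEC)
  attempts = 0
  begin
    attempts += 1
    Timeout.timeout(timeout_sec) { yield job }
    :ok
  rescue StandardError => e
    logger.error("job #{job[:id]} attempt #{attempts} failed: #{e.class}: #{e.message}")
    retry if attempts < max_attempts
    logger.error("job #{job[:id]} gave up after #{attempts} attempts")
    :dead
  end
end
```

A pathological record then costs at most `max_attempts * timeout_sec` of worker time before the slot is freed.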

Contrast: sidecar / proxy inspection¶

A sidecar (e.g., Envoy) or egress proxy sees responses too, but:

Lacks the authenticated-user / policy-engine context needed for user-aware checks.
Can't hook ORM callbacks (Phase-2 value tracking).
Forces RPC overhead for any application-layer lookup.

App-server middleware has the context for free; it's the natural home for this pattern when the inspection needs user or DB context.

Seen in¶

sources/2026-04-21-figma-visibility-at-scale-sensitive-data-exposure — Figma Response Sampling's sync+async split.