PATTERN Cited by 1 source
Async middleware inspection
Shape
An application-server middleware (e.g., a framework-provided
after filter) intercepts each outbound response, performs a
minimal synchronous step — sampling decision + candidate
extraction — then hands expensive work (permission checks, policy
evaluation, warehouse writes) to an asynchronous background job.
The request completes on its normal latency budget; if the async
inspection fails, the user-facing path is unaffected.
This is the shape that makes continuous detection on every request affordable without blowing the latency SLO.
Why split sync vs async
Inspection work splits naturally into two phases:
- Cheap, on-path (sync): decide whether to sample this request; if sampled, parse the response body, extract candidate identifiers, record request-local metadata. Tens of microseconds.
- Expensive, off-path (async): call the authorization engine per identifier, materialize permission decisions, query the sensitivity catalog, write findings to the warehouse, aggregate for dashboards. Tens to hundreds of milliseconds.
Running the expensive phase synchronously would push p99 well past budget. Running the cheap phase asynchronously would lose the request-local context (authenticated user, response body being shipped now) that makes the inspection possible.
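The split above can be sketched in plain Ruby. This is a minimal sketch under stated assumptions, not Figma's implementation: the in-process Queue stands in for a real background job system, and SAMPLE_RATE, the file_ids key, after_response, and run_inspection are hypothetical names chosen for illustration.

```ruby
require "json"

# Hypothetical in-process stand-in for a background job queue.
INSPECTION_QUEUE = Queue.new

# Assumed sampling rate; the source does not state one.
SAMPLE_RATE = 0.01

# Sync phase: sampling decision plus candidate extraction, nothing else.
# `rng` is injectable so the sampling decision can be exercised in tests.
def after_response(user_id:, body:, rng: Random)
  return :skipped unless rng.rand < SAMPLE_RATE

  # "file_ids" is a hypothetical response field used for illustration.
  candidates = JSON.parse(body).fetch("file_ids", [])

  # Copy request-local context into the payload, because the job runs
  # after the request has finished.
  INSPECTION_QUEUE << { user_id: user_id, file_ids: candidates }
  :enqueued
end

# Async phase: the expensive per-identifier check, off the request path.
# `authorizer` is a callable (user_id, file_id) -> true/false.
# Returns the identifiers the user was NOT allowed to see.
def run_inspection(job, authorizer:)
  job[:file_ids].reject { |id| authorizer.call(job[:user_id], id) }
end
```

A worker would pop jobs from INSPECTION_QUEUE and call run_inspection with the real authorization client; the request path only ever pays for the sampling check and the parse.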
Reference implementation
From Figma's Response Sampling (Source: sources/2026-04-21-figma-visibility-at-scale-sensitive-data-exposure):
- Hook: Ruby after filter, framework-provided; runs after the handler completes and before the response flushes to the client.
- Sync phase (in the filter):
  - Check sampling rate → decide inspect or skip.
  - If sampled: parse JSON body; extract file identifiers (Phase 1) or read request-local storage of banned_from_clients-tagged values and compare against the serialized JSON (Phase 2).
- Async phase (enqueued): permission re-verification via PermissionsV2 for Phase 1; finding logging for Phase 2.
- Fault tolerance: "If sampling or verification fail, the request still completes normally, and errors are logged for monitoring." The inspection is strictly observational from the request's perspective.
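The fault-tolerance property can be illustrated with a small sketch of a Phase 1 style filter body. Everything named here (with_inspection, extract_file_ids, the file_id key, and the enqueue callable) is an assumption for illustration; the source does not show Figma's code.

```ruby
require "json"
require "logger"

LOGGER = Logger.new($stderr)

# Walks parsed JSON and collects every value stored under a "file_id" key.
# The key name is hypothetical.
def extract_file_ids(node, found = [])
  case node
  when Hash
    node.each do |key, value|
      if key == "file_id"
        found << value
      else
        extract_file_ids(value, found)
      end
    end
  when Array
    node.each { |value| extract_file_ids(value, found) }
  end
  found
end

# Hypothetical after-filter body. Whatever the inspection does, the
# response body is returned unchanged and no exception escapes.
def with_inspection(body, enqueue:)
  begin
    ids = extract_file_ids(JSON.parse(body))
    enqueue.call(ids) unless ids.empty?
  rescue StandardError => e
    # Observational only: log for monitoring, never re-raise.
    LOGGER.error("response sampling failed: #{e.class}: #{e.message}")
  end
  body
end
```

Rescuing StandardError covers both a malformed body (JSON::ParserError) and a failing enqueue, which matches the quoted guarantee that the request still completes normally.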
The request-local-storage trick (Phase 2)
The after filter alone can't tell which sensitive values the
handler actually loaded — the JSON response contains strings
that might or might not have come from sensitive columns. Figma
solves this with an ActiveRecord callback:
- On sampled requests, when a record with a banned_from_clients column loads, the callback records the column's value into a request-local dictionary.
- By the time the after filter runs, the dictionary contains the precise set of sensitive values this request touched.
- The filter compares those values against the serialized JSON; any appearance is a finding.
This is a clever instance of two-phase instrumentation: record on the sync path, inspect on the way out. It avoids both coincidental-match false positives and global overhead on unsampled requests.
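The record-then-compare flow can be sketched without Rails. This is a rough illustration, assuming one request per thread; the SensitiveValues module and its method names are hypothetical, and in the real system the record call would be wired to an ActiveRecord load callback.

```ruby
require "json"

# Hypothetical request-local store. Thread.current[] is fiber-local in
# Ruby, which is adequate when each request runs on its own thread.
module SensitiveValues
  KEY = :sampled_sensitive_values

  # Called at the start of a sampled request.
  def self.start_sampling
    Thread.current[KEY] = []
  end

  # Called when a sensitive column is loaded. A no-op on unsampled
  # requests, so they pay almost nothing.
  def self.record(column, value)
    store = Thread.current[KEY]
    text = value.to_s
    store << [column, text] if store && !text.empty?
  end

  # Called from the after filter: which recorded values appear in the
  # outgoing JSON? Plain substring matching is a simplification; it
  # ignores JSON escaping.
  def self.findings(serialized_json)
    (Thread.current[KEY] || [])
      .select { |_column, value| serialized_json.include?(value) }
      .map(&:first)
      .uniq
  end

  # Called at the end of the request so values never leak across requests.
  def self.clear
    Thread.current[KEY] = nil
  end
end
```

Because only values actually loaded during this request are compared, a string in the response that merely resembles a secret does not produce a finding.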
Properties
- Bounded sync overhead: the sampler's sync cost is a parse + pattern-match, not an RPC.
- Fault-isolated async: async worker failures don't cascade into user-facing errors.
- Retry-safe async: permission checks are idempotent; finding writes are idempotent by (request-id, identifier) key.
- Rate-limited: the async queue has its own rate limit so sampling surges don't overwhelm downstream systems.
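Two of these properties, idempotent finding writes and a rate-limited queue, can be sketched as follows. Both classes are hypothetical in-memory stand-ins: a real finding store would more likely be a warehouse table with a uniqueness constraint on the same key, and the source does not say which rate-limiting scheme is used (a fixed window is assumed here for brevity).

```ruby
# Hypothetical finding store keyed by (request_id, identifier).
class FindingStore
  def initialize
    @rows = {}
  end

  # Writing the same key twice leaves a single row, so a retried job
  # cannot create duplicate findings.
  def write(request_id:, identifier:, detail:)
    @rows[[request_id, identifier]] ||= detail
  end

  def size
    @rows.size
  end
end

# Hypothetical fixed-window limiter for the async queue.
class WindowLimiter
  def initialize(max_per_window:, window_seconds:,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @max = max_per_window
    @window_seconds = window_seconds
    @clock = clock
    @window_start = @clock.call
    @count = 0
  end

  # Returns true if another job may be enqueued in the current window.
  def allow?
    now = @clock.call
    if now - @window_start >= @window_seconds
      @window_start = now
      @count = 0
    end
    return false if @count >= @max

    @count += 1
    true
  end
end
```

When allow? returns false, the sync phase can simply drop the sample, since a skipped inspection costs nothing on the user-facing path.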
Where to find the after hook
- Ruby on Rails / Sinatra: after_action / after block.
- Express.js: response finish event or a final middleware after res.send.
- Spring / Java servlets: HandlerInterceptor.afterCompletion.
- Go net/http: middleware that wraps a ResponseWriter and intercepts Write / WriteHeader, or a context-scoped on-finish hook.
- gRPC: interceptor with a handler wrap that inspects the outbound message post-serialization.
The hook must fire after handler execution (so the body is materialized) but before the response is irreversibly handed off to the user (so async work can still be tied to the request).
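The timing requirement can be seen in a Rack-style middleware sketch, which runs without the rack gem because a Rack app is just an object that responds to call. ResponseInspector and the on_response callable are hypothetical names; this is one possible placement of the hook, not the one the source describes (which is a Rails after filter).

```ruby
# Rack-style middleware: wraps an app that returns [status, headers, body],
# where body responds to each.
class ResponseInspector
  def initialize(app, on_response:)
    @app = app
    @on_response = on_response
  end

  def call(env)
    # After this line the handler has finished executing.
    status, headers, body = @app.call(env)

    # Materialize the body so it can be inspected. This buffers the whole
    # response, so it is unsuitable for streaming endpoints.
    chunks = []
    body.each { |chunk| chunks << chunk }
    body.close if body.respond_to?(:close)

    begin
      # Request context (env) and the final body are both available here.
      @on_response.call(env, chunks.join)
    rescue StandardError
      # Inspection is observational; failures must not alter the response.
    end

    # Returning hands the response to the server, which writes it out.
    [status, headers, chunks]
  end
end
```

The inspection point sits between the handler's return and the middleware's own return, which is the window in which the body exists and the request context is still in scope.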
Anti-patterns
- Running expensive work synchronously because "it's only a few ms average" — the tail dominates; p99 users pay.
- Capturing state for the async job via mutable globals: race conditions across concurrent requests. Use request-local storage + copy captured values into the job payload.
- No timeout / no retry cap on the async job — a pathological record keeps a worker slot forever.
- Forgetting to log async failures — silent failure mode; the detection pipeline quietly stops detecting.
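The last two anti-patterns can be avoided with a bounded wrapper around the job body. This is a minimal sketch; run_with_limits is a hypothetical helper, and Timeout.timeout is used only for brevity, since production job systems usually provide their own timeout and retry settings.

```ruby
require "timeout"
require "logger"

# Runs the given block with a per-attempt timeout and a cap on attempts.
# Every failure is logged, so the pipeline cannot stop detecting silently.
# Returns the block's value on success, or :abandoned once attempts run out.
def run_with_limits(max_attempts:, timeout_seconds:, logger:)
  attempts = 0
  begin
    attempts += 1
    Timeout.timeout(timeout_seconds) { yield }
  rescue StandardError => e
    # Timeout::Error is a StandardError, so timeouts land here too.
    logger.error("inspection attempt #{attempts} failed: #{e.class}")
    retry if attempts < max_attempts
    logger.error("inspection abandoned after #{attempts} attempts")
    :abandoned
  end
end
```

A job that always fails or hangs therefore releases its worker slot after at most max_attempts times timeout_seconds, and leaves a log trail for monitoring.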
Contrast: sidecar / proxy inspection
A sidecar (e.g., Envoy) or egress proxy sees responses too, but:
- Lacks the authenticated-user / policy-engine context needed for user-aware checks.
- Can't hook ORM callbacks (Phase-2 value tracking).
- Forces RPC overhead for any application-layer lookup.
App-server middleware has the context for free; it's the natural home for this pattern when the inspection needs user or DB context.
Seen in
- sources/2026-04-21-figma-visibility-at-scale-sensitive-data-exposure — Figma Response Sampling's sync+async split.