PATTERN Cited by 1 source

Response sampling for authorization detection

Shape

Sample a configurable fraction of outbound API responses. Extract identifiers of permission-gated resources from the response body. Asynchronously re-verify that the requesting user had access to each identifier, against the canonical authorization engine. Log unexpected results for triage.

The pattern complements preventive authorization, which has to be right at every call site, with a detective control: an observer that spot-checks real responses continuously and catches the authorization flaws that slipped past prevention.

When to reach for it

  • You have a mature preventive authz engine (DSL, policy store, permission decisions) — prevention is in place, but you can't prove the absence of every IDOR / broken object-level authz bug.
  • Your response bodies contain resource identifiers (URL slugs, high-entropy IDs, etc.) that can be extracted with a regex or a schema walk.
  • The authz engine is callable in-process from the app server at arbitrary times (not only at request entry).
  • Your on-call / security triage workflow can absorb findings at whatever the sampler's rate produces.

Reference implementation — Figma Response Sampling (Phase 1)

From systems/figma-response-sampling (Source: sources/2026-04-21-figma-visibility-at-scale-sensitive-data-exposure):

  1. Hook point: Ruby after filter (runs once the request handler has finished, before the response is sent).
  2. Sample decision: uniform-random at a configurable rate across request paths.
  3. Extraction: parse JSON body, match file-identifier shape (high-entropy capability tokens, known charset + length).
  4. Async verification: enqueue a background job per extracted identifier. Job calls PermissionsV2 with (requesting_user, file_id) and records the decision.
  5. False-positive filter: rules encode known safe cases (e.g., identifiers that are part of the endpoint's public contract).
  6. Logging: unexpected findings land in the analytics warehouse + triage dashboards for on-call review.
  7. Non-blocking: sampling/verification errors never fail the user-facing request.
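
The seven steps above can be sketched end to end. This is a minimal, language-agnostic illustration in Python, not Figma's Ruby implementation: the names `after_response`, `run_jobs`, `check_access`, and `KNOWN_SAFE`, the 1% rate, and the 22-character identifier shape are all assumptions made for the example, and the in-memory queue stands in for a real background job system.

```python
import json
import logging
import queue
import random
import re

log = logging.getLogger("response_sampling")

SAMPLE_RATE = 0.01                               # assumed: sample 1% of responses
FILE_ID_RE = re.compile(r"[A-Za-z0-9]{22}")      # assumed identifier shape
KNOWN_SAFE = {("/api/community", "file_key")}    # assumed (path, field) pairs in a public contract

jobs = queue.Queue()  # stand-in for a real background job queue


def extract_ids(node, field=None):
    """Yield (field, value) for every string in the JSON tree shaped like a file ID."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from extract_ids(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from extract_ids(item, field)
    elif isinstance(node, str) and FILE_ID_RE.fullmatch(node):
        yield field, node


def after_response(user_id, path, body, rng=random.random):
    """After-filter hook: sample, extract, enqueue. It must never raise."""
    try:
        if rng() >= SAMPLE_RATE:
            return
        for field, file_id in extract_ids(json.loads(body)):
            jobs.put((user_id, path, field, file_id))
    except Exception:
        # Non-blocking: a sampler failure is logged and the request proceeds.
        log.exception("response sampling failed; request unaffected")


def run_jobs(check_access):
    """Worker side: re-verify each identifier and return the unexpected ones."""
    findings = []
    while not jobs.empty():
        user_id, path, field, file_id = jobs.get()
        if check_access(user_id, file_id):
            continue                              # expected: user has access
        if (path, field) in KNOWN_SAFE:
            continue                              # false-positive filter
        finding = {"user": user_id, "path": path, "file_id": file_id}
        findings.append(finding)
        log.warning("unexpected exposure: %s", finding)
    return findings
```

In this sketch `check_access` plays the role of the canonical authorization engine (PermissionsV2 in the reference implementation), and the returned findings would be written to the warehouse rather than kept in a list.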

Production findings (what this actually catches)

  • Endpoint over-return: a response included file IDs it did not need to return, which led to tighter data filtering.
  • Legacy path bypass: code paths that served files without going through permission checks at all; these gaps were closed.
  • List without per-item check: responses returning a list of resources where access was verified on the parent but not on each child; per-item permission checks were added.
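
To make the third finding concrete, here is a toy illustration of the "parent checked, children not" shape and of what re-verification would flag. The ACL, resource names, and function names are hypothetical and exist only for this example.

```python
# Hypothetical ACL: the set of (user, resource) pairs that are allowed.
ACL = {("alice", "project:1"), ("alice", "file:a"), ("alice", "file:b")}
PROJECT_FILES = {"project:1": ["file:a", "file:b", "file:secret"]}


def can_view(user, resource):
    return (user, resource) in ACL


def list_files_parent_only(user, project):
    """Buggy shape: access is verified on the parent, then every child is returned."""
    if not can_view(user, project):
        raise PermissionError(project)
    return list(PROJECT_FILES[project])


def list_files_per_item(user, project):
    """Fixed shape: the parent check stays, and each child is checked as well."""
    if not can_view(user, project):
        raise PermissionError(project)
    return [f for f in PROJECT_FILES[project] if can_view(user, f)]


def unexpected_ids(user, returned_ids):
    """What the sampler's re-verification would flag for a given response."""
    return [f for f in returned_ids if not can_view(user, f)]
```

With this data, re-verifying the parent-only response flags `file:secret`, while the per-item version produces no findings.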

Each of these is a class of bug that code review can miss, tests can miss, and a pentest may or may not exercise. Response sampling exercises them continuously at production scale.

Design knobs

  • Sampling rate. Coverage vs overhead; tune from telemetry.
  • Per-path weighting. Consider higher rates on newer or historically risky endpoints.
  • Identifier extraction strategy. Shape-based regex (cheap, approximate) vs schema-guided walk (precise, tied to response-type definitions).
  • Decision record retention. How long triage dashboards keep findings before aggregation.
  • Allowlist discipline. Every false-positive suppressor is a rule that can decay into under-detection — keep them scoped and reviewed (patterns/dynamic-allowlist-for-safe-exposure).

Why it requires middleware (not a proxy)

A proxy layer like Envoy sees request/response wire traffic but lacks:

  • The authenticated user object.
  • The authorization engine's execution context.
  • The ability to correlate response fields with the DB rows that produced them.

Application-server middleware has all three. Doing this at the proxy would require rebuilding user context and calling authz over RPC, which is slower and harder to keep correct.

Anti-patterns

  • Synchronous verification on the hot path. Adds p99 tail latency that users see directly. Always verify asynchronously.
  • Sampling only the happy path. The interesting failures are often in rare branches; uniform sampling across all request paths avoids blind spots.
  • Alerting on every extracted identifier. Until the false-positive filter covers the known-safe cases, the alert channel drowns and engineers stop listening.
  • Single global sampler without rate limiting. Traffic surges amplify into sampler-infra overload.
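
One way to avoid the last anti-pattern, while also supporting the per-path weighting knob, is to put a token bucket behind the sample decision. The sketch below is an assumption-laden illustration: the class name `BudgetedSampler`, its parameters, and the default budget are hypothetical, and the bucket is per-process and not thread-safe as written.

```python
import random
import time


class BudgetedSampler:
    """Per-path sampling rates with a token-bucket cap on sampled responses."""

    def __init__(self, default_rate, path_rates=None, max_per_sec=50.0,
                 clock=time.monotonic, rng=random.random):
        self.default_rate = default_rate
        self.path_rates = path_rates or {}   # overrides for specific endpoints
        self.max_per_sec = max_per_sec       # refill rate and bucket capacity
        self.clock = clock
        self.rng = rng
        self.tokens = max_per_sec            # bucket starts full
        self.last = clock()

    def should_sample(self, path):
        # Step 1: probabilistic decision, weighted per path.
        rate = self.path_rates.get(path, self.default_rate)
        if self.rng() >= rate:
            return False
        # Step 2: refill the bucket for the time elapsed, up to capacity.
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.max_per_sec, self.tokens + elapsed * self.max_per_sec)
        self.last = now
        # Step 3: spend a token, or drop the sample (never the request).
        if self.tokens < 1.0:
            return False
        self.tokens -= 1.0
        return True
```

During a traffic surge the number of sampled responses stays bounded by `max_per_sec`, so the effective sampling rate falls instead of the verification queue growing without limit.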
