Figma Response Sampling¶
What it is¶
Figma's in-house security-detection system for sensitive data
exposure in API responses. A configurable fraction of outbound
responses from Figma's Ruby application server is asynchronously
inspected for: (Phase 1) file identifiers that the requesting user
should not have access to, and (Phase 2) any field tagged as
banned_from_clients by FigTag. Runs in both
staging and production as an observability layer on top of
PermissionsV2 — the detection
complement to prevention.
Architecture¶
Enforcement point: Ruby after filter + async jobs¶
- Implemented as middleware in the Ruby application server, using a built-in after block that runs after every request completes, giving a consistent place to inspect responses before they ship to the client.
- Sampling is uniform-random across request paths at a configurable rate, tuned to balance coverage against overhead.
- Non-blocking: if sampling or verification fails, the request still completes normally; errors are logged for monitoring.
- Verification is executed in async jobs: the after filter extracts candidates synchronously, enqueues the check, and returns.
- Rate limiting on the processing pipeline prevents resource exhaustion under surge.
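The sampling and non-blocking behavior described above can be sketched in plain Ruby. This is a minimal illustration, not Figma's implementation: `ResponseSampler`, `after_request`, and the in-process `Queue` standing in for the async job system are all hypothetical names chosen for the example.

```ruby
# Hypothetical sketch; class and method names are illustrative assumptions.
class ResponseSampler
  attr_reader :queue

  def initialize(sample_rate:, rng: Random.new, logger: ->(msg) { warn(msg) })
    @sample_rate = sample_rate # fraction of responses to inspect, 0.0..1.0
    @rng = rng
    @logger = logger
    @queue = Queue.new         # in-process stand-in for an async job system
  end

  # Uniform-random decision that does not depend on the request path.
  def sampled?
    @rng.rand < @sample_rate
  end

  # Meant to be called from an after hook once the response body exists.
  # Any failure is logged and swallowed so the request completes normally.
  def after_request(user_id:, path:, body:)
    return false unless sampled?

    @queue << { user_id: user_id, path: path, body: body }
    true
  rescue StandardError => e
    @logger.call("response sampling failed: #{e.class}: #{e.message}")
    false
  end
end
```

In a framework that offers after hooks (for example Sinatra's `after` block or Rails' `after_action`), the hook body would only need to call `after_request`; the verification itself would run in a worker that drains the queue.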
Why middleware in the app server (not an Envoy proxy)¶
The app-server layer gives middleware direct access to:
- The authenticated user object — needed to evaluate permissions.
- The full API response body — needed to scan for sensitive identifiers/values.
- The internal permissions engine (PermissionsV2).
Doing this in Envoy or another proxy would require reconstructing user context and would make user-aware permission checks "significantly harder" — the three capabilities above exist together only at the application tier.
Phase 1 — Permission Auditor (file identifiers)¶
The bootstrapping implementation. File identifiers are the ideal starter data type because:
- Sensitivity and access rules are already well-defined in PermissionsV2.
- They are "high-entropy capability tokens with a known character set and consistent length" — trivial to detect in JSON bodies.
Flow:
- The after filter parses the JSON response body.
- Extracts any strings matching the file-identifier shape.
- Enqueues an async job per identifier to re-verify user × identifier access via PermissionsV2.
- False-positive-suppression logic accounts for known safe cases (e.g., identifiers that are legitimately visible in a given endpoint's contract).
- Unexpected findings land in the analytics warehouse + triage dashboards.
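The flow above might look roughly like the following sketch. The identifier shape (22 alphanumeric characters), the `can_view` lambda standing in for a PermissionsV2 lookup, and the `safe_paths` list used for false-positive suppression are all assumptions made for illustration; the post does not disclose the real pattern or suppression rules.

```ruby
require "json"

# Assumed identifier shape, for illustration only.
FILE_ID_PATTERN = /\A[A-Za-z0-9]{22}\z/

# Recursively collect every string value in parsed JSON that matches the shape.
def extract_file_ids(node, found = [])
  case node
  when Hash
    node.each_value { |v| extract_file_ids(v, found) }
  when Array
    node.each { |v| extract_file_ids(v, found) }
  when String
    found << node if FILE_ID_PATTERN.match?(node)
  end
  found.uniq
end

# Returns one finding per identifier the user cannot view.
# can_view is a hypothetical stand-in for a PermissionsV2 check.
def audit_response(user_id:, path:, body:, can_view:, safe_paths: [])
  return [] if safe_paths.include?(path) # known-safe endpoint contract

  extract_file_ids(JSON.parse(body))
    .reject { |id| can_view.call(user_id, id) }
    .map { |id| { user_id: user_id, path: path, file_id: id } }
end
```

In the described system each identifier is re-verified in its own async job; the sketch runs the checks inline only to keep the example short.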
Findings surfaced within days: file identifiers returned in responses unnecessarily (triggered better data filtering), paths where files bypassed permission checks entirely (gaps closed).
Phase 2 — Sensitive Data Analyzer ("fancy Response Sampling")¶
Generalizes the same pipeline to any column tagged
banned_from_clients by FigTag. Rather than
scanning the response body for a known pattern, Figma tracks which
sensitive values were loaded during the request:
- FigTag annotates every DB column with a sensitivity category; annotations propagate to the data warehouse and are queryable at request time.
- An ActiveRecord callback fires whenever a record with a banned_from_clients column loads; on sampled requests, it records the loaded value into request-local storage. This avoids global overhead for unsampled requests.
- After response generation, the after filter inspects the serialized JSON and compares it against the recorded sensitive values.
- If any sensitive value appears in the response, a finding is logged; results flow through the same unified warehouse + dashboards as Phase 1.
Why the callback+request-local trick: alternative approaches (post-hoc DB log scraping, response-body regex) can't tell whether a matched string was actually the sensitive database value or a coincidence, and can't scope per-request. The callback pins down exactly which sensitive values this request touched, so the inspection is precise without a static schema.
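A minimal sketch of the record-then-compare idea, without Rails: `SensitiveValueTracker` and its methods are hypothetical names, request-local storage is approximated with `Thread.current`, and the comparison uses exact matches against JSON leaf values, which is an assumption since the post does not say how matching is done. In a real app, `record` would be invoked from a model load callback.

```ruby
require "json"

# Hypothetical sketch; names and matching strategy are assumptions.
module SensitiveValueTracker
  KEY = :sampled_sensitive_values

  # Begin recording; called only when the current request is sampled.
  def self.start
    Thread.current[KEY] = []
  end

  # No-op on unsampled requests, so they pay almost no overhead.
  def self.record(column, value)
    store = Thread.current[KEY]
    store << { column: column, value: value.to_s } if store && !value.nil?
  end

  # Return recorded values that appear as leaf values in the response,
  # then clear the request-local state.
  def self.finish(response_json)
    recorded = Thread.current[KEY] || []
    leaves = collect_leaves(JSON.parse(response_json))
    recorded.select { |r| leaves.include?(r[:value]) }
  ensure
    Thread.current[KEY] = nil
  end

  # Flatten parsed JSON into the string form of every scalar leaf.
  def self.collect_leaves(node, out = [])
    case node
    when Hash  then node.each_value { |v| collect_leaves(v, out) }
    when Array then node.each { |v| collect_leaves(v, out) }
    when nil   then nil
    else out << node.to_s
    end
    out
  end
end
```

Because only values actually loaded during the request are compared, a coincidental string in the body that was never read from a tagged column produces no finding.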
Cross-service integration (LiveGraph)¶
LiveGraph, Figma's real-time data-fetching service, submits sampled responses to an internal endpoint that funnels into the same processing pipeline. Keeps performance predictable:
- Sampling in LiveGraph gated by configuration + rate limiting.
- After LiveGraph produces a response, a lightweight API call hands the sampled data off; LiveGraph's real-time data flow is unaffected.
- Findings share the same schema and logging path — on-call engineers interpret alerts uniformly across sources.
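One way to gate the hand-off so that surges cannot overwhelm the shared pipeline is a simple fixed-window limiter. This is only an illustrative sketch: `HandoffLimiter`, its `max_per_second` parameter, and the drop-on-excess policy are assumptions, since the post does not describe the limiting algorithm.

```ruby
# Hypothetical sketch of a fixed-window rate limiter for sample hand-off.
class HandoffLimiter
  def initialize(max_per_second:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @max = max_per_second
    @clock = clock   # injectable so the limiter can be tested deterministically
    @window = nil    # whole-second window currently being counted
    @count = 0
  end

  # True if this sample may be submitted; excess samples are simply dropped.
  def allow?
    now = @clock.call.floor
    if now != @window
      @window = now
      @count = 0
    end
    return false if @count >= @max

    @count += 1
    true
  end
end
```

Dropping excess samples rather than queueing them fits a sampling system: a skipped sample costs a little coverage, while the producing service's own latency stays unaffected.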
Allowlisting (dynamic)¶
A flexible allowlisting process excludes endpoints with intentional, safe exposure (e.g., an OAuth client secret returned by a dedicated credential-management endpoint to an authorized user). Same value appearing in an unrelated response = critical finding. Config-driven (no redeploy), per-endpoint / per-field — this is what keeps the FP rate low enough for engineers to trust the alerts (patterns/dynamic-allowlist-for-safe-exposure).
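A config-driven, per-endpoint and per-field allowlist could be sketched as follows. The JSON rule format, the `ExposureAllowlist` class, and the `triage` helper are hypothetical; the post only states that the allowlist is dynamic and scoped by endpoint and field.

```ruby
require "json"

# Hypothetical sketch; the rule format is an assumption.
class ExposureAllowlist
  def initialize(config_json)
    reload(config_json)
  end

  # Swap in new rules at runtime, so changes need no redeploy.
  def reload(config_json)
    @rules = JSON.parse(config_json).map { |r| [r.fetch("endpoint"), r.fetch("field")] }
  end

  def allowed?(endpoint:, field:)
    @rules.include?([endpoint, field])
  end
end

# The same field is suppressed on its allowlisted endpoint and critical elsewhere.
def triage(finding, allowlist)
  if allowlist.allowed?(endpoint: finding[:endpoint], field: finding[:field])
    :suppressed
  else
    :critical
  end
end
```

Keying rules on the endpoint and field pair, rather than on the field alone, is what lets an intentional exposure stay quiet while the same value in an unrelated response still raises a finding.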
Deployment posture¶
- Staging + production concurrently — two lines of defense: early detection before release + regression monitoring in prod.
- Asynchronous everywhere on the verification path so the user doesn't pay latency.
- Rate-limited pipeline to bound infrastructure cost.
Impact (disclosed in post)¶
- Caught long-unused data fields leaking into certain responses → targeted fix.
- Surfaced cases where related-resource data was included without a clear need → clean-up work.
- Highlighted responses returning a list of resources without verifying access for each item → stronger per-item permission checks.
- Closed authorization paths that bypassed permission checks entirely for file access.
Limits (gaps in the post)¶
- No disclosed sampling rate, QPS, or latency-overhead numbers.
- No disclosed false-positive or true-positive rate.
- No detail on async-job substrate (worker pool, retry, DLQ).
- Phase 2 is currently scoped to columns traversed by ActiveRecord; non-ORM data paths would need a parallel hook.
- Future work named in the post: finer-grained sampling controls, automated triage, richer trend reporting, broader PII + regulated data coverage, extension to non-API interaction channels.
Seen in¶
- sources/2026-04-21-figma-visibility-at-scale-sensitive-data-exposure — the canonical post introducing Phase 1 + Phase 2 + the cross-service LiveGraph integration.