FIGMA 2026-04-21 Tier 3-equivalent

Figma — How We Built a Custom Permissions DSL at Figma

Summary

Figma's engineering team rebuilt permissions enforcement, replacing the has_access? method in its Ruby monolith (a growing tangle of if/else branches that mixed policy logic with ActiveRecord database calls, spread across the Sinatra HTTP backend and the LiveGraph realtime API) with a custom cross-platform declarative permissions DSL. The design draws inspiration from AWS IAM policies (effect + actions + resource + condition). Figma investigated Open Policy Agent, Zanzibar, and Oso and rejected all three because none matched the four problems it needed solved. The final system has three building blocks: ExpressionDef (a JSON-serializable triples-based boolean logic DSL of the shape [fieldName, op, value|ref] composed by and / or / not), ApplyEvaluator (a tiny library in each target language that walks an ExpressionDef over a dictionary of loaded data and returns true / false / null), and DatabaseLoader (a per-language, policy-unaware data loader that resolves "file.id"-style strings to concrete rows via a context_path computed from the input (resource, user)). Policies are written in TypeScript with convenience types/enums/functions that compile down to ExpressionDef JSON, so the same policy runs unchanged against Ruby / TypeScript / Go evaluators with shared test suites. The substrate also unlocks (a) a React front-end debugger that loads a user+resource, feeds data through the evaluator in the browser, and renders the boolean tree with per-node truth + data values; (b) a CLI debugger with the same tree output; (c) progressive / short-circuit data loading driven by a new third evaluator state null (indeterminate) that more than halved total permission evaluation time; and (d) static-analysis linters on policy logic (e.g. field = ref comparisons without a sibling field <> null guard) running in CI instead of at runtime. Author's framing: "If you told me when we started this project that we were going to end up designing and implementing our own bespoke authorization DSL, I wouldn't have believed you."

Key takeaways

  1. The four named problems the DSL exists to solve. (Source: article body.) (a) has_access? was monolithic and scary: a bug could leak access to every file in Figma, and engineers would ask "can we do that outside of has_access??" or "why don't we just do this in the client?" (b) Hierarchical integer permission levels plus optional boolean flags (ignore_link_access, ignore_archived_branch, org_candidate, …) produced a non-hierarchical, resource-specific matrix pretending to be hierarchical: "level 300 access does not imply level 100 + ignore_link_access: true." (c) Database-load and policy-logic coupling: has_access? was a plain Ruby function mixing both, and permissions checks were ~20% of the database load (existential against the vertical-scale ceiling documented in sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale). (d) Cross-platform drift: permissions logic had to exist in both Sinatra and LiveGraph; "many rules were not migrated properly and bugs resulted from the discrepancies."

  2. IAM as inspiration, not substrate. IAM policies give Figma the shape it wanted (effect (allow / deny), action set, resource scope, conditions) with the property that one policy author doesn't have to reason about every other existing policy. Figma explicitly calls out that IAM is "notoriously hard to work with" and not universally loved, but the isolation property is what transferred. See systems/aws-iam, concepts/fine-grained-authorization.

  3. First PoC collapsed "what data do you need" into the policy class, and it didn't scale. The initial Ruby AccessControlPolicy carried resource_type :file, effect :deny, permissions […], an attached_through :resource attribute, and a plain Ruby apply?(resource:, **_) function. "attached_through allowed attaching of resources such as the user, roles, parent resources (project, team, org), or similar. This last concept proved unintuitive and limiting (you can only load one resource?)." In addition, apply? was arbitrary Ruby, so the platform couldn't guarantee no network calls / no side effects, and cross-platform support via AST parsing of apply? felt "unreliable." Key lesson: the biggest de-risking step was porting every existing permissions rule into the PoC and getting a green CI branch: "through this work we were able to learn all the little intricacies of the permissions rules that engineers had added over the years and the product reasoning behind them." See patterns/policy-proof-of-concept-branch.

  4. JSON-serializable ExpressionDef is the load-bearing move. Figma extended an existing LiveGraph DSL based on triples into a policy logic substrate with three types: BinaryExpressionDef = [FieldName, op, Value | ExpressionArgumentRef], AndExpressionDef = { and: ExpressionDef[] }, OrExpressionDef = { or: ExpressionDef[] }, combined as ExpressionDef = BinaryExpressionDef | AndExpressionDef | OrExpressionDef. FieldName is a "table.column" string ("file.id", "team.permission"). The right side can reference another field via { type: 'field', ref: 'user.blocked_team_id' }. Key properties: easily consumable in any language (no AST parsing), statically analyzable for data dependencies via a simple recursive walk, and the API policy authors see is field-reference-only. See concepts/json-serializable-dsl, patterns/expression-def-triples.

  5. Policy authorship in TypeScript with types, enums, and composable helpers. Policies import types (FilePermission, AccountType) and helper functions (and, or, not, exists, isOrgFile(File), teamUserHasPaidStatusOnFile(File, TeamUser, "=", AccountType.RESTRICTED)) that return ExpressionDef. The result compiles to a plain ExpressionDef JSON blob that every evaluator speaks. Worked example from the article: DenyEditsForRestrictedTeamUser extends DenyFilePermissionsPolicy with permissions = [FilePermission.CAN_EDIT_CANVAS]. TypeScript was chosen because of "how widely it was already used at Figma, its type system, and how easy it was to serialize objects to JSON." LiveGraph was already TypeScript. See systems/figma-permissions-dsl.

  6. ApplyEvaluator + DatabaseLoader as the split that enables parallel evolution. The evaluator knows nothing about the database; the loader knows nothing about policies. Writing a new evaluator takes a senior engineer two to three days per language; Figma has Ruby / TypeScript / Go implementations, with shared test suites running against all three for consistency. The core algorithm: collect all policies matching the permission name, parse their ExpressionDef recursively to build a { table: [columns…] } data-dependency map, hand that plus a context_path (IDs for each resource derivable from the input file + user) to the loader, then evaluate deny-first: if any deny-policy is true, return false; else if any allow-policy is true, return true. See concepts/data-policy-separation, patterns/deny-overrides-allow.

  7. context_path as the resource-addressing primitive. Given file.has_permission?(user, CAN_EDIT), the loader must query rows, but which ones? Figma gives each ActiveRecord model a context_path returning a {symbol: id} map (file.context_path → {project: ..., team: ..., org: ..., file: ...}, user.context_path → {user: ...}) plus a merge rule at the call site that produces composite keys like org_user: [org_id, user_id] and team_role: [team_id, user_id]. New resource types register their context_path; policy authors reference "team_role.level" and never reason about which row.

  8. Three-valued short-circuit evaluation halves evaluation time. Once the separation of policy from data was clean, Figma introduced a third ApplyEvaluator state, null / indeterminate: a sub-expression that can't be resolved yet (its fields are in PENDING_LOAD state). Under and, null + false = false (exit early); under or, null + true = true (exit early). If the aggregated answer is still null, load the next batch and try again. Figma partitions the dependency set into sequential load steps by heuristic: "File, folder, and team roles are the second most common way (after link access) by which users gain access to resources; we prioritized these in our load plan." This optimization "more than halved the total execution time of our permissions evaluation." See concepts/three-valued-logic, patterns/progressive-data-loading.

  9. Policy static analysis in CI as a compile-time safety net. Because ExpressionDef is JSON, a TypeScript linter recursively walks every policy and flags two named bug classes: (a) a BinaryExpressionDef with = against a field reference where both sides could be null (this evaluates to true but is almost never the author's intent), which requires a sibling <> null guard under an and; (b) an analogous guard for <>. This is deliberately implemented as build-time static analysis rather than runtime engine enforcement: (i) no cross-platform duplication (CI runs once), (ii) faster feedback ("bugs faster and didn't have to wait to hit these while testing or in prod or staging"), (iii) "we understood that the approach of having multiple engines is only viable because the engine is so simple. We didn't want to make changes to the engine unless we really had to." See patterns/policy-static-analysis-in-ci.

  10. Evaluator simplicity ⇒ ecosystem leverage. Because the evaluator is tiny and deterministic, Figma gets three additional products cheaply: a React front-end debugger (Figma-employee tool: input user ID + resource ID → backend loads all data → data ships to the browser → React integrates ApplyEvaluator recursively to render expandable-collapsible policy trees with per-node true/false/data values); a CLI debugger with the same tree output driven by an environment-variable flag on unit tests; and a generator that extracts (query, user, permission) → (table, column) data dependencies for auditing. Figma's stated outcome: "we all but eliminated incidents and bugs caused by drifts in the logic between our Ruby and LiveGraph codebase."
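
The context_path merge described in the takeaways can be pictured with a small sketch. The article's version lives on Ruby ActiveRecord models; the TypeScript below is only an illustration, and the COMPOSITES table and the mergeContextPaths name are hypothetical, chosen to reproduce the two composite keys the text names (org_user and team_role).

```typescript
type Id = number;
type ContextPath = Record<string, Id | Id[]>;

// Hypothetical merge rule: a composite resource is keyed by the IDs of the two
// resources it joins. Only the two composites named in the text are listed.
const COMPOSITES: Record<string, [string, string]> = {
  org_user: ['org', 'user'],
  team_role: ['team', 'user'],
};

function mergeContextPaths(
  resource: Record<string, Id>,
  user: Record<string, Id>,
): ContextPath {
  const merged: ContextPath = { ...resource, ...user };
  for (const name of Object.keys(COMPOSITES)) {
    const [a, b] = COMPOSITES[name];
    const left = merged[a];
    const right = merged[b];
    // Skip composites whose parts are missing (e.g. a file with no org).
    if (typeof left === 'number' && typeof right === 'number') {
      merged[name] = [left, right];
    }
  }
  return merged;
}

// Made-up IDs standing in for file.context_path and user.context_path.
const filePath = { project: 10, team: 20, org: 30, file: 40 };
const userPath = { user: 5 };
const merged = mergeContextPaths(filePath, userPath);
console.log(merged.org_user, merged.team_role); // [ 30, 5 ] [ 20, 5 ]
```

With a map like this, a loader can resolve "team_role.level" to the row keyed by [team_id, user_id] without the policy author ever naming a row.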

Mechanics

The DSL core (TypeScript types)

export type FieldName = string;              // "file.name", "user.email"
export type Value = string | boolean | number | Date | null;

export type BinaryExpressionDef = [
  FieldName,
  '=' | '<>' | '>' | '<' | '>=' | '<=',
  Value | ExpressionArgumentRef,
];

export type ExpressionArgumentRef = { type: 'field'; ref: FieldName };

export type ExpressionDef =
  | BinaryExpressionDef
  | { or: ExpressionDef[] }
  | { and: ExpressionDef[] };
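
To make the role of ApplyEvaluator concrete, here is a minimal two-valued sketch of a recursive walk over these types. It is not Figma's implementation. Two points are assumptions: the `not` variant (absent from the declared types) is given a one-element shape inferred from the compiled example in this document, and ordering comparisons against null are treated as false.

```typescript
type FieldName = string;
type Value = string | boolean | number | Date | null;
type ExpressionArgumentRef = { type: 'field'; ref: FieldName };
type Op = '=' | '<>' | '>' | '<' | '>=' | '<=';
type BinaryExpressionDef = [FieldName, Op, Value | ExpressionArgumentRef];
type ExpressionDef =
  | BinaryExpressionDef
  | { and: ExpressionDef[] }
  | { or: ExpressionDef[] }
  | { not: [ExpressionDef] }; // assumed shape, see lead-in

// Loaded data keyed by table, then column, e.g. { file: { id: 1 } }.
type LoadedData = Record<string, Record<string, Value>>;

function lookup(data: LoadedData, field: FieldName): Value {
  const dot = field.indexOf('.');
  return data[field.slice(0, dot)]?.[field.slice(dot + 1)] ?? null;
}

function isRef(v: Value | ExpressionArgumentRef): v is ExpressionArgumentRef {
  return typeof v === 'object' && v !== null && 'ref' in v;
}

function compare(left: Value, op: Op, right: Value): boolean {
  // Dates are compared by timestamp so `=` works on distinct Date objects.
  const l = left instanceof Date ? left.getTime() : left;
  const r = right instanceof Date ? right.getTime() : right;
  if (op === '=') return l === r;
  if (op === '<>') return l !== r;
  if (l === null || r === null) return false; // assumption: null never orders
  if (op === '>') return l > r;
  if (op === '<') return l < r;
  if (op === '>=') return l >= r;
  return l <= r;
}

function evaluate(expr: ExpressionDef, data: LoadedData): boolean {
  if (Array.isArray(expr)) {
    const [field, op, rhs] = expr;
    const right = isRef(rhs) ? lookup(data, rhs.ref) : rhs;
    return compare(lookup(data, field), op, right);
  }
  if ('and' in expr) return expr.and.every((e) => evaluate(e, data));
  if ('or' in expr) return expr.or.some((e) => evaluate(e, data));
  return !evaluate(expr.not[0], data);
}

// Illustrative policy fragment and data (field names are made up).
const restricted: ExpressionDef = {
  and: [
    { not: [['file.org_id', '<>', null]] },
    ['team_user.design_paid_status', '=', 'restricted'],
  ],
};
const data: LoadedData = {
  file: { org_id: null },
  team_user: { design_paid_status: 'restricted' },
};
console.log(evaluate(restricted, data)); // true
```

The whole engine is a type guard, a comparison function, and one recursive function, which is consistent with the article's claim that a port to another language takes days rather than weeks.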

A representative policy (TypeScript authoring surface)

class DenyEditsForRestrictedTeamUser extends DenyFilePermissionsPolicy {
  description = 'This user has a viewer-restricted seat in a Pro plan, ...';

  applyFilter: ExpressionDef = {
    and: [
      not(isOrgFile(File)),
      teamUserHasPaidStatusOnFile(File, TeamUser, '=', AccountType.RESTRICTED),
    ],
  };

  permissions = [FilePermission.CAN_EDIT_CANVAS];
}

…compiles down to:

{
  "and": [
    { "not": [["file.orgId", "<>", null]] },
    { "or": [
      { "and": [
        ["file.editor_type", "=", "design"],
        ["team_user.design_paid_status", "=", "restricted"]
      ]},
      { "and": [
        ["file.editor_type", "=", "figjam"],
        ["team_user.figjam_paid_status", "=", "restricted"]
      ]}
    ]}
  ]
}
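
One way the helpers could produce that JSON is sketched below. Only the helper names and the output shape come from the article; every function body, the File / TeamUser table handles, and the AccountType object are hypothetical stand-ins written so that the result serializes to the blob shown above.

```typescript
type FieldName = string;
type Value = string | boolean | number | null;
type BinaryExpressionDef = [FieldName, '=' | '<>', Value];
type ExpressionDef =
  | BinaryExpressionDef
  | { and: ExpressionDef[] }
  | { or: ExpressionDef[] }
  | { not: [ExpressionDef] }; // shape inferred from the compiled example

// Hypothetical table handles: each one only carries a table-name prefix.
type Table = { table: string };
const File: Table = { table: 'file' };
const TeamUser: Table = { table: 'team_user' };
const AccountType = { RESTRICTED: 'restricted' } as const;

const and = (...exprs: ExpressionDef[]): ExpressionDef => ({ and: exprs });
const or = (...exprs: ExpressionDef[]): ExpressionDef => ({ or: exprs });
const not = (expr: ExpressionDef): ExpressionDef => ({ not: [expr] });

// A file belongs to an org when its orgId column is set.
const isOrgFile = (f: Table): ExpressionDef => [`${f.table}.orgId`, '<>', null];

// Expands to one branch per editor type, pairing it with that editor's seat column.
function teamUserHasPaidStatusOnFile(
  f: Table,
  tu: Table,
  op: '=' | '<>',
  status: string,
): ExpressionDef {
  const perEditor = (editor: string): ExpressionDef =>
    and(
      [`${f.table}.editor_type`, '=', editor],
      [`${tu.table}.${editor}_paid_status`, op, status],
    );
  return or(perEditor('design'), perEditor('figjam'));
}

const applyFilter: ExpressionDef = {
  and: [
    not(isOrgFile(File)),
    teamUserHasPaidStatusOnFile(File, TeamUser, '=', AccountType.RESTRICTED),
  ],
};
console.log(JSON.stringify(applyFilter, null, 2));
```

Because every helper returns plain data, "compiling" a policy is just JSON.stringify; no parser or code generator is involved.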

The evaluation algorithm

function hasPermission(resource, user, permissionName):
  policies        = ALL_POLICIES.filter(p => p.permissions.includes(permissionName))
  resourcesToLoad = policies.reduce(acc, p => acc ∪ parseDependencies(p.applyFilter))
  loadedResources = DatabaseLoader.load(resourcesToLoad)
  [denies, allows] = policies.bisect(p => p.effect == DENY)
  if denies.any(p => ApplyEvaluator.evaluate(loadedResources, p.applyFilter)): return false
  return allows.any(p => ApplyEvaluator.evaluate(loadedResources, p.applyFilter))
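
The pseudocode above can be turned into a runnable sketch under some simplifying assumptions: only = and <> are supported, there is no not variant, and DatabaseLoader is replaced by a caller-supplied load function. The Policy shape and all example data are hypothetical.

```typescript
type FieldName = string;
type Value = string | boolean | number | null;
type ExpressionArgumentRef = { type: 'field'; ref: FieldName };
type BinaryExpressionDef = [FieldName, '=' | '<>', Value | ExpressionArgumentRef];
type ExpressionDef = BinaryExpressionDef | { and: ExpressionDef[] } | { or: ExpressionDef[] };
type Policy = { effect: 'allow' | 'deny'; permissions: string[]; applyFilter: ExpressionDef };
type Loaded = Record<string, Record<string, Value>>;
type Deps = Record<string, Set<string>>; // table -> columns

// Recursively collect every "table.column" an expression mentions.
function parseDependencies(expr: ExpressionDef, deps: Deps = {}): Deps {
  const add = (field: FieldName): void => {
    const dot = field.indexOf('.');
    const table = field.slice(0, dot);
    (deps[table] ??= new Set<string>()).add(field.slice(dot + 1));
  };
  if (Array.isArray(expr)) {
    add(expr[0]);
    const rhs = expr[2];
    if (typeof rhs === 'object' && rhs !== null) add(rhs.ref);
  } else {
    const children = 'and' in expr ? expr.and : expr.or;
    children.forEach((child) => parseDependencies(child, deps));
  }
  return deps;
}

function evaluate(expr: ExpressionDef, data: Loaded): boolean {
  const get = (field: FieldName): Value => {
    const dot = field.indexOf('.');
    return data[field.slice(0, dot)]?.[field.slice(dot + 1)] ?? null;
  };
  if (Array.isArray(expr)) {
    const [field, op, rhs] = expr;
    const right = typeof rhs === 'object' && rhs !== null ? get(rhs.ref) : rhs;
    return op === '=' ? get(field) === right : get(field) !== right;
  }
  return 'and' in expr
    ? expr.and.every((e) => evaluate(e, data))
    : expr.or.some((e) => evaluate(e, data));
}

function hasPermission(
  policies: Policy[],
  permission: string,
  load: (deps: Deps) => Loaded, // stand-in for DatabaseLoader
): boolean {
  const matching = policies.filter((p) => p.permissions.includes(permission));
  const deps: Deps = {};
  matching.forEach((p) => parseDependencies(p.applyFilter, deps));
  const data = load(deps);
  const applies = (p: Policy): boolean => evaluate(p.applyFilter, data);
  // Deny overrides allow: any matching deny policy wins outright.
  if (matching.some((p) => p.effect === 'deny' && applies(p))) return false;
  return matching.some((p) => p.effect === 'allow' && applies(p));
}

const policies: Policy[] = [
  { effect: 'allow', permissions: ['can_view'], applyFilter: ['team_role.level', '<>', null] },
  {
    effect: 'deny',
    permissions: ['can_view'],
    applyFilter: ['file.team_id', '=', { type: 'field', ref: 'user.blocked_team_id' }],
  },
];
const db: Loaded = { team_role: { level: 300 }, file: { team_id: 7 }, user: { blocked_team_id: 7 } };
console.log(hasPermission(policies, 'can_view', () => db)); // false
```

In the example the allow policy matches, but the deny policy also matches (the user is blocked from team 7), so the result is false. The loader only ever sees the dependency map, never a policy object.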

With progressive loading (3-valued):

for batch in partitionedLoadPlan:
  loadedResources ∪= DatabaseLoader.load(batch)
  verdict = evaluatePolicies(loadedResources, policies)   # true / false / null
  if verdict is not null: return verdict
return verdict ?? false
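
The three-valued idea can be sketched for a single expression as follows. This is an illustration, not the article's engine: it evaluates one ExpressionDef rather than a deny/allow policy set, supports only = and <>, and treats a table that has not been loaded yet as pending. The names evaluate3, combine, and checkProgressively are hypothetical.

```typescript
type FieldName = string;
type Value = string | boolean | number | null;
type BinaryExpressionDef = [FieldName, '=' | '<>', Value];
type ExpressionDef = BinaryExpressionDef | { and: ExpressionDef[] } | { or: ExpressionDef[] };
type Verdict = boolean | null; // null = indeterminate
// A table absent from Loaded has not been fetched yet (pending load).
type Loaded = Record<string, Record<string, Value>>;

// `decisive` is the value that settles the group: false for and, true for or.
function combine(children: ExpressionDef[], data: Loaded, decisive: boolean): Verdict {
  let sawNull = false;
  for (const child of children) {
    const v = evaluate3(child, data);
    if (v === decisive) return decisive; // short-circuit even if siblings are null
    if (v === null) sawNull = true;
  }
  return sawNull ? null : !decisive;
}

function evaluate3(expr: ExpressionDef, data: Loaded): Verdict {
  if (Array.isArray(expr)) {
    const [field, op, value] = expr;
    const dot = field.indexOf('.');
    const row = data[field.slice(0, dot)];
    if (row === undefined) return null; // table still pending
    const actual = row[field.slice(dot + 1)] ?? null;
    return op === '=' ? actual === value : actual !== value;
  }
  return 'and' in expr ? combine(expr.and, data, false) : combine(expr.or, data, true);
}

function checkProgressively(
  expr: ExpressionDef,
  loadPlan: string[][], // batches of table names, most likely to decide first
  loadTable: (table: string) => Record<string, Value>,
): { verdict: boolean; tablesLoaded: number } {
  const data: Loaded = {};
  let tablesLoaded = 0;
  for (const batch of loadPlan) {
    for (const table of batch) {
      data[table] = loadTable(table);
      tablesLoaded++;
    }
    const verdict = evaluate3(expr, data);
    if (verdict !== null) return { verdict, tablesLoaded };
  }
  return { verdict: false, tablesLoaded }; // still indeterminate: fall back to false
}

// Access via link OR via a team role; link access is checked first.
const canView: ExpressionDef = {
  or: [['file.link_access', '=', 'view'], ['team_role.level', '<>', null]],
};
const plan = [['file'], ['team_role']];
const viaLink: Loaded = { file: { link_access: 'view' }, team_role: { level: null } };
const viaRole: Loaded = { file: { link_access: 'none' }, team_role: { level: 300 } };
console.log(checkProgressively(canView, plan, (t) => viaLink[t])); // { verdict: true, tablesLoaded: 1 }
console.log(checkProgressively(canView, plan, (t) => viaRole[t])); // { verdict: true, tablesLoaded: 2 }
```

In the first call the link-access branch is true after loading only the file table, so the team_role query is never issued; that skipped load is the saving the progressive plan is after.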

Static analysis (representative rule)

Disallow BinaryExpressionDef with = and a right-side ExpressionArgumentRef unless a sibling [field, '<>', null] exists under an enclosing and. Reason: null = null evaluates true in Figma's engine but is almost never the intended policy meaning.
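
A lint rule of this kind might look like the sketch below. It is a guess at the shape, not Figma's linter. Two assumptions are made: a [field, '<>', null] guard on either side of the comparison counts as sufficient (if one side is known non-null, null = null cannot occur), and guards established by an outer and remain valid inside a nested or.

```typescript
type FieldName = string;
type Value = string | boolean | number | null;
type ExpressionArgumentRef = { type: 'field'; ref: FieldName };
type BinaryExpressionDef = [FieldName, '=' | '<>', Value | ExpressionArgumentRef];
type ExpressionDef = BinaryExpressionDef | { and: ExpressionDef[] } | { or: ExpressionDef[] };

const isBinary = (e: ExpressionDef): e is BinaryExpressionDef => Array.isArray(e);
const isRef = (v: Value | ExpressionArgumentRef): v is ExpressionArgumentRef =>
  typeof v === 'object' && v !== null;

// Returns one message per unguarded `field = ref` comparison.
// `guarded` holds fields proven non-null by an enclosing `and`.
function lint(expr: ExpressionDef, guarded: Set<FieldName> = new Set<FieldName>()): string[] {
  if (isBinary(expr)) {
    const [field, op, rhs] = expr;
    if (op === '=' && isRef(rhs) && !guarded.has(field) && !guarded.has(rhs.ref)) {
      return [`${field} = ${rhs.ref} lacks a <> null guard`];
    }
    return [];
  }
  if ('and' in expr) {
    // Siblings of the form [field, '<>', null] guard every child of this `and`.
    const inner = new Set<FieldName>(guarded);
    for (const e of expr.and) {
      if (isBinary(e) && e[1] === '<>' && e[2] === null) inner.add(e[0]);
    }
    return expr.and.flatMap((e) => lint(e, inner));
  }
  // Guards from an outer `and` still hold inside an `or`.
  return expr.or.flatMap((e) => lint(e, guarded));
}

const unguarded: ExpressionDef = [
  'file.team_id',
  '=',
  { type: 'field', ref: 'user.blocked_team_id' },
];
const guardedPolicy: ExpressionDef = {
  and: [['user.blocked_team_id', '<>', null], unguarded],
};
console.log(lint(unguarded));     // [ 'file.team_id = user.blocked_team_id lacks a <> null guard' ]
console.log(lint(guardedPolicy)); // []
```

Since the check needs only the serialized policy, it can run once in CI regardless of how many evaluator languages exist.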

Architectural significance

  • Canonical realization of concepts/data-policy-separation in a production authorization engine — policies are data, data is loaded by a separate engine, and neither side can force the other into coupled changes.
  • Canonical realization of concepts/json-serializable-dsl for cross-language policy evaluation without AST parsing — same design principle as Cedar (analyzable by construction, static reasoning over policies) and the IAM policy JSON format that inspired it, but sized to a single company's codebase rather than a multi-tenant platform.
  • Canonical realization of concepts/three-valued-logic applied to authorization for short-circuit data-loading savings; analogous to SQL NULL-logic and Kleene three-valued logic but chosen specifically for the "load-another-batch-and-ask-again" control loop.
  • Canonical realization of patterns/expression-def-triples — a boolean-logic DSL built from 3-tuples + and/or/not is a recurring shape (LiveGraph's original; Elasticsearch's query DSL; MongoDB's aggregation filters; Datalog's atoms).
  • Reinforces concepts/policy-as-data — the Convera (sources/2026-02-05-aws-convera-verified-permissions-fine-grained-authorization) canonical instance is about storing Cedar policies in DynamoDB; Figma's is about making the policy language itself a serializable data structure, one level deeper.
  • Proof point that build-your-own DSL can beat off-the-shelf policy engines when the four named problems (monolithic core, non-hierarchical granularity, data-policy coupling, cross-platform drift) match a specific product shape — Figma evaluated OPA, Zanzibar, Oso and rejected all three.

Caveats / not covered

  • No numbers disclosed for (a) number of policies in production, (b) permissions evaluations/sec, (c) p50/p99 latency before vs after the migration, (d) exact % database-load reduction (only "permissions checks were around 20% of the database load" pre-DSL is given; post-DSL delta is "more than halved" evaluation time, not database load).
  • No discussion of policy authoring velocity metrics — the "engineers no longer have to reason about other policies" claim isn't quantified.
  • No discussion of how the DSL handles negation within ExpressionDef beyond not(...) appearing in code samples — the type declaration in the article doesn't list NotExpressionDef as a top-level variant; inferred from the example to exist.
  • Policy ordering / priority across Allow and Deny is not specified in detail beyond "DENY overrides ALLOW."
  • Performance of the static-analysis linter at scale not quantified (how many rules, build-time cost).
  • Trade-offs of the two-to-three-day-per-language evaluator port not enumerated; the Go evaluator is mentioned but its use case is not explained.
  • Database-loader query-plan optimization strategy not detailed beyond "full control to the backend engine … in which order, using which queries, using replicas or primary databases, using whichever interface to the database, using caching." No specific examples, no specific perf numbers.
  • Historical timeline: engineering work started in "early 2021"; exact ship date of the DSL in production not given.
  • Front-end debugger React-component architecture described but not shown; no screenshots in captured raw.
