Rolling Out Santa Without Freezing Productivity: Tips from Securing Figma's Fleet
Summary
Figma's Endpoint Security team rolled out Santa —
the Google-originated open-source macOS
binary-authorization tool — to 100% of company laptops over roughly
three months, ending with ~150 global allowlist rules, ~80,000
package-generated binary rules, a median of 3 personal rules per user,
and a P95 of 3–4 blocks per week in steady state with over 90%
resolved via self-service. The interesting content is the rollout
methodology — not the tool itself. Four load-bearing design decisions
let them ship without a productivity freeze: (1) run Santa in
monitoring mode first to mine a data-driven allowlist from real
fleet execution; (2) build a self-service Slack-based approval flow
so a block becomes a 3-second unblock for the majority of cases;
(3) auto-generate Package Rules every 30 minutes to keep Homebrew /
GitHub-release binary hashes current; (4) progress through explicit
percentage cohorts (10% → 25% → 50% → 70% → 98% → 100%) gated by
per-device unknown-binary counts, with a /santa disable escape
hatch that reverts individual machines to monitoring mode while the
team investigates. The sync server is a fork of Airbnb's
Rudolph. The post also covers rule-type
trade-offs (TeamID breadth vs SigningID precision), MDM-triggered
santactl sync cutting enforcement latency 60s → 3s, and
group-scoped permissive rules for engineers/data scientists whose
ad-hoc codesign produces per-machine binary hashes.
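
The cohort gating in the summary can be sketched as a small eligibility function. This is a minimal illustration, not Figma's implementation: the thresholds mirror the inclusion criteria listed under Key takeaways, while the `Device` type, its field names, the cohort labels, and the choice to route group-rule users straight to the final cohort are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    serial: str                      # hypothetical device identifier
    unknown_binaries_30d: int        # unique UNKNOWN binaries seen in monitoring mode
    needs_group_rules: bool = False  # assumed flag: engineers / data scientists


def earliest_cohort(device: Device) -> str:
    """Return the earliest rollout stage this device qualifies for."""
    if device.needs_group_rules:
        # Ad-hoc-signed binaries need group-scoped rules before enforcement.
        return "final-30%"
    if device.unknown_binaries_30d == 0:
        return "10%"
    if device.unknown_binaries_30d < 10:
        return "25-70%"
    # Too many unknowns: investigate and add rules before enforcing.
    return "investigate-at-98%"


fleet = [
    Device("A1", 0),
    Device("B2", 4),
    Device("C3", 27),
    Device("D4", 2, needs_group_rules=True),
]
plan = {d.serial: earliest_cohort(d) for d in fleet}
print(plan)
```

Keeping the criteria in one pure function makes it cheap to re-run against fresh monitoring data before each stage is widened.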
Key takeaways
- Monitoring mode before enforcement. Deploy Santa in passive mode for an extended period, log every binary execution across the fleet, then build the allowlist from the observed distribution of `UNKNOWN` events. The most-executed binaries get SigningID/TeamID rules first, covering "the majority of binary executions." The remaining unknowns split into three categories (Apple-developer-signed but not yet covered by a rule, unsigned and locally built, unsigned from package managers), each with its own mitigation. Classic monitoring-mode allowlist discovery. (Source: this article)
- File Access Authorization shipped first as an early win. Santa's FAA feature restricts which processes can read specific files; Figma uses it to lock browser cookies to the browser application only, "significantly reducing credential-theft risk even from scripts attempting unauthorized access." It has zero user-workflow impact, so it could ship before the harder binary-authorization work. The tool was on every laptop months before lockdown mode was enabled on any.
- Rule type trade-off: TeamID breadth vs SigningID precision. A TeamID rule allows everything signed under a developer's Apple Team ID (e.g., LogMeIn's `GFNFVT632V`): low maintenance but broad (it also allows the LogMeIn Rescue remote-access tool). A SigningID rule pins one specific signing identity (e.g., `EQHXZ8M8AV:com.google.Chrome`): precise but labour-intensive when one app has separate executables for main / helper / updater and a new version introduces new SigningIDs. Default to SigningID; reserve TeamID for highly trusted developers whose portfolios Figma wants to allow wholesale and for complex apps with multiple SigningIDs; gate new TeamID rules behind a stricter developer-portfolio review.
- Self-service Slack approval closes the loop on blocks. Rather than customise Santa's native client GUI, Figma wired the sync server to post block events to a Slack app (accessible only on Figma-managed devices). The app runs automated malware checks (ReversingLabs + internal risk signals) and, if the binary is clean, offers approve / do nothing / flag as malware buttons; approval creates a machine-specific rule applied to that user's laptop only. Canonical patterns/self-service-block-approval. Stricter policies can be imposed on higher-risk roles. Over 90% of blocks in steady state resolve via this flow (no security-team ticket).
- MDM-triggered immediate sync cuts unblock latency 60s → 3s. Santa's default pull-sync interval is 60 seconds, so a self-approved rule takes up to a minute to land and the app stays blocked in the meantime (repeated blocks annoy users). Google's internal solution uses Firebase Cloud Messaging, which is not publicly available. Figma built a package that makes an API call to their MDM server to trigger `santactl sync` on the target device, cutting enforcement latency to ~3 seconds.
- Package Rule auto-generation keeps Homebrew/GitHub rules fresh. Binary rules are keyed by SHA-256 and are invalidated by every upstream upgrade, so manual rotation doesn't scale. Figma defines high-level Package rules in config as code (e.g., `{package_type: "homebrew", package: "vim"}`) in the same GitHub repo as other Santa config. Every 30 minutes, a workflow on macOS runners fetches the current SHA-256 from the official source (Homebrew bottle, GitHub release asset) and emits fresh Binary rules to the sync server. The workflow must run on the same architecture as the fleet (ARM, x86, or both) because hashes differ per architecture. Ended up with ~80,000 auto-generated Binary rules from ~200 Package rules. Canonical patterns/package-rule-auto-generation.
- TCC-permission regression blocks self-service approvals. Self-service approval doesn't bypass other endpoint controls. If a newly self-approved application later requests Accessibility or Full Disk Access (sensitive TCC, i.e. Transparency, Consent, and Control, permissions), a separate osquery-based system detects the permission grant and automatically unsets it pending security review. In rare cases the newly requested sensitive permission leads to a global block replacing the approval.
- Staged rollout as explicit percentage cohorts with per-cohort inclusion criteria. Canonical patterns/cohort-percentage-rollout:
  - 10%: users with zero unknown binaries in the past month.
  - 25% / 50% / 70%: users with <10 unique unknown binaries, with the team investigating and adding rules as issues arose; all new hires enter lockdown mode at 70%.
  - Final 30% (engineers + data scientists): Anaconda ad-hoc-signs each Python binary locally via `codesign`, producing per-machine unique hashes; fleet-wide permissive Compiler or PathRegex rules would blow the security posture. Addressed by enhancing the sync server to support group-based rule sets (scope permissive rules to the exact group that needs them).
  - 98%: pause one month, investigate machines with >10 unknowns and machines that used `/santa disable`, add rules.
  - 100%: retire `/santa disable` and promote lockdown mode into the Endpoint Security Baseline (ESB); out-of-compliance devices are blocked from internal systems.
- `/santa disable` as explicit rollout escape hatch. During the rollout only, any user can run `/santa disable` in Slack to revert their machine to monitoring mode while security investigates their blocks. Canonical patterns/rollout-escape-hatch (sibling to the more general patterns/emergency-bypass); retired at the 100% cohort, where it would otherwise be a permanent weakening of the security posture. Monitor `/santa disable` usage as a rollout-pain signal.
- Static allowlist for critical apps hedges against sync failure. As Package Rules grew to 80K Binary rules, initial sync times for new machines stretched to several minutes, and an occasional connection drop mid-sync left an incomplete rule set, so MDM / Chrome / Slack / Zoom could be blocked on a brand-new laptop. Mitigation: static allowlist rules in Santa's local config for critical apps (MDM, Chrome, Slack, Zoom), so they work regardless of sync state. patterns/static-allowlist-for-critical-rules.
- Known limitation: Compiler rules vs `go run`. Compiler rules auto-allow binaries produced by specified compilers. This works when a binary is built with `go build` and then run. With `go run` (compile + execute in one step), Santa can't create the Binary rule for the compiled output fast enough, so the execution races the rule creation and the user gets blocked (upstream issue). Workarounds: update scripts to use `go build`, or add scoped PathRegex rules.
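
The TeamID vs SigningID trade-off from the takeaways above can be made concrete with a toy matcher. This is an illustrative sketch only, not Santa's rule engine: the `Rule` type and `matches` function are hypothetical, and the helper's signing ID is an assumed example of a second executable shipped under the same Team ID. It relies only on the `TeamID:bundle.id` shape of the SigningID shown in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    rule_type: str   # "TEAMID" or "SIGNINGID" (hypothetical labels)
    identifier: str


def team_id_of(signing_id: str) -> str:
    """Extract the Team ID prefix from a 'TEAMID:bundle.id' signing ID."""
    return signing_id.split(":", 1)[0]


def matches(rule: Rule, signing_id: str) -> bool:
    if rule.rule_type == "SIGNINGID":
        # Precise: only this exact signing identity.
        return rule.identifier == signing_id
    if rule.rule_type == "TEAMID":
        # Broad: anything signed under the developer's Team ID.
        return rule.identifier == team_id_of(signing_id)
    return False


chrome = "EQHXZ8M8AV:com.google.Chrome"
helper = "EQHXZ8M8AV:com.google.Chrome.helper"  # assumed second executable

by_signing_id = Rule("SIGNINGID", chrome)
by_team_id = Rule("TEAMID", "EQHXZ8M8AV")

for binary in (chrome, helper):
    print(binary, matches(by_signing_id, binary), matches(by_team_id, binary))
```

The SigningID rule covers only the main executable, so each helper or updater needs its own rule, while the single TeamID rule covers everything the developer ships, wanted or not.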
Production numbers disclosed
- Fleet coverage: 100% of Figma laptops in lockdown mode.
- Rollout time: ~3 months from 10% cohort to 100%.
- Global allowlist: ~150 rules using SigningID + TeamID.
- Global blocklist: ~50 rules (includes unapproved remote-access tools).
- Package rules: ~200 Package rules → ~80,000 generated Binary rules.
- Compiler rules: ~10.
- PathRegex rules: ~50 (tightly scoped; target: reduce over time).
- Personal rules: median 3 per user.
- Block volume: P95 3–4 per week per user in steady state.
- Self-service resolution rate: >90%.
- Self-approval enforcement latency: 60s → 3s via MDM-triggered `santactl sync`.
- Package-rule refresh cadence: 30 minutes, on macOS runners matching fleet architecture.
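
The Package-rule expansion behind these numbers can be sketched as a function that turns high-level Package rules into hash-pinned Binary rules. This is a minimal sketch under stated assumptions, not Figma's pipeline: `fetch_artifacts` is a hypothetical callback standing in for downloading and unpacking a Homebrew bottle or GitHub release asset on a runner of the right architecture, and the output field names are illustrative rather than a confirmed sync-server schema.

```python
import hashlib
from typing import Callable, Iterable

PackageRule = dict[str, str]
Artifact = tuple[str, bytes]  # (executable name, executable contents)


def generate_binary_rules(
    package_rules: Iterable[PackageRule],
    fetch_artifacts: Callable[[PackageRule], Iterable[Artifact]],
) -> list[dict[str, str]]:
    """Expand each Package rule into one Binary rule per fetched executable."""
    rules = []
    for pkg in package_rules:
        for name, content in fetch_artifacts(pkg):
            rules.append({
                "rule_type": "BINARY",    # illustrative field names
                "policy": "ALLOWLIST",
                "identifier": hashlib.sha256(content).hexdigest(),
                "comment": f"{pkg['package_type']}:{pkg['package']}:{name}",
            })
    return rules


def fake_fetch(pkg: PackageRule) -> list[Artifact]:
    # Stand-in for a real download; returns deterministic fake bytes.
    return [(pkg["package"], f"{pkg['package']}-arm64-build".encode())]


package_rules = [{"package_type": "homebrew", "package": "vim"}]
binary_rules = generate_binary_rules(package_rules, fake_fetch)
print(binary_rules[0]["comment"], binary_rules[0]["identifier"][:12])
```

Because the hash is recomputed from whatever the official source currently serves, re-running this on a schedule (every 30 minutes in the source) is what keeps rules valid across upstream upgrades.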
Caveats / scope
- Security-ops / endpoint-management content, not core distributed-systems internals. Ingested under Tier-3-selective Figma-equivalent scope because the rollout methodology + self-service unblock loop + config-as-code package-rule system + cohort rollout with per-cohort inclusion criteria are reusable at-scale operational-infra patterns (feature-flag / canary / progressive-delivery lineage), and the post discloses real fleet-level numbers.
- Not ingested: malware-specific content (ReversingLabs call-outs, CVE references) and Figma's osquery-based TCC enforcement (one paragraph, no internals). `figma-endpoint-security-baseline` is a policy bundle, not a new system page worth creating.
Source
- Original: https://www.figma.com/blog/rolling-out-santa-without-freezing-productivity/
- Raw markdown: `raw/figma/2026-04-21-rolling-out-santa-without-freezing-productivity-tips-from-se-d68eded2.md`
- Cross-reference: Figma's modern endpoint strategy (referenced but not yet ingested).
- Upstream: Santa project (North Pole Security, the Santa maintainers post-Google-release).
- Upstream: Rudolph sync server (Airbnb OSS, Figma's sync-server base).
Related
- systems/santa — the client being rolled out.
- systems/rudolph — the Airbnb-OSS sync server Figma forked.
- concepts/binary-authorization — the security control being operationalised.
- concepts/device-trust — complementary workstation-security posture.
- patterns/data-driven-allowlist-monitoring-mode — the monitoring-first rollout methodology.
- patterns/self-service-block-approval — the Slack-based unblock loop.
- patterns/package-rule-auto-generation — the config-as-code hash-rotation pipeline.
- patterns/cohort-percentage-rollout — the 10% → 100% staged rollout.
- patterns/rollout-escape-hatch — the `/santa disable` per-machine revert.
- patterns/static-allowlist-for-critical-rules — the hedge against sync-timeout at 80K rules.
- companies/figma — Figma company page.