
Airbnb: Privacy-first connections — Empowering social experiences

Summary

Airbnb is adding social features to Experiences (co-guest messaging, "Who's going" lists, a Connections profile section), and this post describes the identity model built to keep those features privacy-first. The core architectural move is decoupling a single internal User ID from multiple context-scoped Profile IDs, so the same user can appear differently, or not at all, across different Experiences and across host vs. guest contexts. Authorization is enforced at the data layer by Himeji, Airbnb's centralized authorization system. Its main performance technique is denormalizing relations at write time so that read-time permission checks stay fast as the privacy graph grows. Rollout required a company-wide audit-then-refactor migration (Python audit scripts → team ownership mapping → manual review → AI-assisted refactor with human-in-the-loop → strong typing and tests as a safety net) to ensure every access site used the right identifier in the right context.
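
The write-time denormalization idea can be illustrated with a toy permission index. This is a minimal sketch, not Himeji's data model or API: the class `PermissionIndex`, its methods, and the "co-guests on the same Experience may view each other" rule are all assumptions made for illustration. It also assumes each Profile ID belongs to exactly one Experience, so a viewer/target pair is granted by at most one booking.

```python
from collections import defaultdict


class PermissionIndex:
    """Toy authorization store (hypothetical; not Himeji's actual design)."""

    def __init__(self):
        # Source-of-truth relation: experience_id -> set of guest profile ids.
        self.experience_guests = defaultdict(set)
        # Denormalized relation, kept up to date on every write:
        # (viewer_profile_id, target_profile_id) pairs allowed to view.
        self.can_view = set()

    def add_guest(self, experience_id, profile_id):
        """Write path: record the booking and precompute derived pairs."""
        for other in self.experience_guests[experience_id]:
            self.can_view.add((other, profile_id))
            self.can_view.add((profile_id, other))
        self.experience_guests[experience_id].add(profile_id)

    def remove_guest(self, experience_id, profile_id):
        """Write path: drop the booking and the pairs derived from it."""
        self.experience_guests[experience_id].discard(profile_id)
        for other in self.experience_guests[experience_id]:
            self.can_view.discard((other, profile_id))
            self.can_view.discard((profile_id, other))

    def check(self, viewer, target):
        """Read path: one set lookup, no graph traversal."""
        return (viewer, target) in self.can_view
```

The cost moves to the write side: adding a guest to an Experience with n existing guests inserts 2n pairs, while each read remains a single hash lookup. The post gives no figures for this write amplification in Himeji.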

Key takeaways

  1. Separate the internal entity from its public representations. User = complete internal record (name, email, phone, account). Profile = public-facing subset; a user can have multiple. Distinct types: Host Profile, Guest Profile, and now Experience-specific Guest Profile scoped to a single Experience.
  2. One User ID, many Profile IDs. Decoupling the identifiers is what makes context-aware privacy possible at the platform level — "each profile is visible only where it's relevant," and it's harder to link profiles across contexts (e.g., host profile ↔ guest profile on an unrelated Experience). This is concepts/identity-decoupling as a privacy primitive.
  3. Per-Experience guest profiles give per-event opt-in. A guest who goes private on an Experience surfaces only their first name to co-guests outside their travel group; no photo, no guest stats, no "About me." Opting in on one Experience doesn't leak data to another.
  4. Himeji enforces at the data layer, not the UI. The in-house authorization system runs permission checks at the data access layer so privacy survives bugs in any particular UI or endpoint. Prior post: Himeji: a scalable centralized system for authorization at Airbnb.
  5. Write-time relation denormalization = fast read-time checks. Himeji precomputes derived relations when profile info or permissions change, trading extra write work for near-constant-time permission lookups on the hot path: the classic denormalize-for-read pattern applied to access control. See systems/himeji.
  6. Least-privileged access as the default. Every actor class (fellow travelers, hosts, support personnel) sees only what their interaction actually requires.
  7. Migration pattern: audit scripts → ownership map → manual review → AI-assisted refactor → type safety. Python scripts located all user-data access sites by pattern match; directory layout mapped each site to an owning team; owners manually triaged internal vs. external use; AI refactoring tools proposed code changes with engineers always in the loop; strong typing prevented User IDs and Profile IDs from being swapped. Company-wide coordination (eng, product, privacy, legal) was called out as the critical ingredient.
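
The first two migration steps in takeaway 7 (pattern-matching audit, then mapping hits to owners by directory) might look roughly like the sketch below. The post does not publish its scripts, so everything here is an assumption: the regular expression `ACCESS_PATTERN`, the `OWNERS` directory-to-team map, and the `audit` function are hypothetical names chosen for illustration.

```python
import re
from pathlib import Path

# Hypothetical pattern; the post does not say what was actually matched.
ACCESS_PATTERN = re.compile(r"\buser_id\b|\bUserId\b")

# Hypothetical map from top-level directory to owning team.
OWNERS = {"payments": "team-payments", "messaging": "team-messaging"}


def audit(root, suffixes=(".py",)):
    """Return (team, relative_path, line_number, line_text) for each match."""
    root = Path(root)
    findings = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        rel = path.relative_to(root)
        # Ownership comes from the first path component; unknown dirs are flagged.
        team = OWNERS.get(rel.parts[0], "unowned")
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if ACCESS_PATTERN.search(line):
                findings.append((team, rel.as_posix(), lineno, line.strip()))
    return findings
```

A run such as `audit("path/to/repo")` yields a flat list that can be grouped by team and handed to owners for the manual internal-vs-external triage described above. Pattern matching over-reports by design; the human review step is what filters false positives.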

Architectural nuggets

  • Identifier discipline: User ID and Profile ID are different types, not just different values — linters and type checks refuse to mix them. This is the enforcement mechanism that makes the privacy design hold up under normal feature velocity.
  • Profile materialization at booking time: Profile A is created for Marie when she joins "Pasta Making with Nonna"; Profile B is created when she joins "Goat Yoga." Each profile's visibility rules live with it, not with the underlying user.
  • Host vs. guest unlinkability: Alex hosting a cabin and Alex joining Goat Yoga as a guest are unlinkable from Marie's point of view if he opts to stay private, even though they're the same user internally.
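
The three nuggets above can be combined into one small sketch: distinct ID types, a profile created per Experience, and visibility rules stored on the profile. All names (`UserId`, `ProfileId`, `ExperienceProfile`, `ProfileRegistry`) are hypothetical, and the runtime `isinstance` checks stand in for the static type checking the post describes. Airbnb's actual types and storage are not shown in the source.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UserId:
    """Internal identifier; should never reach another guest's client."""
    value: int


@dataclass(frozen=True)
class ProfileId:
    """Context-scoped identifier; opaque and not derived from the UserId."""
    value: str


@dataclass(frozen=True)
class ExperienceProfile:
    profile_id: ProfileId
    experience: str
    first_name: str
    private: bool
    photo_url: Optional[str] = None
    about_me: Optional[str] = None

    def public_view(self) -> dict:
        """Fields shown to co-guests outside the travel group."""
        view = {"profile_id": self.profile_id.value, "first_name": self.first_name}
        if not self.private:
            view["photo_url"] = self.photo_url
            view["about_me"] = self.about_me
        return view


class ProfileRegistry:
    """Holds the user-to-profile mapping, which stays internal."""

    def __init__(self) -> None:
        self._profiles = {}  # ProfileId -> ExperienceProfile
        self._owner = {}     # ProfileId -> UserId (never exposed)

    def create_profile(self, user_id, experience, first_name, *,
                       private, photo_url=None, about_me=None):
        if not isinstance(user_id, UserId):
            raise TypeError("create_profile expects a UserId")
        # A random id makes profiles hard to link back to the user or each other.
        profile_id = ProfileId(uuid.uuid4().hex)
        profile = ExperienceProfile(profile_id, experience, first_name,
                                    private, photo_url, about_me)
        self._profiles[profile_id] = profile
        self._owner[profile_id] = user_id
        return profile

    def lookup(self, profile_id):
        if not isinstance(profile_id, ProfileId):
            raise TypeError("lookup expects a ProfileId, not a UserId")
        return self._profiles[profile_id].public_view()
```

With this shape, Marie's "Pasta Making with Nonna" and "Goat Yoga" bookings produce two unrelated Profile IDs, and going private on one leaves the other's public view unchanged. Passing a `UserId` to `lookup` fails immediately instead of silently resolving to the wrong record.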

Caveats

  • Post is mostly product/architecture narrative; no latency numbers, no scale figures (e.g., Himeji QPS, number of profiles per user, write-amplification cost of denormalization).
  • AI-powered refactoring is described generically ("AI-powered refactoring tools") without naming the tool or the acceptance rate.
  • Borderline for this wiki (privacy/product framing), included because of the Himeji authorization details and the identity-decoupling architectural pattern.

Source
