Skip to content

PATTERN Cited by 1 source

Polymorphic usage tables for multi-tenant scale

Pattern

When a multi-tenant SaaS needs to store per-tenant metadata or usage data for many entity types (fields, options, roles, schemes, work types, …), use a small fixed set of generic "polymorphic" tables keyed by something like (tenant_id, entity_type, entity_id, …) rather than one dedicated table per entity type. The decision avoids the combinatorial explosion of database objects that per-entity-type-table designs cause when the tenant count is large.

The 2026-05-14 Atlassian post is the first canonical wiki home for this pattern. Verbatim:

"Naively adding one dedicated table per entity doesn't scale in Jira's multi-tenant architecture: a single new table becomes millions of tables across production, plus indexes and internal database objects. So we used two layers: ... For persistent pre-computation, we store data in the Jira relational database using a small set of generic tables (polymorphic 'usage' tables) instead of one table per entity type."

(Source: sources/2026-05-14-atlassian-optimisation-tools-for-jira-reducing-configuration-bloat)

Problem

A naïve schema decision "one table per entity type" multi- plies into a problem at multi-tenant scale:

tables_added = entity_types × tenants  (per-tenant DB design)
total_db_objects ≈ tables_added × (1 + N indexes + M internals)

For Atlassian-scale Jira, each new dedicated table per entity type "becomes millions of tables across production, plus indexes and internal database objects."

The cost shape:

  • Postgres metadata RAM — each backend caches catalog entries for tables it touches; see concepts/catalog-bloat-multi-tenant for the Postgres-altitude cousin failure.
  • Operational overhead — schema migrations, monitoring, backup verification all per-table.
  • Internal DB-object pressure — indexes, constraints, triggers, sequences multiply alongside the tables.

A second naïve solution — use a wide JSON column on a single table — sacrifices index efficiency and schema validation; this pattern occupies the principled middle ground.

Solution

Use a small set of generic tables with shape like:

generic_usage_table
  tenant_id        bigint
  entity_type      enum-or-string  -- 'field' | 'option' | 'role' | ...
  entity_id        bigint
  scope_id         bigint          -- e.g. space, scheme, ...
  usage_count      bigint
  last_used_at     timestamp
  computed_at      timestamp
  ...
PRIMARY KEY (tenant_id, entity_type, entity_id, scope_id)

(schematic, not Atlassian-disclosed)

Several discrete tables can co-exist where the column shapes diverge enough to warrant separation, but the count is bounded by what columns matter rather than how many entity types exist.

What "small set" means

The 2026-05-14 post says "a small set of generic tables" — not a single mega-table. The number is bounded by the shape of the queries the data must support:

  • Usage counts — one table.
  • Last-used timestamps — possibly the same table.
  • Per-entity associations (which scheme contains which field) — possibly a different shape, separate table.

The discipline: avoid one-table-per-entity-type, not "merge everything into one table no matter what."

When to use

Use polymorphic usage tables when:

  • Tenant count × entity-type count is large enough that per-entity-type tables would exceed sensible operational bounds (Atlassian-scale: millions of tables).
  • Query patterns are predictable — usage queries are generally "how many (entity_type, entity_id) rows for this (tenant_id, scope_id)", not arbitrary schema-walking.
  • Per-entity-type column variance is bounded — the data per entity type fits the shared column shape without requiring per-row JSON columns or side-table joins that defeat the point.
  • The dataset is read-heavy and refreshed in batch — the typical SaaS-precomputation profile.

When not to use

Avoid polymorphic usage tables when:

  • Tenant count is bounded (single-digit or low hundreds) — per-entity-type tables work fine, schema stays clean.
  • Per-entity-type queries diverge significantly — if you need entity-specific indexes, EAV or per-table evolution is preferred.
  • Per-entity-type schema must evolve independently — shared columns force lockstep schema changes; if entity schemas need to diverge over time, separation is worth the cost.
  • The data is OLTP write-heavy — polymorphic tables can experience hotspots on the (tenant_id, entity_type) prefix; OLTP write hotspots are typically better served by per-entity-type tables that can scale-out independently.

Trade-offs

Property Polymorphic usage tables Per-entity-type tables
Total table count Bounded (small, fixed) Linear in entity-types × tenants
Catalog / metadata RAM Low High at scale
Operational overhead Low (few tables to migrate / monitor) High at scale
Per-entity-type indexes Composite index on (tenant, entity_type, ...) Per-table tailored
Schema evolution per entity Locked to shared shape Independent
Query complexity Always filter on entity_type Direct table reference
Hot-spot risk Higher (concentrated in fewer tables) Lower (spread across many tables)

Implementation discipline

  • Pre-determine the column shape so per-entity-type variance fits without expansion. JSON columns for "extra attributes" are an escape hatch; if used, accept they reduce indexing power.
  • Index the composite key carefully. The leading column should support the most common query predicate; for usage reports that's typically tenant_id (or a combined (tenant_id, scope_id)).
  • Consider per-tenant partitioning if the table grows large; partition on tenant_id rather than entity_type.
  • Monitor row-count growth — at "~1M records per entity type, per tenant" with multiple entity types and many tenants, the total row count can become very large; retention and archival policies become first-class concerns.

Generalises beyond usage data

The same pattern applies whenever a multi-tenant SaaS needs to store per-tenant metadata for many heterogeneous entity types:

  • Audit logs — a single audit_log table with (tenant_id, action, entity_type, entity_id, …) rather than per-entity-type audit tables.
  • Permission / ACL stores — a single permission table with polymorphic (subject_type, subject_id, resource_type, resource_id, action) rather than per-resource permission tables.
  • Notification subscriptions — a single subscription table with (user_id, source_type, source_id, channel) rather than per-source-type subscription tables.
  • Generic key-value tags on heterogeneous entities.

The discriminator: when the unifying key shape is consistent across the heterogeneous entity types, polymorphic tables collapse table-count cost dramatically without giving up indexability.

Seen in

Last updated · 542 distilled / 1,571 read