Skip to content

CONCEPT Cited by 1 source

Heap-organized table

A heap-organized table (also heap table, unordered heap) stores rows in a pile of unordered pages with no built-in physical sort order. Rows go wherever the storage engine has free space; the primary-key index (if any) is a separate data structure that points into the heap with physical row locations. This is the default storage shape for Postgres, Oracle (by default), and Microsoft SQL Server (when no clustered index is defined).

Canonical framing

Justin Gage canonicalises the taxonomic split verbatim:

An index is not magic – it's a database structure that contains pointers to specific database records. Without an index, data in a database usually gets stored as a heap, basically a pile of unordered, unsorted rows. In fact, this is actually [a setting you can toggle in Microsoft SQL Server and Azure SQL Database].

sources/2026-04-21-planetscale-how-do-database-indexes-work

The SQL Server pointer is load-bearing: heap-vs-clustered is not a vendor quirk but an engine design decision that some engines even expose as a configurable per-table setting.

Heap vs clustered-index storage

Property Heap-organized (Postgres) Clustered (InnoDB)
Row location Anywhere in the heap Inside the PK B+tree leaves
PK lookup Walk PK index → heap fetch Walk PK B+tree → row already in leaf
Secondary index leaf value Heap tuple ID (ctid) Primary key
Physical order Insertion order (mostly) Primary-key order
PK choice affects layout No Yes — sharply
Range scan by PK Random I/O across heap Sequential via leaf linked list
MySQL/Postgres example Postgres MySQL / InnoDB

Postgres is the canonical wiki instance of heap-organised tables: rows live in heap pages tagged with (page, offset) tuple IDs; every index — including the primary key — is a separate B-tree pointing at those tuple IDs.

MySQL vs Postgres primary-key storage

Gage's one-paragraph framing of the divergence:

MySQL and Postgres have a pretty major difference between them in how they handle primary key storage. In MySQL, primary keys are stored with their data, which means they effectively take up no extra storage space. In Postgres, primary keys are treated like other indexes and stored in separate data structures.

Consequences that flow from this split:

  • **InnoDB PK lookup = one walk; Postgres PK lookup = walk
  • heap fetch.** On equal-pressure workloads, InnoDB has a structural advantage on point lookups by PK.
  • PK choice is load-bearing in InnoDB, ambient in Postgres. A random UUID primary key in InnoDB scatters row writes across leaves (see concepts/uuid-primary-key-antipattern); the same key in Postgres just scatters tuple IDs but the heap accepts any insertion order uniformly.
  • MVCC/vacuum pressure differs. Postgres's heap is where MVCC dead tuples accumulate and where VACUUM reclaims space; InnoDB's equivalent churn lives inside the clustered B+tree leaves and uses the undo log for old-version retention.

SQL Server's heap toggle

SQL Server (and Azure SQL Database) exposes heap-vs-clustered as an explicit per-table choice. A table without a clustered index is heap-organised; adding a clustered index rewrites the storage layout into a B+tree keyed on that clustered-index column. Microsoft's docs call out heap tables as the default for staging / ingestion tables where write throughput matters more than read locality; clustered indexes are added downstream for analytical scans.

Caveats

  • "Pile of unordered rows" is an approximation. Postgres heaps preserve insertion order within pages until an UPDATE produces a new tuple version (which may go on a different page under MVCC) or VACUUM FULL rewrites the heap. Insertion order is a tendency, not an invariant.
  • Heap ≠ no order on disk. A heap-organised table can still be physically ordered by a clustering operation (Postgres CLUSTER command, Oracle ORGANIZATION INDEX) — but that order decays after subsequent writes. It's not a maintained invariant the way a clustered index is.
  • The heap toggle is SQL-Server-specific in the explicit sense. InnoDB is always clustered (you can't opt out without using a different engine); Postgres is always heap-organised (tablespace settings don't change this).

Seen in

  • sources/2026-04-21-planetscale-how-do-database-indexes-work — Justin Gage canonicalises the heap-vs-clustered split as a taxonomic contrast and flags MS SQL Server / Azure SQL Database as the engines that expose it as a configurable setting. Two-sentence MySQL-vs-Postgres PK-storage divergence is also canonicalised.
Last updated · 550 distilled / 1,221 read