SYSTEM Cited by 18 sources

PostgreSQL¶

What it is¶

PostgreSQL ("Postgres") is an open-source object-relational database system, first released in 1986 and continuously developed since. As of 2025 the codebase is >1M lines of C, with thousands of contributors.

Why it shows up on this wiki¶

Postgres is the query-processing substrate for systems/aurora-dsql: DSQL reuses Postgres's parser and planner and replaces replication, concurrency control, durability, storage, and session management via Postgres's public extension API. This is a canonical example of the patterns/postgres-extension-over-fork pattern — building a radically different system on top of Postgres without forking its codebase.

Extension points matter because they:

Let the extension code run in the Postgres process, but live in separate files/packages.
Allow the extension to change behavior without touching core Postgres code.
Preserve the ability to pull upstream improvements and new features.

(Source: sources/2025-05-27-allthingsdistributed-aurora-dsql-rust-journey.)

Seen in¶

sources/2026-05-07-databricks-how-lakebase-architecture-delivers-5x-faster-postgres-writes — First wiki quantification of the Full Page Write write- amplification ceiling and canonicalisation of the XLOG_FPW_CHANGE control record as a live-rollout vehicle. Databricks Engineering discloses FPW as up to 15× WAL-volume inflation on write-heavy workloads and "often the system's biggest performance bottleneck." Post canonicalises four new first-class Postgres-primitive wiki pages: concepts/postgres-full-page-write + concepts/torn-page + concepts/postgres-checkpoint (framed verbatim "Unlike a snapshot, a checkpoint is simply a milestone marker in the log") + concepts/delta-chain-replay. The architectural insight that drives the Lakebase remedy: in a stateless-compute- streaming-WAL-to-safekeeper-quorum architecture, the torn-page failure mode FPW exists to prevent structurally doesn't exist — FPW is redundant on compute. The XLOG_FPW_CHANGE Postgres control record is canonicalised as the in-log-feature-flag vehicle for rolling out breaking WAL protocol changes across a live fleet without restarts (see patterns/live-wal-protocol-switch-via-xlog-fpw-change). HammerDB TPROC-C benchmark + 56-vCPU production customer anchor the measurements: 58 KB/txn → <4 KB (94% reduction); 30 MB/s → 1 MB/s WAL rate; 5× write throughput at 32 vCPU; p99 read latency −30% to −50%.
— Canonical wiki framing of Postgres's logical-replication stream as a general-purpose real-time pub/sub substrate. Nick Van Wiggeren (PlanetScale SVP Engineering, 2026-02-27) builds a bidirectional 15 fps video chat by inserting 25–40 KB JPEGs into a video_frames BYTEA column on a $5 PlanetScale Postgres and having a ~400-line Node.js WebSocket relay (pg-relay) consume a logical replication slot on the same database, forwarding each row's JPEG bytes to the recipient's browser. Canonicalises four new Postgres-primitive pages: (1) logical replication as pub/sub — verbatim "PostgreSQL's logical replication gives us a reliable and ordered change stream. You get INSERT, UPDATE, and DELETE events for every table in the publication, delivered in commit order. This means we don't have to poll Postgres with SELECT statements from the table fast enough to render 15fps video." Chat messages, user presence, and call-state transitions flow through the same relay + publication — one substrate, heterogeneous events. (2) LISTEN/NOTIFY 8 KB payload limit as the rejection reason for Postgres's other pub/sub primitive at media altitude: a 25–40 KB JPEG would chunk into 4–5 notifications, "at which point you've built a worse TCP on top of a notification system." (3) unlogged tables as a second-rejected alternative with a load-bearing mechanism argument: "logical replication reads from the WAL. If the table doesn't write to the WAL, it doesn't appear in the replication stream. To make this work, we'd have to fall back to polling." (4) TTL-based row deletion via a 2-second cron running DELETE FROM video_frames WHERE inserted_at < NOW() - INTERVAL '5 seconds' keeping the steady-state table at ~150 rows (5–7 seconds of frames, verified 15.2 fps actual). Operational numbers: 640×360 / JPEG 0.65 / 15 fps / ~375–600 KB/s per direction. Van Wiggeren's honest caveat: "Should you build real-time video over Postgres? No! Use WebRTC!" — the post is pedagogy on logical-replication's reach, not a production recommendation. Opens the database-as- real-time-message-broker + [[patterns/websocket-relay-over- logical-replication|WebSocket-relay-over-logical-replication]] pattern pair.
— Canonical wiki disclosure of the four-category Postgres memory accounting (active cache / inactive cache / RSS / memory-mapped) and the cache-vs-process grouping that determines reclaimability. Simeon Griggs (PlanetScale, 2026-03-30) canonicalises concepts/postgres-memory-accounting as the wiki's load-bearing primitive for interpreting Postgres memory dashboards — the single "80% memory used" number can represent a healthy warm-cache state or a memory-pressure state, and the operator's first diagnostic step is decomposition. Names five structural RSS growth drivers: work_mem multiplication (per-sort-node × per-query × per-backend), catalog bloat (table-per-tenant), memory-allocator fragmentation, misbehaving extensions, cached plans / prepared statements. Discloses the 1000× RAM vs NVMe latency ratio as justification for aggressive page-cache utilisation. Formalises patterns/triangulate-rss-growth-from-metrics — the four-signal workflow for per-query RSS attribution given that "RSS is a per-process metric, not a per-query metric." Companion piece to Dicken's sibling 2026-04-21 PgBouncer post: Dicken documents connection-count reduction, Griggs names the downstream RSS reduction as the memory consequence.
— Canonical reference for the Postgres hook extensibility surface. Patrick Reynolds (PlanetScale, 2026-03-23) names the 55 hooks available as of mainline-2026 (regex-counted via /^\S.*\w_hook = NULL/ over postgres/postgres), and gives the architectural framing: "A hook is a function that runs before, after, or instead of existing Postgres functionality. Want to observe or replace the planner? There's a hook for that. Want to examine queries as they execute? There are three hooks for that." Canonicalises concepts/postgres-hook as a first-class wiki concept with named examples (ExecutorRun, ProcessUtility, planner hook). Third canonical instance of the patterns/postgres-extension-over-fork pattern (after Aurora DSQL and Zalando PostGIS): Traffic Control ships as a Postgres extension without touching core, composing with pginsights via patterns/hook-colocation-for-zero-overhead.
— Zalando's 12-golden-signals methodology for AWS RDS Postgres health. Anchors five Postgres-altitude observability signals via AWS Performance Insights metric paths: db.Cache.blks_hit / (blks_hit + blk_read) (cache hit ratio, <80% = RAM pressure), db.IO.blk_read_time (block read latency, >10ms = SLO impact), db.Concurrency.deadlocks, db.Transactions.xact_commit, db.SQL.tup_fetched / db.SQL.tup_returned (SQL efficiency — low ratio signals query-design malpractice, not DB performance). Canonical wiki instance of the Postgres-engine-internal observability surface as consumed by a fleet-health CLI (systems/rds-health). See concepts/golden-signals-rds, concepts/cache-hit-ratio-memory-pressure, concepts/sql-efficiency-ratio. — Zalando's Postgres Operator team canonicalises the two foundational reasons for connection pooling in Postgres: (1) process- per-connection memory + context-switch overhead, and (2) GetSnapshotData O(N) complexity in the connection count, linking to Andres Freund's 2020 mailing-list thread that led to the Postgres 14 scalability fix. First wiki Seen-in specifically for the Postgres-on-Kubernetes deployment altitude; companion operator page: zalando postgres operator.
sources/2025-09-24-planetscale-processes-and-threads — Ben Dicken's 2025-09-24 pedagogical article canonicalises process-per- connection as Postgres's architectural choice, in explicit contrast to MySQL's [[patterns/thread-per- connection-database|thread-per-connection]] model. Verbatim: "Postgres is implemented with a process-per-connection architecture. Each time a client makes a connection, a new Postgres process is created on the server's operating system. There is a single 'main' process (PostMaster) that manages Postgres operations, and all new connections create a new Process that coordinates with PostMaster." The OS-substrate costs canonicalised: "Processes are heavy: there is memory overhead and a time overhead for managing them" — each Postgres backend carries ~5–10 MB steady-state per-connection memory plus the ~5 μs process-switch tax (vs ~1 μs thread-switch tax on MySQL). Canonicalises connection pooling as the universal mitigation at the OS-fundamentals altitude: "Connection poolers sit between clients and the database. All connections from the client are made to the pooler, which is designed to be able to handle thousands at a time. It maintains its own pool of direct connections to the database, typically between 5 and 50." First canonical wiki disclosure of the "5–50 direct connections fronting thousands of clients" pooler- sizing datum at the Postgres-substrate altitude. Complements

(PgBouncer as PlanetScale's pooling tier on top of Postgres) + [[sources/2026-04-21-figma-how-figmas-databases-team-lived- to-tell-the-scale]] (Figma's custom DBProxy tier built for the same per-backend-memory reason): Dicken canonicalises the OS-level why (process-per-connection cost) the ecosystem's pooler adoption mitigates.

— first wiki instance of Postgres as a production geospatial tile-server substrate. Zalando's Postgres Operator team (Nikolai Averkiev, 2021-12-01) documents a full maps stack where Postgres + PostGIS replaces what used to be a bespoke tile-pipeline middleware service — the pattern canonicalised as patterns/database-as-tile-server-middleware-replacement. Tile production lives inside the database via the PostGIS ST_AsMVT / ST_AsMVTGeom / ST_TileEnvelope / ST_HexagonGrid function family; a thin HTTP shim (systems/pg-tileserv) translates {z}/{x}/{y} URL requests into database queries and returns MVT (Mapbox Vector Tile) PBF blobs. Canonicalises Postgres's SQL-function tile layer capability: a PL/SQL function population_hexagons(z, x, y, step) dynamically generates a hex grid over the tile envelope, LATERAL-joins to a Eurostat 1 km² population grid, and returns an MVT-encoded heatmap whose hex size scales with zoom level. Deployment context: 2-replica Postgres-14 cluster managed by Zalando Postgres Operator, running the Spilo image with PostGIS pre-compiled, installed declaratively into a named geo schema via the CR's preparedDatabases.map_db.extensions.postgis: geo field. Also canonicalises patterns/postgres-extension-over-fork's first long-running third-party instance: PostGIS as an extension (since 2001), not a fork.
sources/2023-11-08-zalando-patching-the-postgresql-jdbc-driver — canonical wiki disclosure of Postgres's logical- replication KeepAlive protocol semantics and the WAL- reclamation coupling to subscriber acknowledgement. Zalando's 2023-11-08 post documents a runaway-WAL-growth failure mode in which low-traffic replication slots on busy shared-WAL Postgres primaries pin WAL indefinitely because subscribers never advance confirmed_flush_lsn in quiet periods. Canonical verbatim structural statement: "The WAL exists at the server level and we cannot break it down into a table-level or schema- level concept. All changes for all tables in all schemas in all databases on that server go into the same WAL." The post's contribution is an upstream fix to pgjdbc (PR #2941, shipped in pgjdbc 42.7.0) that has the driver respond to Postgres KeepAlive messages with an ack of the server- reported LSN when all seen Replication messages are flushed — concepts/keepalive-message-lsn-advancement. Canonical wiki introduction of the KeepAlive-message wire-protocol frame and its LSN-carrying role. Deployment context: Zalando's Postgres-sourced event-streaming platform running "hundreds" of Debezium Engine streams.
sources/2025-09-24-zalando-dead-ends-or-data-goldmines-ai-powered-postmortem-analysis — two new Postgres failure-mode disclosures surfaced by Zalando's postmortem-analysis pipeline, both inside an SRE-corpus mining context rather than a per-incident postmortem: (1) Postgres 12 AUTOVACUUM LAUNCHER race condition — "The incident was caused by a crash in the AUTOVACUUM LAUNCHER process due to a race condition, which in turn terminated all connections in the PostgreSQL database pool. This crash was attributed to [a known bug in PostgreSQL 12]." (2) Postgres 16 → 17 major-version upgrade triggering a logical replication memory-leak bug — "A major version upgrade of the Postgres database from version 16 to 17, which triggered a bug in Postgres' logical replication. It occurs when DDL commands are executed in parallel with a large number of transactions, leading to a memory leak." Both are disclosed as the only two technology-intrinsic datastore incidents in 5 years across Zalando's Postgres fleet — canonical wiki datum that Zalando's Postgres datastore is extremely mature and near-failure-free at the engine tier, and that logical replication (already canonicalised in the 2023-11 pgjdbc axis) remains the dominant engine-level failure surface even after engine-fleet maturation. Extends the Zalando Postgres axis on the wiki: 2020-06 (pgbouncer) + 2021-12 (PostGIS) + 2023-11 (pgjdbc / logical replication) + 2024-02 (12 Golden Signals) + 2025-09 (fleet-mining post- mortem analysis surfaces two more bug classes).
— Adnan Kukic's 2023-02-09 Postgres → MySQL audit. Positions Postgres as the source side of a cross-engine migration taxonomy: five Postgres-side features the target MySQL audit flags as source-specific — unconstrained VARCHAR, human-readable POINT, native UUID + gen_random_uuid() (concepts/uuid-string-representation-mysql), JSONB, TRUNCATE … CASCADE / RESTART IDENTITY and transactional TRUNCATE, procedural-language pluggability, and case-sensitive quoted identifiers. Also names the Postgres extension ecosystem as a migration-time audit axis with no MySQL equivalent. Worked instance of patterns/cross-engine-database-migration-audit. The post also surfaces Postgres's distinctive DDL features the comparison table names out loud: materialized views, INSTEAD OF triggers, no-unsigned-integer (the MySQL side has it, Postgres does not).
sources/2026-04-21-planetscale-graceful-degradation-in-postgres — Postgres-on-PlanetScale as the graceful-degradation substrate under traffic spike. Ben Dicken reframes Traffic Control (a Postgres extension feature on PlanetScale Postgres) as the user-facing degradation lever: classify queries by critical / important / best-effort priority via SQLCommenter tags, apply per-tier resource budgets, shed the lowest-priority tier live under spike. Canonicalises [PGINSIGHTS] Traffic Control: as an in-band warning channel returned inside the Postgres query response — a managed-Postgres diagnostic surface delivered at the extension tier. Canonicalises the warn → enforce budget- tuning lifecycle. Extends concepts/graceful-degradation from a Netflix-centric "cache / fallback / default" framing to a database-tier "shed lowest-priority via budget-config change" framing — the mechanism has no upstream Postgres equivalent.
— Canonical new wiki instance of a Postgres extension as the observability substrate for application-layer query tagging. PlanetScale's Insights extension parses SQLCommenter- style tags from query SQL (no-op at the execution layer, observable to the extension), emits them over a dual- stream Kafka pipeline (patterns/dual-stream-telemetry-pipeline) to ClickHouse, and adds three auto-populated tags at the extension level: application_name (set by the Postgres driver), username (Postgres user), remote_address (remote IP). Canonicalises the Postgres extension API as a viable place to implement per-query-pattern aggregate statistics indexed by SQLCommenter tag — a capability upstream pg_stat_statements does not provide. Companion to sources/2026-04-21-planetscale-ai-powered-postgres-index-suggestions (same extension, different capability: AI index suggestions) and (Traffic Control — SQLCommenter-tag-based workload-class enforcement). Three Insights capabilities compose on one extension substrate.
sources/2026-04-21-planetscale-benchmarking-postgres — canonical wiki instance of vendor-published Postgres benchmarking methodology. PlanetScale discloses its Telescope harness and commits to three public benchmarks (latency, TPCC at 500 GB via Percona sysbench-tpcc, OLTP read-only at 300 GB via sysbench) against Amazon Aurora, Google AlloyDB, CrunchyData, Supabase, TigerData, and Neon — "all of the products we compared with use network-attached storage for the underlying drives." Reference target is an i8g M-320 on Metal with primary + 2 replicas across 3 AZs. Canonical wiki instance of the new concepts/price-performance-ratio concept (the fourth benchmark question: "How much does it cost to achieve some bar of performance?") and the new patterns/reproducible-benchmark-publication pattern (reproduction instructions + benchmarks@planetscale.com feedback address). Methodology-voice acknowledgment of bias limits: "no single benchmark can capture the performance characteristics of all such databases… You cannot look at a benchmark and know for sure that your workload will perform the same given all other factors are the same." Companion to the 2025-10-14 Postgres 17 vs 18 benchmark which this methodology underlies.
sources/2025-05-27-allthingsdistributed-aurora-dsql-rust-journey — Aurora DSQL uses Postgres via extensions rather than forking it; extensions written in Rust for concepts/memory-safety.
sources/2026-04-20-databricks-take-control-customer-managed-keys-for-lakebase-postgres — systems/lakebase is Databricks' Postgres-compatible serverless OLTP service descended from Neon; externalises page and WAL storage into systems/pageserver-safekeeper for compute/storage separation. Different structural answer to DSQL's "extend, don't fork" idiom for the same "make Postgres scale-to-zero" problem.
sources/2026-03-23-datadog-debugging-postgres-upsert-wal — Datadog diagnoses a 2× IOPS / 4× WAL-sync regression on a host_last_ingested upsert table that was deliberately engineered for HOT updates (dedicated narrow table, last_ingested unindexed, fillfactor=80). Using the Postgres 15+ pg_walinspect extension plus an lldb breakpoint on WALInsertLockAcquire, they prove that INSERT ... ON CONFLICT DO UPDATE locks the conflicting row before evaluating the WHERE clause — the lock mutates tuple metadata with a transaction ID, the xid forces a Transaction COMMIT WAL record, and the COMMIT forces an fsync. At the planned 25,000 upserts/sec scale this would exceed the cluster's single-writer fsync budget (~1,000 8-KiB fsyncs/sec on gp3 EBS per pg_test_fsync). Fix: rewrite as a patterns/cte-emulated-upsert (WITH insert_attempt AS (INSERT ... ON CONFLICT DO NOTHING RETURNING ...) UPDATE ... WHERE ... AND NOT EXISTS (SELECT FROM insert_attempt)) — no implicit lock, the common "row exists and is fresh" path emits zero WAL records. Trade-off: a small concurrent-delete race between the insert attempt and the update, accepted because host-liveness tracking is inherently imprecise. Canonical worked example of concepts/wal-write-ahead-logging as both the cost ceiling and the observability surface.
sources/2025-05-03-aws-postgresql-transaction-visibility-read-replicas — AWS's response to Jepsen's 2025-04-29 RDS-Postgres Multi-AZ analysis disclosing that community Postgres's visibility order can diverge from its WAL commit order (the commit path writes the WAL record then asynchronously removes the xid from the in-memory ProcArray), admitting the Long Fork anomaly — a violation of Snapshot Isolation's atomic-visibility property in which two readers on different nodes can observe concurrent non-conflicting transactions in different orders. Affects all Postgres isolation levels (Read Committed / Repeatable Read / Serializable) because all three take snapshots against ProcArray. Absent in Single-AZ Postgres, systems/aurora-limitless, and systems/aurora-dsql (both of which replace ProcArray-based visibility with time-based MVCC via Postgres-extension surgery — see patterns/postgres-extension-over-fork). Known on pgsql-hackers since 2013. Proposed upstream fix is Commit Sequence Numbers — stamp each commit with a monotonic CSN so snapshot-by-CSN-watermark makes visibility order = commit order; multi-patch effort, PGConf.EU 2024 talk, AWS PostgreSQL Contributors Team participating. See concepts/visibility-order-vs-commit-order for the generalized framing. Also surfaces the ProcArray-scan CPU cost — measurable fraction of CPU at thousands of connections on large Postgres servers in read-heavy workloads, so the fix is both a correctness and a perf win. Load-bearing against five classes of enterprise capability: distributed-SQL consistent snapshots, read-write splitting, snapshot-then-replay data sync, PITR-to-LSN, and tuple-xid replacement by logical/clock-based commit time.
sources/2026-04-21-figma-how-figmas-databases-team-lived-to-tell-the-scale — Figma's in-house concepts/horizontal-sharding effort on top of RDS Postgres (2022→). Post names three production Postgres ceilings driving the horizontal-sharding decision: (1) vacuums at several-TB / billions-of-rows tables producing user-visible reliability impact (vacuums are essential for avoiding transaction-ID exhaustion — the fundamental cost of MVCC at scale); (2) per-table write rates approaching RDS's per-instance IOPS ceiling; (3) CPU on the hottest vertical partition. Demonstrates that Postgres views are a viable logical-shard representation against a still-unsharded physical instance (patterns/sharded-views-over-unsharded-db) with validated <10% worst-case overhead vs direct-table access — the primitive that lets horizontal-shard semantics be feature-flag rolled out + rolled back in seconds before the riskier physical 1→N failover. Shipped first horizontally-sharded table September 2023; 10 seconds partial primary availability, no replica impact. DBProxy is the Go router + query engine that implements the sharded-query subset. An alternative to systems/aurora-limitless and systems/aurora-dsql's "extend Postgres via extensions to make it scale" approach — Figma instead extends Postgres via views + an external router, keeping community Postgres unmodified. Canonical companion to patterns/postgres-extension-over-fork on the same underlying substrate question.
sources/2026-04-21-figma-keeping-it-100x-with-real-time-data-at-scale — Figma's LiveGraph real-time data-fetching service taps the Postgres WAL logical replication stream as its CDC source — the same stream that keeps replicas up-to-date. LiveGraph's old one-server-in-memory-mutation-based-cache design required a globally-ordered replication stream (reasonable for a single primary Postgres); under vertical partitioning the order broke and a stopgap artificially combined per-shard streams, and the design couldn't survive horizontal sharding via DBProxy. LiveGraph 100x decouples by sharding the stateless invalidator the same way as the physical DBs and letting each shard's invalidator tail its own WAL stream independently — a canonical example of how moving from a single-primary to a sharded-Postgres world forces downstream services to drop global-order assumptions baked into the WAL-consumer contract.
sources/2025-10-12-mongodb-cars24-improves-search-for-300-million-users-with-atlas — named explicitly as Cars24's legacy RDBMS "for managing and searching data" before the migration to MongoDB Atlas. Cars24's shape was the canonical Postgres + separate bolt-on search engine + sync pipeline topology the synchronization-tax concept names; patterns/consolidate-database-and-search captures the remediation Cars24 chose. Not a knock on Postgres itself — Postgres has full-text search ([tsvector] / GIN) and pgvector that implement the same consolidation in-place; Cars24 chose vendor-swap rather than in-place consolidation.
sources/2025-11-04-datadog-replication-redefined-multi-tenant-cdc-platform — Postgres as the CDC source of a managed multi-tenant replication platform. The source-side 7-step runbook (wal_level=logical, replication-permissioned users, publications + slots, Debezium instances, Kafka topics, heartbeat tables for slot LSN advancement, sink connectors) is the canonical wiki operational reference for Postgres logical replication as a CDC source. Motivating incident: a Metrics Summary page joining 82K × 817K rows on a shared Postgres hit p90 ~7 s, resolved by rerouting search + faceted filtering to a dedicated search engine populated by Postgres → Debezium → Kafka → sink-connector replication (lag ~500 ms, page-load ~30 s → ~1 s). Canonical instance of the split answer to the database-and-search problem (patterns/debezium-kafka-connect-cdc-pipeline) vs Cars24's consolidate answer via Atlas Search.
sources/2026-04-16-cloudflare-deploy-postgres-and-mysql-databases-with-planetscale-workers — Postgres as the target wire protocol for the Cloudflare × PlanetScale integration. Workers open a standard pg client against env.DATABASE.connectionString (the binding exposed by systems/hyperdrive), which in turn pools + caches against a PlanetScale Postgres origin. No Postgres-specific adaptation in Workers — the binding preserves the Postgres wire protocol end-to-end, so any Postgres client / ORM works unchanged. Motivates the edge-compute-against-central-SQL latency hazard captured in concepts/edge-to-origin-database-latency and resolved via patterns/explicit-placement-hint. Ancillary: pgvector called out explicitly as the AI-workload reason Postgres specifically matters for Workers apps.
— Postgres as the engine under a new managed-vendor entrant. PlanetScale's launch announcement for PlanetScale for Postgres (private preview 2025-07-01, later GA) runs real community Postgres v17 under a proprietary operator — not a fork, not a Postgres-extension rewrite like Aurora DSQL, not a compute-storage-separation rewrite like Lakebase / Neon. The stock-Postgres + proprietary-proxy + direct-attached-NVMe shape is PlanetScale's structural answer to "how do you scale managed Postgres beyond vanilla RDS?" Discloses online imports from any Postgres v13+ + automatic zero-downtime Postgres version updates — first canonical wiki instance of a vendor-boundary Postgres migration capability as an advertised feature. PgBouncer named as the pooling engine inside the proprietary proxy layer (first canonical wiki citation of PgBouncer). Rejects Postgres's industry default of network-attached block storage — Metal extends to Postgres: "PlanetScale Metal's locally-attached NVMe SSD drives fundamentally change the performance/cost ratio for hosting relational databases in the cloud. We're excited to bring this performance to Postgres." Names Neki as the upcoming from-first-principles Postgres sharding layer (rejecting Vitess: "Vitess' achievements are enabled by leveraging MySQL's strengths and engineering around its weaknesses") — canonicalised as patterns/architect-sharding-from-first-principles-per-engine.
sources/2025-07-08-planetscale-caching — Ben Dicken canonicalises Postgres's two-layer caching strategy as the exemplar of database-tier caching that leans on the OS: its own shared_buffers pool for frequently-accessed pages (typically sized to ~25% of RAM) + the kernel's filesystem page cache for everything else. Canonical double-buffering framing (same disk page present in both tiers) + canonical 25%-of-RAM sizing rule. See concepts/postgres-shared-buffers-double-buffering. Positions Postgres against MySQL's single-layer InnoDB-buffer-pool design (commonly run with O_DIRECT bypassing the OS page cache) as two valid philosophical approaches to the same engineering problem. Also flags ACID-aware caching as the reason database buffer pools are structurally harder than generic LRU caches — "both databases have to take careful measures to ensure these pages contain accurate information and metadata as the data evolves".
— Postgres 18 async-I/O benchmarked against Postgres 17. Ben Dicken's 96-run sysbench oltp_read_only sweep across Postgres 17 and Postgres 18's three io_method modes (sync, worker, io_uring) on four EC2 configurations (3× r7i.2xlarge on gp3 3k / gp3 10k / io2 16k + one i7i.2xlarge with local NVMe). Canonicalises Postgres 18's io_method feature — the single most-discussed change in the September 2025 release. Key counter-intuitive finding: io_uring does not dominate. On network-attached storage at low concurrency, Postgres 18 sync and worker beat Postgres 17 and Postgres 18 io_uring; io_uring only slightly beats the other options at 50 connections + --range_size=10000 on the local-NVMe i7i. Explanation cites Tomas Vondra's companion tuning post: (1) index scans don't yet use AIO — B-tree-dominated OLTP doesn't benefit; (2) checksums + memcpy remain synchronous — CPU cap regardless of I/O speed; (3) workers distributes I/O across processes where io_uring keeps everything in one. Canonicalises patterns/background-worker-pool-for-async-io as the deliberate Postgres 18 default and concepts/async-io-concurrency-threshold as the observed phenomenon. Local NVMe wins every scenario — empirical backing for the 2025-03-13 IO-devices post's [[concepts/network-attached-storage-latency-penalty|5× EBS latency hop]] argument. Price-performance winner is the i7i.2xlarge at $551.15/mo (1.8 TB, no IOPS cap) versus r7i + gp3-3k at $442.32, r7i + gp3-10k at $492.32, and r7i + io2-16k at $1,513.82. Positions PlanetScale Metal structurally as the architectural winner on the shipped Postgres 18 stack. Caveats explicitly recorded: read-only workload, fixed tuning, no p99 reporting, oltp_read_only is not a production OLTP proxy.
— PlanetScale's Simeon Griggs reproduces Brandur Leach's 2015 Postgres-queue degradation benchmark on Postgres 18 and shows the underlying MVCC-horizon failure mode is unchanged: FOR UPDATE SKIP LOCKED + B-tree bottom-up deletion lift the floor (modern Postgres survives the 2015 test bench much longer than 15 minutes) but do not fix the ceiling (dead tuples accumulate linearly when long or continuously-overlapping transactions pin the horizon). Establishes canonical wiki treatment of the other half of Postgres MVCC beyond the HOT-updates half: dead tuples + VACUUM, autovacuum, the MVCC horizon, SKIP LOCKED, and queue tables as the canonical failure-sensitive workload shape. Key claim upstream Postgres cannot fix: timeouts (statement_timeout PG 7.3+, idle_in_transaction_session_timeout PG 9.6+, transaction_timeout PG 17+) target single-query duration and cannot limit workload-class concurrency — three continuously overlapping 40-second analytics queries pin the horizon indefinitely with no timeout violations. Structural answer: a workload-class resource budget — PlanetScale's Traffic Control is the canonical wiki instance. Measured: 800 jobs/sec + 3 overlapping analytics, Traffic Control disabled = 155k backlog / 300+ ms lock time / 383k dead tuples / death spiral; enabled (action=analytics cap = 1 worker) = 0 backlog / 2 ms lock time / dead tuples cycling 0–23k / 15 analytics queries completed / completely stable.
sources/2026-04-21-planetscale-ai-powered-postgres-index-suggestions — PlanetScale's Rafer Hazen canonicalises two Postgres-specific primitives as LLM-safety machinery for database tuning: HypoPG (the community extension that registers hypothetical indexes the planner can use in EXPLAIN without any on-disk or write-path cost) and the planner's EXPLAIN cost estimate (used as the ground-truth oracle that decides whether a given candidate index is worth surfacing). Insights' per-query-pattern workload telemetry (rows read, rows returned, runtime share, frequency) narrows the candidate query set before the LLM sees it (see concepts/index-candidate-filtering); the LLM proposes CREATE INDEX statements over that narrowed workload + the minimum relevant schema; HypoPG + EXPLAIN filters the output (see concepts/hypothetical-index-evaluation). First canonical wiki instance of the pattern pair patterns/workload-aware-llm-prompting + patterns/llm-plus-planner-validation applied to Postgres DDL suggestions, and first wiki citation of HypoPG. Generalises: the same planner-oracle validation shape works for any LLM-generated database change where a hypothetical-evaluation primitive exists for that change type.
sources/2026-04-21-planetscale-postgres-high-availability-with-cdc — Canonical wiki disclosure of Postgres's HA-vs-CDC design asymmetry vs MySQL. Sam Lambert (PlanetScale CEO, 2025-09-12) names the logical replication slot as the primary-local catalog object that pins WAL via restart_lsn + confirmed_flush_lsn and must be coordinated across the cluster at failover time. Postgres 17 failover slots serialise slot metadata into WAL so standbys can mirror it, but preserve an eligibility gate: the subscriber must have been observed advancing the slot while the standby was following. The gate exists to preserve exactly-once CDC semantics at the expense of HA flexibility. Three canonical failure scenarios: (1) quiet- period CDC subscriber leaves all standby slots ineligible; forced failover breaks the stream. (2) Replica replacement via pg_basebackup — new standbys start at a conservative LSN and remain ineligible until the next subscriber poll (6 hours in the worked example). (3) Dormant physical standby with a physical slot pins restart_lsn indefinitely and can fill the primary's WAL volume. Canonical wiki concept pages created by this ingest: concepts/postgres-wal-level-logical (the wal_level setting that enables logical decoding + the primary-local- catalog-vs-WAL-stream substrate asymmetry), concepts/postgres-logical-replication-slot, concepts/postgres-failover-slot, and concepts/ha-cdc-coupling (the operational coupling between HA actions and CDC subscriber behaviour). Canonical new pattern: patterns/action-log-vs-state-log-replication — Postgres's state-log-with-primary-local-progress shape vs MySQL's action-log-with-consumer-local-GTID shape as two ends of a design-space choice on where consumer progress lives. Complements Datadog's 2025-11-04 managed-CDC-platform post (which runs Postgres logical replication at scale with first-party subscribers) — that post is about running Postgres CDC well; this post is about the structural trade- off when subscribers are outside operator control.