
PLANETSCALE 2026-02-27


PlanetScale — Video Conferencing with Postgres

Summary

Nick Van Wiggeren (PlanetScale SVP Engineering) builds a bidirectional 15 fps video chat on top of a $5 PlanetScale PostgreSQL instance as a hack response to a SpacetimeDB tweet claiming "the world's first video call over a database." The architecture is a SvelteKit frontend + a small Node.js WebSocket relay (pg-relay, ~400 lines of TypeScript) + PostgreSQL. The browser captures a camera frame, encodes it as JPEG, and sends it as a binary WebSocket message to pg-relay, which INSERTs it into a video_frames table (jpeg BYTEA column). pg-relay simultaneously runs a logical-replication consumer on the same database; it sees the new row appear in the replication stream, checks the to_id, and forwards the raw JPEG bytes over WebSocket to the recipient. Audio flows through a parallel audio_frames table the same way.

The load-bearing architectural claim is that PostgreSQL's logical replication is itself a reliable, ordered change stream: a database with logical replication is already a pub/sub system, so you don't need a separate broker for real-time fanout of chat, user presence, call state transitions, or video frames. The post also canonicalises two rejected alternatives: LISTEN/NOTIFY with its 8 KB payload limit (which would force chunking a 25–40 KB JPEG into 4–5 notifications and reassembling them) and unlogged tables (faster inserts but no WAL → no logical replication → fall back to SELECT polling).

The demo runs at 640×360 / JPEG quality 0.65 / 15 fps / ~375–600 KB/s per direction, and keeps only ~5 s of frames via a cron DELETE FROM … WHERE inserted_at < NOW() - INTERVAL '5 seconds'. "Should you build real-time video over Postgres? No! Use WebRTC!" is the closing verdict, but the post deliberately demonstrates how far logical replication reaches as a general-purpose real-time substrate.
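A minimal sketch of the write half of that pipeline, assuming the ws and pg npm packages and an invented binary framing (length-prefixed JSON header followed by raw JPEG bytes); the post doesn't publish pg-relay's actual wire format:

// Sketch of pg-relay's write path: a binary WebSocket message carrying a JPEG
// frame gets INSERTed into video_frames. The framing here is an assumption.
import { WebSocketServer } from "ws";
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const wss = new WebSocketServer({ port: 8080 });

wss.on("connection", (socket) => {
  socket.on("message", async (raw, isBinary) => {
    if (!isBinary) return;
    const data = raw as Buffer;

    // Invented framing: 4-byte big-endian header length, JSON header, raw JPEG.
    const headerLen = data.readUInt32BE(0);
    const header = JSON.parse(data.subarray(4, 4 + headerLen).toString("utf8"));
    const jpeg = data.subarray(4 + headerLen);

    // node-postgres maps a Buffer parameter to BYTEA, so the JPEG is stored
    // as opaque bytes with no transcoding.
    await pool.query(
      `INSERT INTO video_frames (session_id, from_id, to_id, seq, width, height, jpeg)
       VALUES ($1, $2, $3, $4, $5, $6, $7)`,
      [header.sessionId, header.fromId, header.toId, header.seq, header.width, header.height, jpeg]
    );
  });
});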

Key takeaways

  1. Logical replication is a reliable, ordered change stream you already have. Verbatim: "PostgreSQL's logical replication gives us a reliable and ordered change stream. You get INSERT, UPDATE, and DELETE events for every table in the publication, delivered in commit order. This means we don't have to poll Postgres with SELECT statements from the table fast enough to render 15fps video." Canonical wiki statement of logical replication as a pub/sub substrate — same primitive that Debezium uses for CDC is reused here as the fan-out substrate for real-time application events (chat, presence, call state and video frames). (Source: sources/2026-02-27-planetscale-video-conferencing-with-postgres)
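A sketch of the consumer side this describes, assuming the pg-logical-replication npm package, a publication named video_chat, and a slot named pg_relay (none of these names come from the post; the real relay's consumer code isn't shown):

// Subscribe to a pgoutput logical-replication slot, watch for inserts into
// video_frames, and forward the JPEG bytes to the recipient's WebSocket.
// Assumes the publication and slot were created ahead of time, e.g.:
//   CREATE PUBLICATION video_chat FOR TABLE video_frames, audio_frames;
//   SELECT pg_create_logical_replication_slot('pg_relay', 'pgoutput');
import { LogicalReplicationService, PgoutputPlugin } from "pg-logical-replication";
import type { WebSocket } from "ws";

const clients = new Map<string, WebSocket>(); // user id -> socket, maintained elsewhere

const service = new LogicalReplicationService({ connectionString: process.env.DATABASE_URL });
const plugin = new PgoutputPlugin({ protoVersion: 1, publicationNames: ["video_chat"] });

service.on("data", (_lsn: string, log: any) => {
  // Events arrive in commit order; only inserts into video_frames matter here.
  if (log.tag !== "insert" || log.relation?.name !== "video_frames") return;
  const row = log.new;
  const recipient = clients.get(row.to_id);
  // How the bytea column is decoded depends on the consumer library; assume a Buffer.
  recipient?.send(row.jpeg);
});

service.subscribe(plugin, "pg_relay");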

  2. Multiple consumer types share one substrate. Van Wiggeren: "This means that the same mechanism that pushes video frames to call participants also pushes chat messages, user presence changes, and call state transitions. When someone sends a chat message, it gets INSERTed, appears in the replication stream, and gets forwarded to every connected client. When a user disconnects, their row gets DELETEd, and everyone sees them go offline." Canonicalises the downstream property of patterns/database-as-realtime-message-broker: heterogeneous real-time events unified on one transport. (Source: sources/2026-02-27-planetscale-video-conferencing-with-postgres)
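To make "same mechanism" concrete, a sketch extending the consumer above; the chat_messages and participants tables, their columns, and the broadcast helper are invented for illustration (the post doesn't show its chat or presence schema):

// Same replication stream, heterogeneous events: dispatch on table name.
// `service` is the LogicalReplicationService from the sketch above.
declare function broadcast(event: object): void;

service.on("data", (_lsn: string, log: any) => {
  const table = log.relation?.name;
  if (log.tag === "insert" && table === "chat_messages") {
    // Chat: INSERT -> replication stream -> forwarded to every connected client.
    broadcast({ type: "chat", from: log.new.from_id, body: log.new.body });
  } else if (log.tag === "delete" && table === "participants") {
    // Presence: DELETE -> everyone sees the user go offline.
    // (The deleted row's columns are only fully visible with REPLICA IDENTITY FULL.)
    broadcast({ type: "presence", userId: log.old?.id ?? log.key?.id, online: false });
  }
});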

  3. LISTEN/NOTIFY's 8 KB payload limit rules it out for media. Van Wiggeren considered it first: "LISTEN/NOTIFY was my first idea. Postgres has a built-in pub/sub mechanism — NOTIFY on a channel, LISTEN on the other end, messages arrive in real time." Then the gotcha: "The problem is an enforced 8KB payload limit. A video frame at 640x360 is 25-40KB. We'd have to chunk every frame into 4-5 separate notifications, reassemble them on the other end, handle ordering, handle dropped chunks — at which point you've built a worse TCP on top of a notification system." Canonicalises concepts/listen-notify-payload-limit as the reason LISTEN/NOTIFY fails at media altitude. (Source: sources/2026-02-27-planetscale-video-conferencing-with-postgres)
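For scale: a ~30 KB frame against an 8 KB ceiling is already 4-5 chunks, and NOTIFY payloads are text, so the JPEG would also need base64 encoding (roughly a third more bytes). A sketch of what the rejected path would have required, assuming node-postgres; the channel name, envelope fields, and chunk size are invented:

// The rejected LISTEN/NOTIFY path: split each frame into <8KB notifications,
// reassemble on the other side, and handle ordering and loss by hand.
import { Client } from "pg";

const CHUNK = 7000; // stay under the ~8KB NOTIFY payload limit, leaving room for the envelope

async function notifyFrame(client: Client, frameId: number, jpeg: Buffer) {
  const b64 = jpeg.toString("base64");
  const total = Math.ceil(b64.length / CHUNK);
  for (let i = 0; i < total; i++) {
    const payload = JSON.stringify({ frameId, i, total, data: b64.slice(i * CHUNK, (i + 1) * CHUNK) });
    await client.query("SELECT pg_notify('video', $1)", [payload]);
  }
}

// Receiver: reassemble chunks per frame; out-of-order and missing pieces are our problem now.
const pending = new Map<number, { total: number; parts: string[] }>();

async function listen(client: Client, onFrame: (jpeg: Buffer) => void) {
  await client.query("LISTEN video");
  client.on("notification", (msg) => {
    const { frameId, i, total, data } = JSON.parse(msg.payload ?? "{}");
    const entry = pending.get(frameId) ?? { total, parts: [] };
    entry.parts[i] = data;
    pending.set(frameId, entry);
    if (entry.parts.filter(Boolean).length === entry.total) {
      pending.delete(frameId);
      onFrame(Buffer.from(entry.parts.join(""), "base64"));
    }
  });
}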

  4. Unlogged tables skip WAL → skip logical replication → back to polling. Van Wiggeren's second rejected alternative: "Unlogged tables go the other direction. Instead of changing how we get data out of Postgres, they change how data goes in. Unlogged tables skip the WAL entirely — no write-ahead logging, no fsync, no crash recovery. Inserts are faster because Postgres isn't making durability guarantees about video frames. I didn't like that because logical replication reads from the WAL. If the table doesn't write to the WAL, it doesn't appear in the replication stream. To make this work, we'd have to fall back to polling — SELECT * FROM video_frames WHERE seq > $1 in a loop. This might have worked fine, maybe better — but something about rendering video from a polling loop of SELECT * didn't feel good." Canonical trade-off articulation for concepts/unlogged-table-postgres: you get write throughput but you lose the logical-replication fan-out path because the WAL is where the stream lives. (Source: sources/2026-02-27-planetscale-video-conferencing-with-postgres)
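A sketch of the polling fallback that the unlogged-table route forces, assuming node-postgres; the WHERE seq > $1 predicate is the one quoted above, and the 30 ms cadence is an illustrative choice to stay under the 15 fps frame interval:

// No WAL means no replication stream, so the relay has to poll with a moving cursor.
import { Pool } from "pg";

async function pollFrames(pool: Pool, onRow: (row: any) => void) {
  let lastSeq = 0;
  while (true) {
    const { rows } = await pool.query(
      "SELECT * FROM video_frames WHERE seq > $1 ORDER BY seq",
      [lastSeq]
    );
    for (const row of rows) {
      lastSeq = Math.max(lastSeq, row.seq);
      onRow(row);
    }
    await new Promise((r) => setTimeout(r, 30)); // tighter than one frame interval at 15fps
  }
}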

  5. Binary media lives in BYTEA with modest egress. Schema verbatim:

CREATE TABLE video_frames (
    id          BIGSERIAL PRIMARY KEY,
    session_id  UUID      NOT NULL,
    from_id     TEXT      NOT NULL,
    to_id       TEXT      NOT NULL,
    seq         INT       NOT NULL,
    width       SMALLINT  NOT NULL,
    height      SMALLINT  NOT NULL,
    jpeg        BYTEA     NOT NULL,
    inserted_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

"There's nothing special about this table. It's just rows with a JPEG in a BYTEA column. This incurs a modest amount of egress, but nothing a database like this can't handle." The JPEG travels as bytes through Postgres unchanged — no codec, no transcoding, just durable persistence of an opaque payload. (Source: sources/2026-02-27-planetscale-video-conferencing-with-postgres)

  6. TTL via cron DELETE with an interval predicate bounds the table size. Van Wiggeren: "I didn't want to keep every video frame forever. At 15fps, you'd accumulate about 108,000 rows/hour per active call. So there's a cleanup job that runs every 2 seconds and prunes frames older than 5 seconds:"
DELETE FROM audio_frames WHERE inserted_at < NOW() - INTERVAL '5 seconds';
DELETE FROM video_frames WHERE inserted_at < NOW() - INTERVAL '5 seconds';

Verified at steady state: "for every call, we'd expect to have about 5-7 seconds of frames in the table at any given time, or about 150 rows total." Canonicalises concepts/ttl-based-row-deletion as a simpler alternative to dedicated TTL features when the schema is mutation-heavy but short-lived. (Source: sources/2026-02-27-planetscale-video-conferencing-with-postgres)
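The post describes this as a cleanup job on a 2-second cadence; a minimal in-process equivalent (sketched here with setInterval inside the relay, since the exact scheduler isn't quoted) looks like:

// Retention job: every 2 seconds, prune anything older than 5 seconds from both media tables.
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

setInterval(async () => {
  await pool.query("DELETE FROM audio_frames WHERE inserted_at < NOW() - INTERVAL '5 seconds'");
  await pool.query("DELETE FROM video_frames WHERE inserted_at < NOW() - INTERVAL '5 seconds'");
}, 2000);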

  7. Frames are durable even though retention is seconds. "What's really cool about this is that if we wanted, we could keep the frames around. Each JPEG is being durably persisted to PostgreSQL, crash-safe, replicatable and ready for querying later on. My $5 PostgreSQL has the throughput to store hours of video that I can combine later." The logical-replication path composes with the at-rest-durability property for free — the same WAL entry that fed the replication stream is also what makes the row crash-safe. (Source: sources/2026-02-27-planetscale-video-conferencing-with-postgres)

  8. Audio and video rendering are asymmetric on the client. Video frames render instantly (swap an <img>'s blob-URL src); audio must be scheduled ahead of time on a Web Audio AudioBufferSourceNode with a small jitter buffer. Keeping them in sync required clamping the audio scheduling buffer so it couldn't drift more than ~150 ms ahead of real time:

const now = audioCtx.currentTime;
const clamped = nextPlayTime > now + 0.15 ? now + 0.02 : nextPlayTime;
const startAt = Math.max(clamped, now + 0.02);

Not a system-design primitive but a load-bearing caveat for anyone building media pipelines on top of a generic transport: audio alignment is an application concern the database can't help with. (Source: sources/2026-02-27-planetscale-video-conferencing-with-postgres)
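A fuller sketch of the receive side, assuming JPEG bytes and 16 kHz mono PCM16LE chunks arrive over the WebSocket (per the operational numbers below); the element wiring and function names are illustrative, and the clamp quoted above appears in context:

// Video: swap an <img>'s blob URL. Audio: decode PCM16LE into an AudioBuffer
// and schedule it slightly ahead of audioCtx.currentTime, clamping so the
// schedule can never drift more than ~150ms ahead of real time.
const audioCtx = new AudioContext();
let nextPlayTime = 0;

function renderVideoFrame(img: HTMLImageElement, jpegBytes: ArrayBuffer) {
  const url = URL.createObjectURL(new Blob([jpegBytes], { type: "image/jpeg" }));
  const old = img.src;
  img.src = url;
  if (old.startsWith("blob:")) URL.revokeObjectURL(old);
}

function scheduleAudioChunk(pcm16le: ArrayBuffer) {
  // PCM16LE -> Float32 in [-1, 1], 16 kHz mono per the post's audio pipeline.
  const int16 = new Int16Array(pcm16le);
  const float32 = new Float32Array(int16.length);
  for (let i = 0; i < int16.length; i++) float32[i] = int16[i] / 32768;

  const buffer = audioCtx.createBuffer(1, float32.length, 16000);
  buffer.copyToChannel(float32, 0);

  const source = audioCtx.createBufferSource();
  source.buffer = buffer;
  source.connect(audioCtx.destination);

  // The clamp from the post: never let the jitter buffer run >150ms ahead.
  const now = audioCtx.currentTime;
  const clamped = nextPlayTime > now + 0.15 ? now + 0.02 : nextPlayTime;
  const startAt = Math.max(clamped, now + 0.02);
  source.start(startAt);
  nextPlayTime = startAt + buffer.duration;
}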

  9. Explicit "don't do this" caveat. Van Wiggeren closes with "Should you build real-time video over Postgres? No! Use WebRTC! But if you want to understand how logical replication works and want to see how far you can push Postgres as a general-purpose real-time backend, this is a fun way to find out." Canonical honest framing: the post is pedagogy about the reach of logical replication, not a recommendation to replace WebRTC or a dedicated pub/sub broker for this shape. (Source: sources/2026-02-27-planetscale-video-conferencing-with-postgres)

Systems / concepts / patterns extracted

Systems: PostgreSQL (logical replication + BYTEA + WAL as pub/sub substrate); PlanetScale for Postgres ($5 managed Postgres is the hosted backend); SvelteKit (frontend, same one SpacetimeDB used; reused unmodified); WebRTC (named only as the tool Van Wiggeren explicitly tells you to use instead — negative reference).

Concepts (new): logical replication as pub/sub, [[concepts/listen-notify-payload-limit|LISTEN/NOTIFY 8 KB payload limit]], unlogged tables, TTL-based row deletion. Concepts (existing, extended): concepts/logical-replication, concepts/postgres-logical-replication-slot, concepts/wal-write-ahead-logging, concepts/change-data-capture.

Patterns (new): database as real-time message broker, WebSocket relay over logical replication.

Operational numbers

  • Relay server: ~400 lines of TypeScript total.
  • Video frame: 640×360 resolution, JPEG quality 0.65, ~25–40 KB per frame.
  • Video frame rate: 15 fps bidirectional.
  • Data rate: ~375–600 KB/s per direction of video through Postgres.
  • Audio: PCM, resampled to 16 kHz mono, encoded as 16-bit little-endian (PCM16LE) samples.
  • Frame accumulation rate at 15 fps: ~108,000 rows/hour per active call if uncapped.
  • Cleanup cadence: cron runs every 2 seconds, retains the last 5 seconds of frames.
  • Steady-state row count per call: ~5–7 seconds of frames ≈ ~150 rows total at any given time (two participants, 15 fps each).
  • Verified per-participant: 76 frames in the last 5 s → 15.2 fps actual (PlanetScale shell output).
  • Audio scheduling clamp: buffer can't drift more than 150 ms ahead of audioCtx.currentTime.
  • Database tier: PlanetScale $5/mo PostgreSQL (PS-5 starter tier).
  • LISTEN/NOTIFY payload limit: 8 KB (Postgres-enforced) — the rejected alternative.

Caveats

  • Hack / demo post, not a production architecture. Van Wiggeren explicitly tells readers to use WebRTC instead. Scope tests on pedagogy + mechanism disclosure, not recommended-architecture framing.
  • Tier 3 — PlanetScale blog, but Van Wiggeren (SVP Engineering) byline is default-include per PlanetScale skip rules for PlanetScale-engineering-voice posts. Architecture density ~55% of body (the other ~45% is the SpacetimeDB framing, the browser media-capture walkthrough, the playful close); the logical-replication-as-pub/sub framing + LISTEN/NOTIFY 8 KB rejection + unlogged-tables rejection + TTL-delete + BYTEA schema are the load-bearing datapoints.
  • Two-participant demo only. No scaling analysis of the replication-stream fan-out to N subscribers. Van Wiggeren notes "My $5 PostgreSQL has the throughput to store hours of video that I can combine later" but does not benchmark higher concurrency.
  • No WebRTC comparison on latency, resource cost, or quality. The post deliberately avoids this because the point is "how far you can push Postgres," not a head-to-head.
  • No disclosure of logical-replication lag (author-to-subscriber propagation latency) under the 375–600 KB/s insert load — only the qualitative "latency during video call … exceeded my expectations" chart callout.
  • pg-relay is not a named shipping product. It's the demo relay binary, open-sourced at github.com/nickvanw/PgVideoChat.

Scope disposition

Tier-3 on-scope as a hack-style mechanism-disclosure post from PlanetScale's SVP Engineering. Passes on (a) canonical logical-replication-as-pub/sub framing — there is no prior wiki page canonicalising this primitive, and the existing concepts/logical-replication page is CDC-pipeline-focused; (b) LISTEN/NOTIFY 8 KB payload limit canonicalised with the quoted worked example of why it fails at media altitude; (c) unlogged-tables trade-off canonicalised with the exact mechanism (WAL skip ⇒ replication-stream invisible ⇒ must poll); (d) BYTEA + TTL-delete + logical-replication composition shown end-to-end. The post's playful register doesn't diminish the technical density. Four new canonical concept pages + two new pattern pages + four existing concept extensions justified.
