
PATTERN Cited by 1 source

Database as real-time message broker

Problem

An application needs to fan out heterogeneous real-time events (chat, presence, call state transitions, media frames, notifications) to many connected clients. The default answer is to stand up a dedicated broker — Kafka, NATS, Redis pub/sub, or a custom WebSocket gateway backed by an in-memory fan-out table.

That adds operational surface:

  • Another deployable to run, monitor, upgrade, secure.
  • A second durability / ordering / consistency model to reason about next to the application database.
  • A dual-write or event-capture layer to keep the broker in sync with the database (outbox pattern, CDC, dual-write-with-txn).

If the events are naturally table-row-shaped and volumes are moderate, all that machinery may be avoidable.

Solution

Use the application's Postgres database as the pub/sub broker by:

  1. Writing events as row inserts into dedicated tables in a PUBLICATION.
  2. Running a small relay process that holds a logical replication slot on that publication and forwards each row-event to the appropriate end client(s).
  3. Letting the write path be the publish path — one INSERT by the application server is the act of publishing.

See concepts/logical-replication-as-pubsub for the primitive and patterns/websocket-relay-over-logical-replication for the relay half.
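
As a sketch of the publish path (the table name matches the publication below, but the column layout here is illustrative, not from the source), publishing is nothing more than a normal insert into an event table the publication covers:

    -- One event type = one table; the to_id column drives relay routing.
    CREATE TABLE chat_messages (
        id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        to_id      bigint NOT NULL,
        body       text   NOT NULL,
        created_at timestamptz NOT NULL DEFAULT now()
    );

    -- Publishing an event is a single INSERT; the commit is the publish.
    INSERT INTO chat_messages (to_id, body) VALUES (42, 'hello');

There is no dual write: the row that makes the event durable is the same row the relay will see on the replication stream.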

Mechanism

  • Publication covers all event-carrying tables:
CREATE PUBLICATION realtime_events
    FOR TABLE chat_messages, presence, call_state,
              video_frames, audio_frames;
  • Relay process subscribes to the replication slot and routes each delivered row based on a to_id / channel_id / session_id column that's part of the row itself.

  • Heterogeneous events use different tables but share the relay + replication slot. One substrate for all event types. (Source: sources/2026-02-27-planetscale-video-conferencing-with-postgres: "the same mechanism that pushes video frames to call participants also pushes chat messages, user presence changes, and call state transitions.")

  • Retention is bounded: where the event stream is ephemeral, a TTL-delete cron job prunes old rows.
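
A minimal sketch of the remaining plumbing; the slot name and retention window are assumptions, not from the source:

    -- The relay holds one logical replication slot, decoded with the
    -- built-in pgoutput plugin that publications use.
    SELECT pg_create_logical_replication_slot('realtime_relay', 'pgoutput');

    -- TTL-delete cron for ephemeral streams (window is illustrative).
    DELETE FROM video_frames
    WHERE created_at < now() - interval '5 minutes';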

When this pattern fits

  • You already run Postgres and don't want a second broker on the critical path.
  • Event volume is moderate (~thousands/s, not ~millions/s).
  • Each event fits comfortably in a row: sub-megabyte payloads in BYTEA are fine; multi-megabyte payloads should live in a blob store with a pointer in the row.
  • You want at-least-once, ordered delivery to subscribers that can acknowledge progress.
  • You want each event to be durable by default — the WAL record that published it is also what makes it crash-safe.

When it doesn't

  • Write volume saturates Postgres before it saturates a broker. Dedicated brokers are designed for commit-heavy small-record workloads; Postgres is not.
  • You need in-memory-only delivery, where events never touch disk. Logical replication reads the WAL, so every published row is written to disk first; unlogged tables skip the WAL, but that also makes them invisible to logical replication (see concepts/unlogged-table-postgres).
  • You need multi-datacenter ring topology or sophisticated consumer-group rebalancing — Postgres's logical replication is single-primary-to-N-subscribers and has no built-in consumer group semantics.
  • LISTEN / NOTIFY fits better and the payloads are small. The related pattern is to use NOTIFY directly and skip the logical-replication relay — simpler where the [[concepts/listen-notify-payload-limit|8 KB limit]] isn't a problem.
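
For that last variant, a sketch (channel name and payload are illustrative):

    -- Subscriber side:
    LISTEN chat_events;

    -- Publisher side; the payload must fit under the ~8 kB NOTIFY limit.
    NOTIFY chat_events, '{"to_id": 42, "body": "hello"}';

    -- Or from a function or trigger, where NOTIFY's literal syntax
    -- doesn't work:
    SELECT pg_notify('chat_events', '{"to_id": 42, "body": "hello"}');

Note that NOTIFY delivery is fire-and-forget to currently connected listeners; unlike a replication slot, it does not replay events to a subscriber that was offline.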

Failure modes to consider

  • Slow subscriber → WAL growth. A replication slot that stops advancing holds WAL back indefinitely; on-call cost is not just "the subscriber lags" but "disk fills up on the primary."
  • Schema-change discipline gets stricter. Publications carry row payloads, so a column type change or rename can break downstream parsers; the relay must be updated in lockstep with schema changes.
  • The relay is a single point of failure unless you run it redundantly. Multiple slots are fine (Postgres supports many), but two relays reading the same slot isn't safe. The typical shape is one relay plus a standby, or one slot per relay plus application-level de-duplication.
  • Backpressure is real. If subscribers can't keep up with inserts, the WAL grows and primary inserts eventually slow / fail. Canonical stream-processing concern; see concepts/backpressure.
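
The WAL-growth failure mode is directly observable from the pg_replication_slots view; a monitoring sketch:

    -- Bytes of WAL the primary is retaining on behalf of each slot.
    SELECT slot_name,
           active,
           pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
    FROM pg_replication_slots;

Alert when retained_bytes grows without bound for an inactive slot. On Postgres 13+, max_slot_wal_keep_size caps how much WAL a stalled slot can pin, trading "disk fills up" for "the slot is invalidated and the subscriber must resync."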

Canonical wiki instance

sources/2026-02-27-planetscale-video-conferencing-with-postgres: Nick Van Wiggeren (PlanetScale) builds a bidirectional 15 fps video chat where a pg-relay Node.js WebSocket server holds a logical replication slot on a $5 PlanetScale Postgres and forwards JPEG frames, audio frames, chat, and presence events to connected clients — all on one transport. He explicitly invokes the pattern as the pedagogy point: "if we wanted, we could keep the frames around. Each JPEG is being durably persisted to PostgreSQL, crash-safe, replicatable and ready for querying later on." The post is deliberately a demo of the pattern's reach, not a recommendation to replace WebRTC — Van Wiggeren's closing caveat is "Should you build real-time video over Postgres? No! Use WebRTC!" — but the heterogeneous-events-on-one-substrate property demonstrated is exactly what this pattern exists to capture.
