CONCEPT Cited by 1 source

Edge-to-origin database latency

Definition

Edge-to-origin database latency is the extra per-query latency that accrues when a serverless/edge compute unit runs far from a centrally located transactional database and issues SQL round-trips over the wide-area network. Each round-trip costs one network RTT between the edge POP and the DB region, so a request issuing N sequential queries pays N × RTT in network transit on top of any actual database work.
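The N × RTT floor can be sketched as a toy cost model (the numbers below are illustrative assumptions, not figures from the post):

```python
def request_latency_ms(n_queries: int, edge_to_db_rtt_ms: float,
                       db_work_ms: float = 0.0) -> float:
    """Network floor for a request that issues n_queries sequential
    SQL round-trips from an edge POP to a distant DB region."""
    return n_queries * edge_to_db_rtt_ms + db_work_ms

# A 5-query request handled by a POP ~80 ms from the DB region pays
# 400 ms in wide-area transit before query-execution time is counted.
print(request_latency_ms(5, 80.0))  # → 400.0
```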

Motivating framing from Cloudflare's 2026-04-16 Workers × PlanetScale post: "By default, Workers execute closest to a user request, which adds network latency when querying a central database especially for multiple queries. Instead, you can configure your Worker to execute in the closest Cloudflare data center to your PlanetScale database." (Source: sources/2026-04-16-cloudflare-deploy-postgres-and-mysql-databases-with-planetscale-workers.)

The tension being resolved

Edge runtimes exist because proximity to the user cuts user-perceived latency. But transactional relational databases are typically centralised: you cannot shard a Postgres primary across every CDN POP. The edge's default "run closest to the user" routing therefore trades user-network RTT for database-network RTT. For SQL-heavy request paths that trade often goes the wrong way: the request's critical path is dominated by database round-trips, not asset delivery, so running user-adjacent but DB-far makes the whole request slower.

Levers available to shrink it

Two orthogonal levers, both named explicitly in the 2026-04-16 post:

  1. Reduce the number of RTTs. A caching tier in front of the origin (e.g. Hyperdrive's query caching) serves repeated reads without any origin round-trip, and pooled connections amortise TLS and connection-handshake cost across the round-trips that remain. This is the caching-proxy-tier move applied at the database layer.
  2. Reduce the cost of each RTT. Collapse the physical distance between compute and DB by pinning the Worker to a POP in (or adjacent to) the DB's region via an explicit placement hint. An aws:us-east-1 placement paired with a PlanetScale DB in us-east-1 turns a cross-continent RTT (tens of milliseconds) into a within-datacentre one (well under a millisecond).

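The two levers compose, since caching reduces how many round-trips are paid and placement reduces what each one costs. A minimal sketch, with assumed hit rates and RTTs (none of these numbers come from the post):

```python
def path_latency_ms(n_queries: int, rtt_ms: float,
                    cache_hit_rate: float = 0.0) -> float:
    """Network cost of a request path: only cache misses pay the
    edge-to-origin RTT (lever 1); rtt_ms shrinks when the Worker is
    pinned near the DB region (lever 2)."""
    misses = n_queries * (1.0 - cache_hit_rate)
    return misses * rtt_ms

baseline = path_latency_ms(8, 80.0)        # no levers: 640.0 ms
cached   = path_latency_ms(8, 80.0, 0.75)  # lever 1 only: 160.0 ms
pinned   = path_latency_ms(8, 1.0)         # lever 2 only: 8.0 ms
both     = path_latency_ms(8, 1.0, 0.75)   # both levers: 2.0 ms
```

Note that lever 2 alone already beats an aggressive cache on a distant origin in this model, which is why the post frames placement as the default fix for multi-query paths.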
Cloudflare's stated target: "single digit milliseconds" for multi-query request paths once placement is auto-set from the DB's region (a forward-looking goal in the 2026-04-16 post).

Why this matters for agents + AI workloads

Agent-style workloads amplify the cost: a single LLM-orchestrated task may chain many SQL queries (retrieval, tool metadata, user-context lookup, write-back). The amplification has the same shape as the agent-chain reliability amplification named in the 2026-04-16 AI Platform post ("an agent might chain ten calls together"): 10 × RTT on a wrongly placed Worker is the difference between a sub-second response and a multi-second one. The pgvector-in-Postgres pattern (vector retrieval against the same database that holds the structured data) is specifically called out as a case where edge-to-origin database latency matters for AI workloads.
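The ten-call chain makes the amplification concrete. The RTT values below are assumed for illustration; the chain length comes from the post's "an agent might chain ten calls together":

```python
CHAIN_LEN = 10  # "an agent might chain ten calls together"

def chain_network_cost_ms(rtt_ms: float, calls: int = CHAIN_LEN) -> float:
    # Sequential agent calls cannot be pipelined: each step waits
    # on the previous result, so RTTs add up linearly.
    return calls * rtt_ms

# Worker placed user-adjacent but a continent away from the DB:
print(chain_network_cost_ms(150.0))  # → 1500.0 (multi-second territory)
# Worker pinned to the DB's region:
print(chain_network_cost_ms(3.0))    # → 30.0 (well inside a second)
```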
