PLANETSCALE Tier 3

PlanetScale — Processes and Threads

Summary

Ben Dicken's (PlanetScale) pedagogical interactive deep-dive on the process and thread abstractions — the most fundamental OS primitives for multitasking on a single CPU core. The post walks from CPU + RAM hardware → single-program-at-a-time early computing → multi-process time-slicing (the innovation that enabled multitasking on one core) → process state machine (running / ready / waiting / killed) → fork() / execve() / clone() system calls → threads as within-process execution contexts sharing memory → Postgres's process-per-connection vs MySQL's thread-per-connection architecture as the canonical database worked example → connection pooling as the funnel mitigating the heavy-connection cost at either shape. Architecturally dense despite the pedagogical framing: first canonical wiki treatment of the process / thread / context-switch vocabulary that every serving-infra post on the wiki depends on as foundational knowledge.

Key takeaways

  1. A process is the fundamental OS abstraction for a running program. "A process is an instance of a program being executed by a computer. The job of an operating system is to manage many processes at once and allow the CPU to switch between them." (Source: sources/2025-09-24-planetscale-processes-and-threads.) Canonicalised as concepts/process-os-abstraction.

  2. Multitasking on a single core works via time-slicing. The CPU divides its execution budget into short time-slices (measured in milliseconds on real hardware), giving each process a burst of time to make progress. Modern CPUs execute >1 billion instructions per second, so even a 1ms time-slice is >1 million instructions — enough per-slice progress that humans perceive concurrent execution. Canonicalised as concepts/time-slicing-multitasking.

  3. Context switches cost ~5 microseconds of pure OS overhead. "The full time of a context switch takes ~5 microseconds on modern CPUs… it requires executing tens of thousands of instructions, and this happens hundreds of times per second. A CPU typically executes several billion instructions in a second, but managing and switching between processes can consume tens of millions of these." (Source: sources/2025-09-24-planetscale-processes-and-threads.) This is the load-bearing cost that makes process-per-connection architectures (like Postgres) heavier than thread-per-connection ones (like MySQL). Canonicalised as concepts/context-switch.

  4. A process transitions through four states. running (on the CPU), ready (kicked off the CPU because its time-slice ended, waiting for the scheduler), waiting (blocked on I/O for a disk / network request), killed (completed). The state machine is OS-assigned, not process-controlled. Canonicalised as concepts/process-state-machine.

  5. fork() and execve() are how programs spawn processes. "Calling fork() makes a process create an exact clone of itself as a new process. The cloned process gets immediately placed into the ready state." execve() replaces the current process's program with a new one loaded from disk. "When a computer boots up, a single process is initiated and all others are descendants of this one." The fork-then-exec idiom is the canonical Unix shape for spawning an arbitrary new program. Canonicalised as concepts/fork-system-call + concepts/execve-system-call.

  6. Threads share memory, context-switch ~5× faster. A thread is a within-process execution context. All threads in a process share the process's memory and code (only their program stacks differ), but each executes at a different program location and can be scheduled independently. "Switching between threads is around 5x faster, taking closer to 1 microsecond to complete." Canonicalised as concepts/thread-os-abstraction.

  7. Both fork() and pthread_create() are wrappers around clone(). On modern Linux, the one underlying syscall is clone(); flags like CLONE_VM (share virtual memory) and CLONE_FILES (share file descriptors) distinguish process-creation from thread-creation semantics. Canonicalised as concepts/clone-system-call.

  8. Postgres uses process-per-connection; MySQL uses thread-per-connection. "Postgres is implemented with a process-per-connection architecture. Each time a client makes a connection, a new Postgres process is created on the server's operating system. There is a single 'main' process (PostMaster) that manages Postgres operations, and all new connections create a new Process that coordinates with PostMaster." MySQL contrastingly runs as a single mysqld process that spawns a thread per connection — "this gives MySQL some advantages in terms of performance in some scenarios." Canonicalised as patterns/process-per-connection + patterns/thread-per-connection.

  9. Connection pooling is the mitigation for both shapes. "MySQL, Postgres, and many other databases use a technique known as connection pooling to help… Connection poolers sit between clients and the database. All connections from the client are made to the pooler, which is designed to be able to handle thousands at a time. It maintains its own pool of direct connections to the database, typically between 5 and 50." The pooler acts as a funnel — pushing queries from thousands of client connections into tens of database connections. Canonicalised as concepts/connection-pooling + patterns/connection-pool-funnel. Referenced pooler: PgBouncer.

Systems / concepts / patterns extracted

Systems

  • systems/postgresql — canonical process-per-connection database architecture; each client connection spawns a new OS process that coordinates with the PostMaster supervisor.
  • systems/mysql — canonical thread-per-connection database architecture; mysqld is a single process spawning one thread per connection, utilising multi-core via threads.
  • systems/pgbouncer — connection pooler referenced by name as the funnel pattern between thousands of client connections and tens of database connections.
  • systems/innodb — MySQL's storage engine referenced in the context of MySQL's threading model.

Concepts

  • concepts/process-os-abstraction — "a process is an instance of a program being executed by a computer"; the fundamental OS abstraction over CPU + RAM.
  • concepts/thread-os-abstraction — within-process execution context sharing memory with sibling threads, scheduled independently; context-switches ~5× faster than a full process context switch.
  • concepts/context-switch — ~5μs on modern CPUs, tens of thousands of instructions per switch; the load-bearing cost that makes time-slicing "not free".
  • concepts/process-state-machine — running / ready / waiting / killed; OS-assigned, transitions driven by time-slice exhaustion (→ ready) or I/O block (→ waiting) or completion (→ killed).
  • concepts/fork-system-call — creates an exact clone of the calling process as a new process in the ready state; new process's parent is the caller.
  • concepts/execve-system-call — replaces the current process's program with a new one loaded from disk; the fork-then-exec idiom is the canonical Unix spawn shape.
  • concepts/clone-system-call — the Linux-specific underlying syscall that both fork() and pthread_create() wrap; CLONE_VM / CLONE_FILES flags select process-creation vs thread-creation semantics.
  • concepts/time-slicing-multitasking — the OS divides CPU time into short slices, giving each process a burst of execution; the innovation that enabled single-core multitasking.
  • concepts/connection-pooling — pooler maintains N (typically 5–50) direct connections to the database and funnels queries from thousands of client connections through them.

Patterns

  • patterns/process-per-connection — Postgres's architecture; OS isolation and supervision via PostMaster, paid for in memory + context-switch overhead per connection.
  • patterns/thread-per-connection — MySQL's architecture; lower per-connection overhead via shared process memory + thread-faster context switch, at the cost of losing OS-level isolation between connections.
  • patterns/connection-pool-funnel — generalised pattern: external pooler sits between many client connections and few database connections; each incoming query is multiplexed onto the limited direct-pool connections.

Operational numbers

  • Context switch cost: ~5 μs on modern CPUs.
  • Instructions per context switch: tens of thousands.
  • Thread switch cost: ~1 μs (~5× faster than process context switch).
  • Typical DB connection-pool size: 5–50 direct connections.
  • CPU instructions per second: >1 billion.
  • Time-slice granularity: milliseconds.

Caveats

  • Pedagogical post, no production retrospective numbers. Post is explicitly an "interactive article" with a made-up simplified instruction set for the visualisations. No PlanetScale fleet telemetry disclosed (no real Postgres-vs-MySQL production context-switch overhead, no connection-pool sizing data from the PlanetScale fleet).
  • Simplified context-switch cost: the "~5 microseconds" figure is presented as a single number; in practice it varies significantly with CPU architecture, cache state, TLB flushes, NUMA effects, and whether the switch is within or across cores / NUMA nodes. The post calls out this simplification.
  • Virtual memory deferred: the post explicitly notes that context-switch visualisations depict RAM being "copied" on switch, which is not how OSes actually work — real OSes use virtual memory (not yet canonicalised on this wiki) to avoid the copy. The post defers VM coverage to a future piece.
  • Connection-pooling coverage is single-paragraph. Post names the pattern and its shape but doesn't distinguish transaction / session / statement pooling modes, doesn't cover pooler HA, and doesn't quantify pooler overhead. PgBouncer is named only as an example vehicle.
  • clone() flag enumeration is deferred: "we wont get into all these details here, but run man clone on a linux machine for the details."
  • Interactive-only details: the post's CPU simulator uses made-up instructions (e.g. FORK, EXEC, PTCREATE) that capture the shape of the real instruction sets without claiming ISA-level accuracy; x86-64 / ARM64 are referenced but not covered.

Cross-source continuity

  • First canonical wiki treatment of the process / thread vocabulary. Prior wiki had concepts/virtual-thread and concepts/carrier-thread (JVM-specific Java-21 VTs, 2024-07-29 Netflix), concepts/fork-join-pool (JVM-specific scheduler), concepts/pre-fork-copy-on-write (Linux fork mechanics specifically in the Python / gunicorn preload context, 2025-12-15 Lyft), concepts/signal-handler-fork-inheritance (POSIX signal rules on fork). None of these canonicalised the underlying process / thread / context-switch / fork / execve / clone syscall primitives they all depend on. This post fills that gap at the OS-fundamentals level.
  • Complements sources/2025-07-08-planetscale-caching — the caching post is a pedagogical deep-dive on one foundational hierarchy (CPU cache / RAM / disk / CDN); this post is the companion on the other foundational abstraction (CPU as multitasking substrate). Both from Ben Dicken, same pedagogical format (interactive visualisations), same foundational-vocabulary role on the wiki.
  • Provides the architectural backdrop for Postgres's connection-scaling ceiling. The process-per-connection shape + ~5μs context-switch cost is exactly why Postgres operators reach for PgBouncer, or for PlanetScale for Postgres's proprietary proxy (which incorporates PgBouncer), before they exhaust connection slots.
  • Complements concepts/pre-fork-copy-on-write — pre-fork COW is the specific memory-efficiency exploit that makes process-per-connection (and other fork-heavy architectures) viable on Linux; this post canonicalises the underlying fork() syscall that COW optimises.
  • Kinship with concepts/virtual-thread — Java 21's VTs and OS threads are two different layers of the multitasking hierarchy: OS threads (this post) multiplex onto CPU cores; VTs multiplex onto OS threads via the JVM's continuation machinery.
  • No existing-claim contradictions — strictly additive foundational vocabulary.
