
CONCEPT Cited by 1 source

Thread (OS)

Definition

A thread is a unit of execution within a process that shares the process's address space and code segment with peer threads, but maintains its own program counter, register state, and stack. Threads are the OS abstraction for achieving concurrency without paying the full context-switch cost of swapping between processes.

Structural property

All threads within a process share:

  • Address space — heap, code, global/static data.
  • File descriptors — open files, sockets.
  • Process-wide resources — signal handlers, working directory, user/group IDs.

Each thread has its own:

  • Stack — per-thread local variables, call frames.
  • Register state — program counter, stack pointer, general-purpose registers.
  • Thread-local storage (TLS) — optional per-thread globals.
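
The split between shared and per-thread state can be sketched in a few lines. This is a minimal illustration using Python's standard `threading` module, not a statement about how any particular runtime lays out memory; the names `shared_results`, `per_thread`, `worker`, and `run_workers` are hypothetical and chosen only for this example.

```python
import threading

shared_results = []               # process-wide data: visible to every thread
results_lock = threading.Lock()   # shared data needs explicit synchronization
per_thread = threading.local()    # thread-local storage: one slot per thread


def worker(worker_id: int) -> None:
    # Each thread sets its own `value`; other threads cannot see it.
    per_thread.value = worker_id * 10
    local_total = 0               # ordinary local, private to this call frame
    for step in range(3):
        local_total += per_thread.value + step
    with results_lock:
        shared_results.append((worker_id, local_total))


def run_workers(count: int = 4) -> list:
    """Start `count` threads and return what they wrote to the shared list."""
    shared_results.clear()
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(shared_results)


if __name__ == "__main__":
    print(run_workers())
    # The main thread never assigned per_thread.value, so it has no such slot.
    print(hasattr(per_thread, "value"))
```

Every worker appends to the same list because the list lives in the shared address space, while `per_thread.value` and `local_total` stay private to the thread that set them.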

Context switch cost

Because threads share the process's address space, switching between threads does not require swapping the page-table root or flushing the TLB the way switching between processes does. The result: thread switching costs ~1 μs on modern CPUs, roughly 5× faster than a process context switch (~5 μs) (Source: sources/2026-04-21-planetscale-processes-and-threads).

This is the structural reason threaded-application models can sustain more concurrent work than process-per-work models on the same hardware.
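
To make the difference concrete, the arithmetic below estimates how much of one core's time goes to context switching alone. It is a back-of-the-envelope sketch: the ~1 μs and ~5 μs costs are the figures quoted above, while the switch rate of 50,000 per second and the function name `switch_overhead_fraction` are assumptions picked purely for illustration.

```python
THREAD_SWITCH_US = 1.0    # ~1 μs per thread switch (figure quoted above)
PROCESS_SWITCH_US = 5.0   # ~5 μs per process switch (figure quoted above)


def switch_overhead_fraction(switches_per_second: float, cost_us: float) -> float:
    """Fraction of one core's time spent purely on context switches."""
    return switches_per_second * cost_us / 1_000_000


if __name__ == "__main__":
    rate = 50_000  # hypothetical switch rate, for illustration only
    print(switch_overhead_fraction(rate, THREAD_SWITCH_US))   # 0.05
    print(switch_overhead_fraction(rate, PROCESS_SWITCH_US))  # 0.25
```

At that assumed rate, thread switching would consume about 5% of a core and process switching about 25%, which is the same 5× ratio expressed as lost capacity.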

Creation

Threads are created via pthread_create() (POSIX), which on Linux is a thin wrapper around the unified clone() system call with flags such as CLONE_VM (share virtual memory) and CLONE_FILES (share file descriptors) set. The same clone() call with those sharing flags omitted creates a process, which is what fork() does; thread creation and process creation are the same system call at different points in its flag space.
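
The observable effect of sharing or not sharing virtual memory can be sketched without calling clone() directly. The example below is an illustration under stated assumptions: it runs on a POSIX system, it relies on CPython backing `threading.Thread` with pthreads there, and it uses `os.fork()` for the process case. The names `counter`, `bump`, `bump_in_thread`, and `bump_in_forked_child` are hypothetical.

```python
import os
import threading

counter = {"value": 0}  # module-level state used to observe memory sharing


def bump() -> None:
    counter["value"] += 1


def bump_in_thread() -> int:
    """Run bump() in a new thread; the write lands in the caller's memory."""
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    return counter["value"]


def bump_in_forked_child() -> int:
    """Run bump() in a forked child; the write stays in the child's copy."""
    pid = os.fork()
    if pid == 0:
        try:
            bump()       # modifies only the child's private copy of memory
        finally:
            os._exit(0)  # exit the child immediately, skipping parent cleanup
    os.waitpid(pid, 0)   # reap the child
    return counter["value"]


if __name__ == "__main__":
    counter["value"] = 0
    print(bump_in_thread())        # 1: the thread's write is visible
    print(bump_in_forked_child())  # 1: the child's write is not
```

The thread's increment is visible to the caller because both share one address space; the forked child increments its own copy, so the parent's value does not change.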

Trade-offs vs processes

| Property | Thread | Process |
| --- | --- | --- |
| Switch cost | ~1 μs | ~5 μs |
| Memory overhead per unit | Stack (~MB) | Address space (~MB + page table) |
| Isolation | None (shared heap) | Full (separate address space) |
| Communication | Shared memory (direct) | IPC (sockets, pipes, shared memory) |
| Fault blast radius | Whole process | Single process |

The isolation trade-off is the fundamental tension: threads are cheap, but a memory-corrupting bug in one can take down every thread in the process; processes are more expensive, but a crash in one leaves the others untouched.
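
The process side of this trade-off is easy to observe. The sketch below starts a child process that deliberately aborts and shows that the parent keeps running; it demonstrates process isolation only and does not attempt the thread equivalent, since that would bring down the example itself. The helper name `run_crashing_child` is hypothetical.

```python
import subprocess
import sys


def run_crashing_child() -> int:
    """Start a child process that aborts itself and return its exit status."""
    crash_code = "import os; os.abort()"  # the child terminates abnormally
    completed = subprocess.run(
        [sys.executable, "-c", crash_code],
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode


if __name__ == "__main__":
    status = run_crashing_child()
    # On POSIX, a negative return code means the child died from a signal.
    print("child status:", status)
    print("parent still running")
```

The crash is contained in the child's address space, so the parent only sees a non-zero exit status.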

Database architectural role

The thread abstraction is the substrate under thread-per-connection databases like MySQL: "MySQL is a great contrast, designed to run as a single process (mysqld). However, it is also capable of handling thousands of queries per-second, hundreds of connections, and utilizing multi-core CPUs. It achieves this via threads." (Source: sources/2026-04-21-planetscale-processes-and-threads).
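
The thread-per-connection shape can be sketched with Python's standard `socketserver` module. This is a toy echo server meant only to show the pattern of one handler thread per accepted connection inside a single process; it is not MySQL's implementation, and the names `EchoHandler`, `ThreadedServer`, `query`, and `demo` are hypothetical.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self) -> None:
        # socketserver runs this method in a new thread per connection.
        line = self.rfile.readline()
        thread_name = threading.current_thread().name.encode()
        self.wfile.write(thread_name + b": " + line)


class ThreadedServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True  # handler threads do not block interpreter exit


def query(port: int, payload: bytes) -> bytes:
    """Open one connection, send one line, and return the one-line reply."""
    with socket.create_connection(("127.0.0.1", port)) as conn:
        conn.sendall(payload + b"\n")
        return conn.makefile("rb").readline()


def demo(clients: int = 3) -> list:
    # Port 0 asks the OS for any free port; the server stays in this process.
    with ThreadedServer(("127.0.0.1", 0), EchoHandler) as server:
        port = server.server_address[1]
        acceptor = threading.Thread(target=server.serve_forever, daemon=True)
        acceptor.start()
        replies = [query(port, f"q{i}".encode()) for i in range(clients)]
        server.shutdown()
        acceptor.join()
    return replies


if __name__ == "__main__":
    for reply in demo():
        print(reply.decode().rstrip())
```

Each reply is prefixed with the name of the thread that served it, which makes the one-thread-per-connection mapping visible while all handlers share a single server process.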

Several user-mode abstractions are layered on top of OS threads:

  • Virtual threads (JVM, 2023+) — M:N mapping of many user-mode "virtual threads" onto a few OS threads; cheaper than OS threads at the cost of scheduling constraints (pinning under synchronized blocks, etc.).
  • Goroutines, fibers, green threads — user-mode concurrency primitives riding on pools of OS threads.

Seen in

  • sources/2026-04-21-planetscale-processes-and-threads — Ben Dicken's interactive explainer, which serves as this wiki's canonical definition of threads. It establishes the ~1 μs thread-switch cost, the shared-address-space property, the pthread_create + clone substrate, and thread-per-connection as MySQL's architectural choice.
Last updated · 470 distilled / 1,213 read