
PATTERN Cited by 1 source

Blocking model per request (Tomcat)

Shape

One thread is dedicated to handling one HTTP request for that request's entire lifetime. When the handler blocks on a downstream call (JDBC, HTTP client, queue offer), the thread blocks; when the handler returns, the thread is released back to the pool.

This is the classical servlet-container + blocking-I/O pattern:

  • Simple mental model (read → process → write → done).
  • Trivially composes with blocking Java libraries (JDBC, most HTTP clients, most logging / tracing shims).
  • Debuggable with standard thread dumps.
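The shape can be sketched with nothing but the JDK. This is a minimal illustration, not Tomcat's implementation: the class name BlockingPerRequestSketch, the handle and serve methods, and the 20 ms sleep standing in for a JDBC or HTTP call are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockingPerRequestSketch {

    // Hypothetical stand-in for a blocking downstream call (JDBC, HTTP client, ...).
    static String blockingDownstreamCall(int requestId) {
        try {
            Thread.sleep(20); // the thread that owns the request is blocked here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return "response-" + requestId;
    }

    // read -> process -> write, all on the single thread dedicated to this request.
    static String handle(int requestId) {
        return blockingDownstreamCall(requestId);
    }

    // Serves `requests` requests on a pool of `poolSize` threads; responses are returned in request order.
    static List<String> serve(int poolSize, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                futures.add(pool.submit(() -> handle(id)));
            }
            List<String> responses = new ArrayList<>();
            for (Future<String> f : futures) {
                responses.add(f.get()); // thread goes back to the pool once handle returns
            }
            return responses;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(serve(4, 8));
    }
}
```

With a pool of 4 and 8 requests, the second batch of four waits until threads from the first batch are released, which is the pool-size ceiling in miniature.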

Classical vs. virtual-thread-backed

Classical (platform-thread-per-request): the concurrency ceiling is the thread-pool size. Tomcat's default maxThreads is 200.

Virtual-thread-backed (Tomcat 10.1+ via VirtualThreadExecutor): conceptually the same shape, but the thread handling each request is a virtual thread (VT). VTs multiplex onto a small set of carrier OS threads (see concepts/carrier-thread), so the concurrency ceiling becomes "how many VTs can be kept alive waiting on I/O without thrashing the heap", potentially millions.

Crucially, the programming model doesn't change. The same doGet(req, resp) with a blocking downstream call works as-is. Netflix migrates by flipping the executor config flag.
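A rough sketch of what "same handler, different executor" means, assuming JDK 21 or later. It does not use Tomcat at all: ExecutorSwapSketch, handle and peakConcurrency are hypothetical names, and the 100 ms sleep stands in for a blocking downstream call. The handler is identical in both runs; only the executor passed in differs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class ExecutorSwapSketch {

    // Unchanged blocking handler; it also records how many requests are in flight at once.
    static int handle(int requestId, AtomicInteger inFlight, AtomicInteger peak) {
        int now = inFlight.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(100); // hypothetical blocking downstream call
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            inFlight.decrementAndGet();
        }
        return requestId;
    }

    // Runs `requests` handlers on the given executor and returns the peak number in flight.
    static int peakConcurrency(ExecutorService executor, int requests) {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                futures.add(executor.submit(() -> handle(id, inFlight, peak)));
            }
            for (Future<Integer> f : futures) {
                f.get();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // Platform threads: the ceiling is the pool size.
        System.out.println("fixed pool of 2, peak in flight: "
                + peakConcurrency(Executors.newFixedThreadPool(2), 6));
        // Virtual threads: one VT per request, no pool-size ceiling.
        System.out.println("virtual threads, peak in flight: "
                + peakConcurrency(Executors.newVirtualThreadPerTaskExecutor(), 32));
    }
}
```

The fixed pool can never report a peak above its size, while the virtual-thread executor lets every submitted request block at the same time.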

The hazard this pattern inherits from VTs

The pattern was designed for platform threads, where a blocked thread ties up only its own OS thread and never stops other request threads from being scheduled. Virtual threads mostly behave the same way, unmounting from the carrier when they block, except under pinning. So a pattern that "just works" on platform threads can silently stall on VTs if any path through the request makes a blocking call inside a synchronized block.
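The hazardous shape and one common way to remove it can be sketched as follows, assuming JDK 21 or later. PinningShapeSketch and its methods are hypothetical, and the 5 ms sleep stands in for any blocking call. The sketch shows the code shape only; it does not reproduce a stall. On JDK 24 and later, JEP 491 removes this particular cause of pinning, so the first variant is a problem mainly on JDK 21-era runtimes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

public class PinningShapeSketch {

    private final Object monitor = new Object();
    private final ReentrantLock lock = new ReentrantLock();
    private int reported; // guarded by whichever lock the current run uses

    // Hypothetical blocking call made while a lock is held.
    static void blockingCall() {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Hazardous shape on JDK 21: blocking inside synchronized pins the VT to its carrier.
    void reportInsideMonitor() {
        synchronized (monitor) {
            blockingCall();
            reported++;
        }
    }

    // Same critical section under a ReentrantLock: a blocked VT can unmount from its carrier.
    void reportInsideLock() {
        lock.lock();
        try {
            blockingCall();
            reported++;
        } finally {
            lock.unlock();
        }
    }

    // Runs `tasks` virtual threads through one of the two variants and returns the final count.
    static int run(boolean useMonitor, int tasks) {
        PinningShapeSketch sketch = new PinningShapeSketch();
        Runnable task = useMonitor ? sketch::reportInsideMonitor : sketch::reportInsideLock;
        ExecutorService vts = Executors.newVirtualThreadPerTaskExecutor();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(vts.submit(task));
            }
            for (Future<?> f : futures) {
                f.get();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            vts.shutdown();
        }
        return sketch.reported;
    }

    public static void main(String[] args) {
        System.out.println("synchronized variant completed: " + run(true, 8));
        System.out.println("ReentrantLock variant completed: " + run(false, 8));
    }
}
```

Both variants produce the same result; the difference is whether a carrier thread is held hostage while the VT blocks. On JDK 21, running with -Djdk.tracePinnedThreads=full prints a stack trace when a virtual thread blocks while pinned, which is one way to find such paths.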

Netflix's 2024-07-29 incident is the canonical wiki instance. Tomcat runs on the embedded VT executor, and request-path code passes through Micrometer Tracing's synchronized-wrapped span-end, which calls Zipkin Reporter's ReentrantLock.lock(). All four request-handler VTs pin their carriers on that call, so no VT queued behind them can be mounted. Tomcat's accept loop keeps adding requests to that unmountable queue, and sockets pile up in the closeWait state.

Alternatives

  • Non-blocking (reactive): Spring WebFlux on Project Reactor, with explicit non-blocking composition via Mono / Flux. Avoids pinning by construction (no synchronized-wrapped blocking) but imposes a different programming model.
  • Coroutines (Kotlin): structured concurrency via suspending functions. The multiplexing is similar to VTs, but with a different library ecosystem.

The blocking-per-request pattern with VTs remains attractive because it keeps the simple model and the existing library ecosystem, as long as every pinning path on the request route has been audited out.

Seen in
