
Zipkin Reporter

The Zipkin Reporter Java library (zipkin2.reporter.*) is the OpenZipkin client for asynchronously shipping span data to a Zipkin-compatible collector. It's used as the transport layer underneath the higher-level Brave tracer and, transitively, underneath the Spring-world Micrometer Tracing stack.

Key classes

  • BoundedAsyncReporter — non-blocking span reporter with a bounded in-memory queue.
  • CountBoundedQueue — the queue implementation; it mediates the producer-consumer handoff between span-finishing threads (via offer) and the flusher (via drainTo), using a single ReentrantLock plus a Condition — see source.
  • AsyncReporter.Flusher — background platform thread that loops calling CountBoundedQueue.drainTo, blocking via Condition.awaitNanos when the queue is empty.
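The handoff described above can be sketched as follows. This is a minimal, hypothetical simplification — not the real zipkin2.reporter source — assuming a fixed-capacity array guarded by one ReentrantLock, with offer dropping spans when full (bounded, never blocking producers) and drainTo parking the flusher on a Condition:

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

// Illustrative stand-in for CountBoundedQueue; names and structure are assumptions.
class BoundedSpanQueue<S> {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition available = lock.newCondition();
  private final Object[] elements;
  private int count;

  BoundedSpanQueue(int maxSize) {
    this.elements = new Object[maxSize];
  }

  /** Called by span-finishing threads. Drops the span when the queue is full. */
  boolean offer(S next) {
    lock.lock(); // the contention point: every finished span acquires this
    try {
      if (count == elements.length) return false; // bounded: drop, don't block
      elements[count++] = next;
      available.signal(); // wake the flusher if it is parked in awaitNanos
      return true;
    } finally {
      lock.unlock();
    }
  }

  /** Called by the flusher. Waits up to nanos for data, then drains everything. */
  @SuppressWarnings("unchecked")
  int drainTo(Consumer<S> consumer, long nanos) throws InterruptedException {
    lock.lock();
    try {
      // awaitNanos releases the lock while waiting, then reacquires it
      if (count == 0) available.awaitNanos(nanos);
      int drained = count;
      for (int i = 0; i < count; i++) {
        consumer.accept((S) elements[i]);
        elements[i] = null;
      }
      count = 0;
      return drained;
    } finally {
      lock.unlock();
    }
  }
}
```

The key property for what follows: both the producer path (offer) and the consumer path (drainTo) funnel through the same ReentrantLock, and the flusher's awaitNanos release-and-reacquire cycle puts it back into that lock's wait queue alongside the producers.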

The lock the Netflix incident surfaced

The CountBoundedQueue's ReentrantLock is the structural contention point in Netflix's 2024-07-29 VT-pinning bug:

  • Every span finishing via the Brave path calls CountBoundedQueue.offer, which acquires the lock.
  • The AsyncReporter$Flusher holds the lock while draining, releases it via Condition.awaitNanos, and reacquires it after the wait.
  • If callers to offer run on virtual threads inside a synchronized block, those VTs get pinned to their carrier threads while blocking on ReentrantLock.lock(). On a 4-vCPU host, 4 such pinned VTs exhaust all carrier threads.

Zipkin Reporter itself is not at fault — the ReentrantLock is a correct, efficient primitive. The pinning is a property of callers that block on it from inside a synchronized block while running on a virtual thread.
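The caller-side hazard can be shown in isolation. This sketch (Java 21+; all names are illustrative, not from the Netflix code or Zipkin Reporter) contrasts the pinning pattern — a virtual thread blocking on a ReentrantLock while inside a synchronized block, which keeps the VT mounted on its carrier — with the safe pattern, where the VT unmounts while waiting:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class PinningSketch {
  // Stands in for CountBoundedQueue's internal lock.
  static final ReentrantLock reporterLock = new ReentrantLock();
  static final Object monitor = new Object();

  // Hazardous on a VT: the monitor frame prevents unmounting, so blocking
  // in reporterLock.lock() occupies the carrier thread (pinning, pre-JDK 24).
  static void pinnedOffer(Runnable report) {
    synchronized (monitor) {
      reporterLock.lock();
      try { report.run(); } finally { reporterLock.unlock(); }
    }
  }

  // Safe: no monitor held around the blocking acquire; a waiting VT unmounts
  // and frees its carrier for other virtual threads.
  static void unpinnedOffer(Runnable report) {
    reporterLock.lock();
    try { report.run(); } finally { reporterLock.unlock(); }
  }

  public static void main(String[] args) throws InterruptedException {
    CountDownLatch done = new CountDownLatch(2);
    Thread.ofVirtual().start(() -> pinnedOffer(done::countDown));
    Thread.ofVirtual().start(() -> unpinnedOffer(done::countDown));
    done.await();
    System.out.println("both offers completed");
  }
}
```

This demo completes because the lock is only briefly contended; the incident scenario is the same pattern under load, where enough simultaneously pinned VTs exhaust the carrier pool before any of them can acquire the lock. Running the pinned variant with -Djdk.tracePinnedThreads=full surfaces the pinning stack at runtime.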

Seen in

  • sources/2024-07-29-netflix-java-21-virtual-threads-dude-wheres-my-lock — Netflix Java 21 + Spring Boot 3 microservices: 4 pinned VTs and 1 non-pinned VT, plus the platform-thread AsyncReporter flusher, all waiting on the same CountBoundedQueue ReentrantLock. The flusher owned the lock, released it via awaitNanos, timed out, and the AQS FIFO queue placed it behind the pinned VTs. None could run — a fleet-wide starvation deadlock.