CONCEPT Cited by 1 source
Heap-dump lock introspection¶
Heap-dump lock introspection is the diagnostic technique of reading the on-heap state of a lock object to determine its current owner and waiter queue — a fallback for situations where the thread dump doesn't (or can't) report lock metadata.
When to use it¶
Use it when:
- The thread dump is exhausted as an evidence source — it shows N threads waiting on some lock, but no thread shows as the owner.
- The tooling has known limitations that drop lock metadata (Java 21's jcmd Thread.dump_to_file does this: no "- locked" lines, no "Locked ownable synchronizers" lines).
- Your working hypothesis is wrong and the evidence contradicts it: you need to inspect the lock object itself, not reason about who should own it.
How it works (Java specifically)¶
Java's ReentrantLock, ReentrantReadWriteLock, Semaphore,
CountDownLatch, and most other java.util.concurrent
synchronizers keep their state in an internal helper object
(the sync field in the JDK sources) that extends
AbstractQueuedSynchronizer (AQS). That AQS object is a regular
Java object on the heap, with fields:
- state: int/long that encodes ownership. For ReentrantLock: state > 0 means someone holds the lock with count state; 0 means free.
- exclusiveOwnerThread: reference to the Thread that currently holds the lock, or null.
- head / tail: FIFO queue of waiters.
- firstWaiter / lastWaiter (on ConditionObject): queue of threads parked on a Condition.
All of these are directly readable from a heap dump using Eclipse MAT (or VisualVM, YourKit, jol, etc.).
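To make the field list concrete, here is a minimal sketch rather than the JDK's own ReentrantLock internals. AqsStateDemo, its nested Mutex, and the describe helper are hypothetical names introduced for illustration; Mutex follows the style of the mutex example in the AQS Javadoc. The helper prints the three things you would read off the heap: state, exclusiveOwnerThread, and the number of queued waiters.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Hypothetical demo class; not part of the JDK or of the Netflix code.
class AqsStateDemo {

    // Minimal non-reentrant mutex: state 0 = free, state 1 = held.
    static final class Mutex extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int ignored) {
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int ignored) {
            setExclusiveOwnerThread(null);
            setState(0);
            return true;
        }

        // getState() and getExclusiveOwnerThread() are protected,
        // so expose them for inspection.
        int stateValue() { return getState(); }
        Thread owner() { return getExclusiveOwnerThread(); }
    }

    // Renders the same facts a heap-dump viewer would show as object fields.
    static String describe(Mutex m) {
        Thread owner = m.owner();
        return "state=" + m.stateValue()
            + " owner=" + (owner == null ? "null" : owner.getName())
            + " waiters=" + m.getQueueLength();
    }

    public static void main(String[] args) throws InterruptedException {
        Mutex m = new Mutex();
        System.out.println(describe(m));   // free: no owner, no waiters

        m.acquire(1);                      // current thread becomes the owner
        Thread waiter = new Thread(() -> { m.acquire(1); m.release(1); }, "waiter-1");
        waiter.start();
        while (!m.hasQueuedThreads()) {    // wait until waiter-1 is enqueued
            Thread.sleep(1);
        }
        System.out.println(describe(m));   // held, with one queued waiter

        m.release(1);                      // hand the mutex to waiter-1
        waiter.join();
        System.out.println(describe(m));   // free again
    }
}
```

In a heap dump the same values are not printed by anything; they sit as plain fields on the lock's internal AQS object, which is why a viewer such as Eclipse MAT can show them even when the thread dump says nothing.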
The technique¶
- Take a heap dump: jcmd <pid> GC.heap_dump /tmp/heap.hprof (or jmap). Pair with the jcmd thread dump so you can cross-reference.
- Open the heap in Eclipse MAT.
- Identify the lock object. Easiest path: find a known waiter thread, walk its stack-local references back to the ReentrantLock instance. (Netflix did exactly this via the AsyncReporter thread's stack.)
- Inspect the AQS state fields (state, exclusiveOwnerThread, waiter queue).
- Cross-reference thread IDs in the waiter queue with the thread-dump stack traces to reconstruct the full picture.
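If running jcmd against the process is awkward (for example, inside a locked-down container), the first step can also be triggered from inside the JVM. The sketch below assumes a HotSpot-based JDK, where com.sun.management.HotSpotDiagnosticMXBean is available; HeapDumper and its dump method are hypothetical names, and this is an alternative to the jcmd command above, not what Netflix describes doing.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; assumes a HotSpot JVM.
class HeapDumper {

    // Writes an .hprof file to `target`, which must not already exist.
    // liveOnly = true dumps only reachable objects, which is enough here:
    // a lock that threads are waiting on is reachable by definition.
    static Path dump(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heap-dump-demo");
        Path file = dump(dir.resolve("heap.hprof"), true);
        System.out.println("wrote " + file + " (" + Files.size(file) + " bytes)");
    }
}
```

The resulting file opens in Eclipse MAT like any other heap dump. Capture the thread dump as close in time as possible so that the thread IDs in both artifacts describe the same moment.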
The Netflix 2024-07-29 application¶
"Finding the lock in the heap dump was relatively straightforward. Using the excellent Eclipse MAT tool, we examined the objects on the stack of the
AsyncReporter non-virtual thread to identify the lock object. Reasoning about the current state of the lock was perhaps the trickiest part of our investigation. Most of the relevant code can be found in the AbstractQueuedSynchronizer.java. While we don't claim to fully understand the inner workings of it, we reverse-engineered enough of it to match against what we see in the heap dump." (Source: sources/2024-07-29-netflix-java-21-virtual-threads-dude-wheres-my-lock)
Netflix confirmed from the heap:
- exclusiveOwnerThread == null — no current owner.
- The waiter queue contained 6 threads (4 pinned VTs + 1
non-pinned VT + 1 platform AsyncReporter flusher).
- The Condition's internal queue showed the flusher had
released via awaitNanos and been requeued for the lock.
Taken together, this evidence forced the conclusion: the
flusher was the most recent owner, it released the lock via
awaitNanos, and the FIFO wait queue in AQS placed it behind
the pinned VTs, making the AQS queue the de facto structural
lock on forward progress.
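The "released via awaitNanos, then requeued for the lock" mechanic can be observed on a live JVM. The following is a small sketch under stated assumptions, not a reproduction of the Netflix deadlock: it uses only platform threads and no pinning. AwaitRequeueDemo, InspectableLock, and Snapshot are hypothetical names; the subclass exists only because ReentrantLock keeps getOwner() protected.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo; platform threads only, so it cannot deadlock like the Netflix case.
class AwaitRequeueDemo {

    // Widens access to the protected ReentrantLock.getOwner().
    static final class InspectableLock extends ReentrantLock {
        Thread owner() { return getOwner(); }
    }

    // Observations taken at three points in time.
    static final class Snapshot {
        Thread ownerWhileFlusherAwaits;       // expected: null
        boolean flusherOnConditionQueue;      // expected: true while awaiting
        boolean flusherRequeuedForLock;       // expected: true after the timeout
        boolean flusherStillOnConditionQueue; // expected: false after the timeout
    }

    static Snapshot run() throws InterruptedException {
        InspectableLock lock = new InspectableLock();
        Condition batchReady = lock.newCondition();
        CountDownLatch holding = new CountDownLatch(1);

        Thread flusher = new Thread(() -> {
            lock.lock();
            try {
                holding.countDown();
                // Releases the lock, parks, then must reacquire it on timeout.
                batchReady.awaitNanos(TimeUnit.MILLISECONDS.toNanos(300));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        }, "flusher");
        flusher.start();
        holding.await();

        Snapshot s = new Snapshot();
        while (lock.isLocked()) {             // wait for awaitNanos to release
            Thread.onSpinWait();
        }
        s.ownerWhileFlusherAwaits = lock.owner();

        lock.lock();                          // main now owns the lock
        try {
            s.flusherOnConditionQueue = lock.hasWaiters(batchReady);
            while (!lock.hasQueuedThread(flusher)) {
                Thread.sleep(5);              // wait for the 300 ms timeout
            }
            s.flusherRequeuedForLock = lock.hasQueuedThread(flusher);
            s.flusherStillOnConditionQueue = lock.hasWaiters(batchReady);
        } finally {
            lock.unlock();                    // lets the flusher reacquire and exit
        }
        flusher.join();
        return s;
    }

    public static void main(String[] args) throws InterruptedException {
        Snapshot s = run();
        System.out.println("owner while flusher awaits: " + s.ownerWhileFlusherAwaits);
        System.out.println("flusher on condition queue: " + s.flusherOnConditionQueue);
        System.out.println("flusher requeued for lock:  " + s.flusherRequeuedForLock);
        System.out.println("still on condition queue:   " + s.flusherStillOnConditionQueue);
    }
}
```

In this sketch the main thread owns the lock while the flusher sits in the lock queue, so progress resumes as soon as main unlocks. In the Netflix heap dump the lock had no owner at all while threads were still queued, which is the combination that pointed at waiters being unable to run.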
Related technique: core-dump lock introspection¶
Similar in spirit for Rust / C++ / Go: take a core dump,
inspect the lock object's bytes in gdb / dlv / LLDB.
See sources/2025-05-28-flyio-parking-lot-ffffffffffffffff
where Fly.io used exactly this on parking_lot's 64-bit
lock word to identify a
bitwise double-free bug.
Seen in¶
- sources/2024-07-29-netflix-java-21-virtual-threads-dude-wheres-my-lock
— Canonical wiki instance. Netflix diagnosed a
VT-pinning-caused starvation deadlock by reverse-engineering
AQS state from a heap dump after
jcmd-generated thread dumps omitted lock metadata. The investigation combined Eclipse MAT with a reading of the AQS source.
Related¶
- concepts/jcmd-thread-dump — The primary tool that, when insufficient, forces a fallback to this technique.
- concepts/virtual-thread-pinning — The bug class that surfaced the need for this technique at Netflix.
- patterns/diagnose-via-heap-dump-lock-introspection — The operational pattern.
- companies/netflix — Canonical Java application instance.