PATTERN Cited by 1 source

RAII-to-explicit-closure for lock visibility¶

Problem¶

Rust's default lock-guard idiom uses RAII — acquire the guard in a local, release implicitly when the scope ends. The lock's hold interval is determined by scope boundaries, which can be surprising:

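Temporaries in the same statement keep the guard alive until the end of the enclosing statement, not just the sub-expression that created it. In editions before Rust 2024, an if let over a read guard therefore still holds the lock inside its else arm.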
Blocks that declare multiple guards can hold them in interleaved orders that aren't obvious from reading.
Instrumenting how long a critical section was held requires attaching lifecycle hooks to the guard — awkward since it's a short-lived local.
Reviewing a PR to check that a newly added branch doesn't re-acquire the same lock requires tracing the guard's exact drop point through the whole function.
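
The temporary-lifetime surprise can be reproduced in a few lines. The sketch below uses std::sync::RwLock and a match expression, whose scrutinee temporaries follow this rule in every edition; if let behaves the same way in editions before 2024. The function names and the Catalog alias are illustrative only, not taken from the source.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical stand-in for a catalog protected by a reader-writer lock.
type Catalog = RwLock<HashMap<u32, String>>;

/// Reports whether the lock is still read-held when the fallback arm runs.
/// The read guard is a temporary in the `match` scrutinee, so it lives
/// until the end of the whole `match`, including the `None` arm.
fn fallback_with_temporary_guard(catalog: &Catalog, key: u32) -> bool {
    let still_held = match catalog.read().unwrap().get(&key) {
        Some(_) => false,
        // try_write fails without blocking while a read guard is alive.
        None => catalog.try_write().is_err(),
    };
    still_held
}

/// Same lookup, but the guard is a temporary of a `let` statement and is
/// dropped at that statement's semicolon, before the fallback runs.
fn fallback_after_guard_dropped(catalog: &Catalog, key: u32) -> bool {
    let found = catalog.read().unwrap().contains_key(&key);
    if found {
        false
    } else {
        catalog.try_write().is_err()
    }
}
```

With an empty catalog, the first function reports that the lock is still held in the fallback arm and the second reports that it is free. A real fallback arm would more likely call write() than try_write(), in which case the thread would block on its own read guard.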

Pattern¶

Replace RAII lock-guard locals with explicit closures that take the critical-section body as an argument:

// Before (RAII):
fn fast_lookup(&self, key: Key) -> Option<Value> {
    let guard = self.catalog.read();     // lock held here
    guard.get(&key).cloned()             // ...through here
}                                        // released here (end of scope)

// After (explicit closure):
fn fast_lookup(&self, key: Key) -> Option<Value> {
    self.catalog.with_read(|cat| {       // lock acquired
        cat.get(&key).cloned()
    })                                   // released the moment the closure returns
}

with_read / with_write / with_write_for are lock helpers that acquire the lock, run the closure, and release the lock as soon as the closure returns. The critical section is exactly the closure body, which makes it textually visible to the reader.
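
The source does not show how these helpers are written, so the following is only a minimal sketch under stated assumptions. Guarded and LockTimeout are hypothetical names, and the wrapper is built on std::sync::RwLock. Because std has no timed write lock, with_write_for polls try_write until a deadline; with parking_lot, a call to its try_write_for would take the place of that loop.

```rust
use std::sync::{RwLock, TryLockError};
use std::time::{Duration, Instant};

/// Hypothetical wrapper that only exposes closure-based access.
pub struct Guarded<T> {
    inner: RwLock<T>,
}

/// Hypothetical error returned when the write lock is not acquired in time.
#[derive(Debug, PartialEq)]
pub struct LockTimeout;

impl<T> Guarded<T> {
    pub fn new(value: T) -> Self {
        Self { inner: RwLock::new(value) }
    }

    /// Holds the read lock for exactly the duration of `f`.
    pub fn with_read<R>(&self, f: impl FnOnce(&T) -> R) -> R {
        let guard = self.inner.read().unwrap();
        f(&*guard)
    }

    /// Holds the write lock for exactly the duration of `f`.
    pub fn with_write<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        let mut guard = self.inner.write().unwrap();
        f(&mut *guard)
    }

    /// Tries to take the write lock until `timeout` elapses, then gives up.
    /// Polling stands in for a real timed lock, which std does not provide.
    pub fn with_write_for<R>(
        &self,
        timeout: Duration,
        f: impl FnOnce(&mut T) -> R,
    ) -> Result<R, LockTimeout> {
        let deadline = Instant::now() + timeout;
        loop {
            match self.inner.try_write() {
                Ok(mut guard) => return Ok(f(&mut *guard)),
                // A poisoned lock is still acquired; recover the guard.
                Err(TryLockError::Poisoned(e)) => return Ok(f(&mut *e.into_inner())),
                Err(TryLockError::WouldBlock) if Instant::now() >= deadline => {
                    return Err(LockTimeout)
                }
                Err(TryLockError::WouldBlock) => std::thread::sleep(Duration::from_millis(1)),
            }
        }
    }
}
```

Because the guard never leaves the helper, a call site has no way to extend the hold interval beyond the closure it passes in.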

Benefits¶

Scope is obvious. The closure body is the critical section; nothing outside it holds the lock.
Instrumentable at the helper. Wrap with timing, tracing, labeled logging, current-holder tracking — all without touching the call sites.
Re-entrance bugs shrink. You can't accidentally hold the lock across an else arm (the if-let-lock-scope-bug) because the closure body is the scope.
Pairs naturally with bounded lock timeouts. A with_write_for(Duration, closure) variant produces a Result per call, enabling patterns/lock-timeout-for-contention-telemetry.
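Last-holder tracking. The helper can stash std::thread::current().id() plus call-site context into a thread-safe slot on lock acquisition for post-mortem forensics. The slot has to be a Mutex or an atomic, because a plain Cell is not Sync and cannot be shared between threads.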
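
To make the instrumentation points concrete, here is a hedged sketch of a write helper that labels each call, times the hold, logs slow holds, and records the current and last holder. Instrumented, Holder, the slow threshold, and the log format are all assumptions made for illustration; the source describes these capabilities but not their implementation.

```rust
use std::sync::{Mutex, RwLock};
use std::thread::{self, ThreadId};
use std::time::{Duration, Instant};

/// Hypothetical record of who holds (or last held) the write lock.
#[derive(Clone, Debug, PartialEq)]
pub struct Holder {
    pub label: &'static str,
    pub thread: ThreadId,
}

/// Hypothetical instrumented wrapper around a reader-writer lock.
pub struct Instrumented<T> {
    inner: RwLock<T>,
    current: Mutex<Option<Holder>>,
    last: Mutex<Option<Holder>>,
    slow: Duration,
}

impl<T> Instrumented<T> {
    pub fn new(value: T, slow: Duration) -> Self {
        Self {
            inner: RwLock::new(value),
            current: Mutex::new(None),
            last: Mutex::new(None),
            slow,
        }
    }

    /// Runs `f` under the write lock and records label, thread, and hold time.
    pub fn with_write<R>(&self, label: &'static str, f: impl FnOnce(&mut T) -> R) -> R {
        let mut guard = self.inner.write().unwrap();
        *self.current.lock().unwrap() = Some(Holder { label, thread: thread::current().id() });
        let start = Instant::now();
        let out = f(&mut *guard);
        let held = start.elapsed();
        // Move the holder record from "current" to "last" before releasing.
        let finished = self.current.lock().unwrap().take();
        *self.last.lock().unwrap() = finished;
        drop(guard);
        if held >= self.slow {
            eprintln!("slow write [{label}]: held for {held:?}");
        }
        out
    }

    pub fn current_holder(&self) -> Option<Holder> {
        self.current.lock().unwrap().clone()
    }

    pub fn last_holder(&self) -> Option<Holder> {
        self.last.lock().unwrap().clone()
    }
}
```

Call sites only add a label, for example catalog.with_write("reload-app", |c| ...). If the closure never returns, current_holder() still names the stuck writer, which is the kind of information that would otherwise require a debugger.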

Costs¶

Syntactic overhead: every critical section wraps in a closure call. Closure type inference usually keeps this ergonomic, but a reference into the protected data cannot escape the closure, so code that borrowed from the guard into a longer-lived binding has to clone the value or move the dependent work inside the closure.
Generic parameter complexity: the helper has to be generic over the closure's return type and any error type involved.
Not useful for long-hold locks that span complex control flow — those need a different pattern (or a redesign of the hold duration).
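
The borrowing cost shows up as soon as a call site wants to keep a reference to the protected data. The sketch below assumes a free-standing with_read helper over std::sync::RwLock and a hypothetical app_name lookup; both are illustrative and not from the source.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Illustrative helper: the read lock is held only while `f` runs.
fn with_read<T, R>(lock: &RwLock<T>, f: impl FnOnce(&T) -> R) -> R {
    let guard = lock.read().unwrap();
    f(&*guard)
}

/// Hypothetical lookup that has to return an owned value.
fn app_name(catalog: &RwLock<HashMap<u32, String>>, id: u32) -> Option<String> {
    // Rejected by the borrow checker, because the returned reference would
    // outlive the guard that the helper drops when the closure returns:
    //     with_read(catalog, |apps| apps.get(&id))
    //
    // Cloning inside the closure gives the caller an owned value instead.
    with_read(catalog, |apps| apps.get(&id).cloned())
}
```

The clone is the price of the fixed hold interval. When the value is expensive to copy, the usual alternatives are to store it behind an Arc or to do the dependent work inside the closure.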

Canonical instance — Fly.io's fly-proxy Round-2 refactor¶

From the 2025-05-28 parking_lot post:

"Before rolling out a new lazy-loading fly-proxy, we do some refactoring: - our Catalog write locks all time out, so we'll get telemetry and a failure recovery path if that's what's choking the proxy to death, - we eliminate RAII-style lock acquisition from the Catalog code (in RAII style, you grab a lock into a local value, and the lock is released implicitly when the scope that expression occurs in concludes) and replace it with explicit closures, so you can look at the code and see precisely the interval in which the lock is held, and - since we now have code that is very explicit about locking intervals, we instrument it with logging and metrics so we can see what's happening." (Source: sources/2025-05-28-flyio-parking-lot-ffffffffffffffff)

The refactor was undertaken primarily to distinguish lock contention from a true deadlock (see concepts/deadlock-vs-lock-contention), but the visibility win outlasted the bug. Fly lists it among the permanent improvements that survived the 2025 incident:

"We audited all our catalog locking, got rid of all the iflets, and stopped relying on RAII lock guards. The resulting closure patterns gave us lock timing metrics, which will be useful dealing with future write contention. All writes are now labeled, and we emit labeled logs on slow writes, augmented with context data (like app IDs) where we have it. We also now track the last and current holder of the write lock, augmented with context information; next time we have a deadlock, we should have all the information we need to identify the actors without gdb stack traces."

Seen in¶

sources/2025-05-28-flyio-parking-lot-ffffffffffffffff — Canonical wiki instance. Refactor done under time pressure to enable contention telemetry; kept permanently because the visibility improvement stood on its own.

systems/fly-proxy — The codebase this pattern was applied to.
systems/parking-lot-rust — The lock library whose try_write_for this pattern pairs with.
concepts/if-let-lock-scope-bug — The bug class this pattern eliminates structurally.
concepts/deadlock-vs-lock-contention — The discrimination problem this pattern supports.
patterns/lock-timeout-for-contention-telemetry — The natural pairing for bounded-wait + instrumented critical sections.
companies/flyio — Fly.io.