
SYSTEM Cited by 2 sources

BOLT (Meta Binary Optimizer)

BOLT (Binary Optimization and Layout Tool) is Meta's open-source post-compile binary optimiser that rewrites an already-compiled binary using runtime profile data. It was published as "BOLT: A practical binary optimizer for data centers and beyond" (CGO 2019) and is now part of LLVM: github.com/llvm/llvm-project/tree/main/bolt.

This page covers the Meta-tooling-chain perspective on BOLT — how it fits the Strobelight → FDO → capacity-savings pipeline that runs across Meta's fleet. For the LLVM / upstream-project perspective, see systems/llvm-bolt.

Role on this wiki

  • The post-compile-time consumer in Meta's fleet-scale FDO pipeline. Continuously-collected LBR profiles from the Strobelight LBR profiler are turned into FDO profiles consumed by:
    • Compile time: CSSPGO (Context-Sensitive Sample-based Profile-Guided Optimization) in LLVM.
    • Post-compile time: BOLT, which rewrites binaries in-place.
  • Capacity impact: on Meta's 200 largest services, the combined pipeline delivers "up to 20% reduction in CPU cycles, which equates to a 10-20% reduction in the number of servers needed to run these services at Meta." This is the economic datum that pays for Strobelight as a platform.
  • Canonical wiki reference for the post-link FDO architectural pattern.
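
The relationship between the quoted cycle reduction and the quoted server reduction can be illustrated with a back-of-the-envelope sketch. The linear model and the `cpu_bound_fraction` parameter below are assumptions made for illustration only; the source does not describe how Meta translates cycles into server counts.

```python
import math


def servers_after_fdo(servers_before, cycle_reduction, cpu_bound_fraction=1.0):
    """Estimate the servers needed after an FDO-driven CPU-cycle reduction.

    Assumption (not from the source): server count scales linearly with CPU
    cycles for the CPU-bound share of the fleet and is unchanged for the rest.
    """
    if not 0.0 <= cycle_reduction < 1.0:
        raise ValueError("cycle_reduction must be in [0, 1)")
    if not 0.0 <= cpu_bound_fraction <= 1.0:
        raise ValueError("cpu_bound_fraction must be in [0, 1]")
    remaining = 1.0 - cycle_reduction * cpu_bound_fraction
    # Round before taking the ceiling so float noise cannot add a server.
    return math.ceil(round(servers_before * remaining, 6))


if __name__ == "__main__":
    # Hypothetical fleet of 1000 servers with a 20% cycle reduction.
    print(servers_after_fdo(1000, 0.20))        # fully CPU-bound fleet
    print(servers_after_fdo(1000, 0.20, 0.5))   # half the capacity is CPU-bound
```

Under this toy model, a fully CPU-bound service converts a 20% cycle reduction into a 20% server reduction, while a service that is only partly CPU-bound sees less, which is one plausible reading of the quoted 10-20% range.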

Outside-Meta adoption: the Redpanda 2026-04-02 evaluation

This is the first wiki-canonical disclosure of a non-Meta team evaluating and rejecting BOLT (Source: sources/2026-04-02-redpanda-supercharging-streaming-with-profile-guided-optimization). Redpanda evaluated BOLT alongside PGO for the 26.1 C++ streaming-broker build pipeline and chose PGO, citing stability:

"BOLT's approach to operating on the binary directly avoids an extra compilation step, potentially saving significant build time. This can be especially important for larger projects like Redpanda Streaming. At the same time, its binary-modifying nature is quite brittle, and we ran into a few bugs (like this one)."

"Granted, PGO is a proven and widely deployed technology, so with this in mind and considering some outstanding BOLT bugs, we decided to stick with PGO."

Performance-wise, Redpanda found BOLT "to show improvements similar to PGO. Most of the time, it came in just slightly behind." Combining PGO + BOLT gave "another small bump in performance" — the post preserves the possibility of "adding BOLT on top of PGO at some point."

This is the canonical wiki asymmetry-of-BOLT-adoption datum: at Meta scale the LLVM-expertise headcount absorbs the brittleness risk; at Tier-3-vendor scale the brittleness becomes a deployment-blocker.

How it works (architectural sketch)

  1. Profile collection — sampling (LBR via perf / Strobelight) or instrumentation (BOLT injects instructions into the linked binary directly, without recompilation).
  2. Profile conversion — LBR / perf data converted to BOLT's .fdata via perf2bolt.
  3. Binary rewrite: llvm-bolt reads the binary + profile and emits an optimised binary.
  4. Optional heatmap emission — BOLT provides a tool that generates per-12-KiB-block code-access heatmaps. Used by Redpanda 2026-04-02 to visualise PGO's layout impact.
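
The sampling path through steps 1 to 3 can be sketched as a small Python helper that assembles the three command lines. The flags follow the usage shown in the upstream BOLT README and may differ between LLVM versions; the binary name `./broker`, its `--smoke-test` argument, and the helper function names are hypothetical. This is a sketch of one possible invocation, not a description of Meta's or Redpanda's actual pipeline.

```python
import shlex


def perf_record_cmd(binary, args=(), perf_data="perf.data"):
    """Step 1: sample user-space cycles with branch-stack (LBR) records."""
    return ["perf", "record", "-e", "cycles:u", "-j", "any,u",
            "-o", perf_data, "--", binary, *args]


def perf2bolt_cmd(binary, perf_data="perf.data", fdata="perf.fdata"):
    """Step 2: convert perf data into BOLT's .fdata profile format."""
    return ["perf2bolt", "-p", perf_data, "-o", fdata, binary]


def llvm_bolt_cmd(binary, fdata="perf.fdata", output=None):
    """Step 3: rewrite the binary using the profile.

    The optimisation flags are the ones suggested in the BOLT README;
    treat the exact set as an assumption to adjust per LLVM version.
    """
    output = output or f"{binary}.bolt"
    return ["llvm-bolt", binary, "-o", output, f"-data={fdata}",
            "-reorder-blocks=ext-tsp", "-reorder-functions=hfsort",
            "-split-functions"]


def bolt_pipeline(binary, args=()):
    """Return the three commands in the order they would be run."""
    return [perf_record_cmd(binary, args),
            perf2bolt_cmd(binary),
            llvm_bolt_cmd(binary)]


if __name__ == "__main__":
    # Print the commands rather than executing them, so the sketch runs
    # on machines without perf or BOLT installed.
    for cmd in bolt_pipeline("./broker", ["--smoke-test"]):
        print(shlex.join(cmd))
```

The helper only prints the commands; passing each list to `subprocess.run(cmd, check=True)` would execute them. The BOLT README also recommends linking the input binary with relocations preserved (`-Wl,--emit-relocs`) so that function reordering is available.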

Brittleness and bugs

BOLT's post-link instruction-injection — modifying a linked binary's control-flow graph without compiler-semantic invariants available for verification — is the source of its stability tax. Redpanda hit llvm-project#169899 and noted "some outstanding BOLT bugs" as the decisive factor against BOLT for 26.1.

At Meta, the LLVM-expertise bandwidth to diagnose and upstream fixes is in-house; outside Meta this becomes operational tail-risk that smaller teams can't absorb.

Seen in
