LLVM BOLT

LLVM BOLT (Binary Optimization and Layout Tool) is the upstream version of the post-link binary optimiser developed at Meta and described in a CGO 2019 paper. It is now maintained as part of the LLVM project, and the source lives at github.com/llvm/llvm-project/tree/main/bolt.

This page covers the LLVM-project perspective on BOLT — upstream availability, outside-Meta adoption patterns, and the brittleness caveats encountered by non-Meta users. For the Meta-tooling-chain perspective (Meta fleet, Strobelight integration, CSSPGO composition), see systems/meta-bolt-binary-optimizer.

Role on this wiki

  • Canonical consumer of post-link FDO profiles — BOLT rewrites already-compiled, already-linked binaries using profile data without requiring recompilation.
  • Architectural counterpart to PGO — the same optimisation family (code layout, hot-cold splitting), different position in the build pipeline.
  • Open-source reference implementation of the post-link FDO pattern — other tools (Google's Propeller, Microsoft's Propeller-adjacent experiments) build on similar principles.

Outside-Meta adoption

Redpanda's 2026-04-02 disclosure (Source: sources/2026-04-02-redpanda-supercharging-streaming-with-profile-guided-optimization) is this wiki's canonical example of a non-Meta BOLT adoption attempt:

  • Evaluated for Redpanda Streaming 26.1.
  • Rejected in favour of PGO on stability grounds: "its binary-modifying nature is quite brittle, and we ran into a few bugs (like this one)."
  • Performance-wise, Redpanda found "BOLT to show improvements similar to PGO. Most of the time, it came in just slightly behind."
  • Noted that BOLT is complementary to PGO rather than a replacement: "Many combine PGO and BOLT for the best performance, and we've seen this during our own tests. (We'll likely return to adding BOLT on top of PGO at some point.)"

Key takeaway: BOLT wins are real but the engineering tail-risk at non-Meta scale is material. Teams without LLVM expertise in-house should start with PGO.

What BOLT provides

As the LLVM reference, BOLT implements:

  • Function reordering — hot functions co-located in the binary, cold functions pushed to the end.
  • Basic-block reordering — hot fall-through paths laid out contiguously.
  • Hot-cold function splitting — rarely-executed code moved to .text.cold.
  • Indirect call promotion — hot indirect-call sites converted to speculative direct calls with runtime guards.
  • Binary heatmap visualisation tool — emits a per-12-KiB code-access heatmap that visualises hot-code packing density before / after optimisation. Used by Redpanda 2026-04-02 to illustrate PGO's layout impact.
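
The layout passes listed above can be illustrated with a toy model. The sketch below is not BOLT's algorithm (the real passes work on the call graph and control-flow graph, not on a flat list); it only shows the basic idea of ordering functions by profile weight and moving unsampled ones to a separate cold region. The function names, sample counts, and the cold_threshold parameter are hypothetical.

```python
from typing import Dict, List, Tuple


def layout_by_hotness(
    samples: Dict[str, int], cold_threshold: int = 1
) -> Tuple[List[str], List[str]]:
    """Return (hot, cold) function orderings from per-function sample counts.

    Toy illustration only: hot functions are sorted hottest-first so they
    end up adjacent, and functions with fewer than `cold_threshold`
    samples go to a separate cold list (analogous to a cold section).
    """
    # Sort by descending sample count; break ties by name for determinism.
    ranked = sorted(samples.items(), key=lambda kv: (-kv[1], kv[0]))
    hot = [name for name, count in ranked if count >= cold_threshold]
    cold = [name for name, count in ranked if count < cold_threshold]
    return hot, cold


# Hypothetical profile: number of samples attributed to each function.
profile = {"parse": 120, "log_error": 0, "dispatch": 900, "init": 3, "usage": 0}
hot, cold = layout_by_hotness(profile, cold_threshold=1)
print("hot :", hot)
print("cold:", cold)
```

Packing the hot list contiguously is what improves instruction-cache and iTLB locality; the cold list corresponds to code that can be moved out of the way.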

Profile-collection modes

  • Sampling mode — uses Linux perf LBR (Last Branch Record) data collected from unchanged production binaries. No instrumentation is added to the binary, so the only overhead is that of perf sampling itself; coverage is statistical. Canonical fleet-scale shape at Meta via Strobelight.
  • Instrumented mode — BOLT injects instrumentation into the already-linked binary (no recompilation). Deterministic coverage but injection is the source of much of BOLT's brittleness.
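
A minimal sketch of how the two collection modes differ at the command line, written as Python helpers that only build argument lists and do not run anything. The flags follow the invocations shown in BOLT's own README as far as I recall them (perf record with -j any,u for branch sampling, llvm-bolt -instrument for injection); treat them as assumptions to check against your installed versions. The helper names and the ./app paths are hypothetical.

```python
import shlex
from typing import List


def sampling_collect_cmd(binary: str, perf_data: str = "perf.data") -> List[str]:
    """perf invocation that samples taken branches (LBR) in user space.

    The target binary is run unmodified; perf writes raw samples to
    `perf_data`, which still has to be converted before BOLT can use it.
    """
    return ["perf", "record", "-e", "cycles:u", "-j", "any,u",
            "-o", perf_data, "--", binary]


def instrument_cmd(binary: str, instrumented: str) -> List[str]:
    """llvm-bolt invocation that writes an instrumented copy of `binary`.

    Running the instrumented copy under a representative workload then
    produces a BOLT profile directly, with no perf step involved.
    """
    return ["llvm-bolt", binary, "-instrument", "-o", instrumented]


# Hypothetical paths, printed rather than executed.
sampling = sampling_collect_cmd("./app")
instrumented = instrument_cmd("./app", "./app.instrumented")
print(shlex.join(sampling))
print(shlex.join(instrumented))
```

The asymmetry is visible in the commands themselves: sampling leaves the binary alone and observes it from outside, while instrumentation asks llvm-bolt to rewrite the binary before any profile exists.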

See concepts/instrumented-vs-sampling-profile.

Wire format

BOLT consumes its own profile format (.fdata) but can convert from:

  • Linux perf data via perf2bolt.
  • LLVM's .profdata sampling-format via tooling bridges.
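
The perf-to-.fdata path can be sketched as two steps: convert the perf recording, then feed the result to llvm-bolt. The helpers below only build argument lists. The flags are taken from the example invocations in BOLT's README as I recall them and should be treated as assumptions to verify against `llvm-bolt --help` for your version; the helper names and file paths are hypothetical.

```python
import shlex
from typing import List


def perf2bolt_cmd(binary: str, perf_data: str, fdata: str) -> List[str]:
    """Convert a perf recording into BOLT's .fdata profile for `binary`."""
    return ["perf2bolt", "-p", perf_data, "-o", fdata, binary]


def bolt_optimize_cmd(binary: str, fdata: str, output: str) -> List[str]:
    """Rewrite `binary` into `output` using the profile in `fdata`.

    The optimisation flags mirror a commonly documented combination:
    basic-block reordering, function reordering, and function splitting,
    plus a statistics report.
    """
    return [
        "llvm-bolt", binary,
        "-o", output,
        f"-data={fdata}",
        "-reorder-blocks=ext-tsp",
        "-reorder-functions=hfsort",
        "-split-functions",
        "-dyno-stats",
    ]


# Hypothetical paths, printed rather than executed.
convert = perf2bolt_cmd("./app", "perf.data", "app.fdata")
optimize = bolt_optimize_cmd("./app", "app.fdata", "./app.bolt")
for cmd in (convert, optimize):
    print(shlex.join(cmd))
```

Keeping the conversion as a separate step means the same perf recording can be archived and reconverted later, as long as it is matched against the exact binary it was collected from.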

Availability

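  • Part of the upstream LLVM monorepo since early 2022 (first included in the LLVM 14 release cycle). BOLT is built as an optional LLVM project, so whether it ships with a given LLVM package depends on the distributor.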
  • Linux x86_64 support first; ARM64 and others lag.
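  • Binaries built with standard Clang / GCC are compatible, since BOLT works post-link. The binary needs an unstripped symbol table, and for the full set of layout optimisations it should be linked with relocations preserved (the --emit-relocs or -q linker flag).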
