
SYSTEM Cited by 2 sources

BoltDB

BoltDB (github.com/boltdb/bolt) is an embedded key-value store for Go, modelled after LMDB: a single B+tree in an mmap'd file, single-writer / many-reader, serializable transactions, no SQL. Minimal API surface — buckets, keys, values — and no query planner.
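The shape of that API can be sketched with the standard library alone. The block below is a simplified in-memory model, not Bolt's implementation: the `Store` and `Tx` types are hypothetical, there is no file, mmap, B+tree, or rollback, and a read/write lock stands in for Bolt's transaction machinery. It only illustrates the two ideas named above: all data lives in buckets of keys and values, and there is one writer at a time alongside any number of readers.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical in-memory stand-in for an embedded
// bucket/key/value database. Nothing here is persisted.
type Store struct {
	mu      sync.RWMutex
	buckets map[string]map[string][]byte
}

// Tx is the handle passed to transaction callbacks.
type Tx struct {
	s        *Store
	writable bool
}

func NewStore() *Store {
	return &Store{buckets: make(map[string]map[string][]byte)}
}

// Update runs fn as the only writer: the write lock excludes every
// other transaction for the duration of the callback.
func (s *Store) Update(fn func(tx *Tx) error) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	return fn(&Tx{s: s, writable: true})
}

// View runs fn read-only: many callbacks may hold the read lock at once.
func (s *Store) View(fn func(tx *Tx) error) error {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return fn(&Tx{s: s})
}

// Put stores a copy of value under key, creating the bucket if needed.
func (tx *Tx) Put(bucket, key string, value []byte) error {
	if !tx.writable {
		return fmt.Errorf("put %s/%s: transaction is read-only", bucket, key)
	}
	b, ok := tx.s.buckets[bucket]
	if !ok {
		b = make(map[string][]byte)
		tx.s.buckets[bucket] = b
	}
	b[key] = append([]byte(nil), value...)
	return nil
}

// Get returns the stored value, or nil if the bucket or key is absent.
func (tx *Tx) Get(bucket, key string) []byte {
	return tx.s.buckets[bucket][key]
}

func main() {
	db := NewStore()

	err := db.Update(func(tx *Tx) error {
		return tx.Put("machines", "m-1", []byte("started"))
	})
	fmt.Println("update error:", err)

	err = db.View(func(tx *Tx) error {
		fmt.Printf("m-1 = %s\n", tx.Get("machines", "m-1"))
		// Writes are rejected inside a read-only transaction.
		return tx.Put("machines", "m-1", []byte("stopped"))
	})
	fmt.Println("view error:", err)
}
```

The closure-per-transaction style mirrors how Bolt-family stores are normally driven; everything else about the model is an assumption made for brevity.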

Stub page — expand on future BoltDB-focused sources.

Role in the wiki

flyd state store at Fly.io

Fly.io's orchestrator flyd uses BoltDB to persist the steps of every in-flight FSM (start, create, cordon, migrate, …). See the 2024-07-30 Making Machines Move post for the architectural sketch: "flyd is a server for on-demand instances of durable finite state machines … with the transition steps recorded carefully in a BoltDB database."
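One way to picture "transition steps recorded carefully" is an append-only step log keyed so that entries sort in execution order, letting a restarted process work out what remains. The sketch below is an illustration only: `StepLog`, the key layout, the FSM ID, and the step names are all hypothetical and are not taken from flyd, and a plain map stands in for the database bucket.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// StepLog is a hypothetical append-only record of completed FSM steps.
// Keys have the form "<fsmID>/<zero-padded sequence>", so sorting the
// keys as strings yields the steps in the order they were recorded.
type StepLog struct {
	entries map[string]string // stand-in for a durable bucket
}

func NewStepLog() *StepLog {
	return &StepLog{entries: make(map[string]string)}
}

// Steps returns the recorded steps for one FSM, oldest first.
func (l *StepLog) Steps(fsmID string) []string {
	prefix := fsmID + "/"
	var keys []string
	for k := range l.entries {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	steps := make([]string, 0, len(keys))
	for _, k := range keys {
		steps = append(steps, l.entries[k])
	}
	return steps
}

// Record appends a completed step. The sequence number is derived from
// what is already stored, so no in-memory counter has to survive a restart.
func (l *StepLog) Record(fsmID, step string) {
	seq := len(l.Steps(fsmID))
	l.entries[fmt.Sprintf("%s/%06d", fsmID, seq)] = step
}

// Resume returns the part of plan that has not been recorded yet.
func (l *StepLog) Resume(fsmID string, plan []string) []string {
	done := len(l.Steps(fsmID))
	if done > len(plan) {
		done = len(plan)
	}
	return plan[done:]
}

func main() {
	// Hypothetical step names for a hypothetical migration FSM.
	plan := []string{"cordon-source", "copy-volume", "start-target"}

	sl := NewStepLog()
	sl.Record("migrate-42", plan[0])
	sl.Record("migrate-42", plan[1])

	// After a crash, the process would reopen the same log and continue
	// from the first step that was never recorded.
	fmt.Println("done:", sl.Steps("migrate-42"))
	fmt.Println("todo:", sl.Resume("migrate-42", plan))
}
```

Zero-padding the sequence number is what keeps string order equal to step order; a store that iterates keys in byte order, as Bolt does within a bucket, would then return the steps already sorted.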

JP Phillips's defence of Bolt-over-SQLite (2025-02-12)

In the exit interview, the engineer who built flyd restates the storage choice as deliberate and stands by it three years later:

Was I right that we should have used SQLite for flyd, or were you wrong to have used BoltDB?

I still believe Bolt was the right choice. I've never lost a second of sleep worried that someone is about to run a SQL update statement on a host, or across the whole fleet, and then mangled all our state data. And limiting the storage interface, by not using SQL, kept flyd's scope managed.

On the engine side of the platform, which is what flyd is, I still believe SQL is too powerful for what flyd does. (Source: sources/2025-02-12-flyio-the-exit-interview-jp-phillips)

The argument generalises: size a state store's query surface to the blast radius you can tolerate from a single ad-hoc query, not to feature breadth. A full-SQL store like SQLite permits an ad-hoc UPDATE statement that rewrites fleet state in one command; a bucket/key/value store forces every mutation through an explicit code path.

This is the inverse of the reasoning that picked SQLite for corrosion2 — corrosion wants generic queryability because its consumers want SQL on the read side. flyd's store is an engine store; a query surface is a liability.

The per-Fly-Machine SQLite counterfactual

In the same interview, JP floats an alternate design — one SQLite per Fly Machine — where the blast radius is one Machine instead of the fleet. He'd consider that trade, but names schema management across thousands of per-instance databases as the unsolved problem.

Seen in

Caveats / open questions

  • The original BoltDB repo has been archived since 2018; most production users are on forks, notably etcd's bbolt (go.etcd.io/bbolt). This source does not disclose which Go module Fly.io links against.
  • No fleet-level disclosure of BoltDB file size, write volume, or fsync cadence from flyd.