SYSTEM Cited by 1 source

NBD (Network Block Device)

NBD is the Linux kernel's native network block-device protocol, a simpler alternative to iSCSI. The kernel ships a built-in NBD client, and writing an NBD server is straightforward: you can do it "in an afternoon, on top of a file or a SQLite database or S3, and the Linux kernel could mount it as a drive" (Fly.io, 2024-07-30).
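
To make the "in an afternoon" claim concrete, here is a minimal sketch of the request-handling core of an NBD server. It assumes the simple-reply wire format from the NBD protocol specification and covers only the transmission phase. The handshake, socket handling, and structured replies are omitted. The function name handle_request and the in-memory bytearray export are hypothetical choices for illustration, not anything described in the Fly.io post.

```python
import struct

# Wire constants for the NBD transmission phase (simple replies only).
NBD_REQUEST_MAGIC = 0x25609513
NBD_SIMPLE_REPLY_MAGIC = 0x67446698
NBD_CMD_READ, NBD_CMD_WRITE, NBD_CMD_DISC, NBD_CMD_FLUSH = 0, 1, 2, 3
NBD_EINVAL = 22

# Request header: magic, command flags, type, cookie, offset, length (28 bytes).
REQUEST = struct.Struct(">IHHQQI")
# Simple reply header: magic, error, cookie (16 bytes).
REPLY = struct.Struct(">IIQ")


def handle_request(backing: bytearray, header: bytes, payload: bytes = b"") -> bytes:
    """Apply one NBD request to an in-memory export and return the reply bytes.

    Returns b"" for NBD_CMD_DISC, which has no reply.
    """
    magic, _flags, cmd, cookie, offset, length = REQUEST.unpack(header)
    if magic != NBD_REQUEST_MAGIC:
        raise ValueError("bad request magic")

    def reply(error: int, data: bytes = b"") -> bytes:
        return REPLY.pack(NBD_SIMPLE_REPLY_MAGIC, error, cookie) + data

    if cmd == NBD_CMD_DISC:
        return b""
    if cmd == NBD_CMD_FLUSH:
        return reply(0)  # nothing to flush for an in-memory export
    # Simplification: every invalid request maps to EINVAL.
    if cmd not in (NBD_CMD_READ, NBD_CMD_WRITE) or offset + length > len(backing):
        return reply(NBD_EINVAL)
    if cmd == NBD_CMD_READ:
        return reply(0, bytes(backing[offset:offset + length]))
    if len(payload) != length:
        return reply(NBD_EINVAL)
    backing[offset:offset + length] = payload
    return reply(0)
```

A real server would wrap this in a socket loop: read 28 header bytes, read `length` more bytes for a write, call the handler, and send the result. Swapping the bytearray for a file, a SQLite table, or an object store changes only the read and write branches.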

Role in Fly.io migrations (tried, then abandoned)

In Making Machines Move, Fly.io initially adopted NBD as the network block protocol for serving source Volumes to target workers during a clone-based migration. They switched to iSCSI after repeated production issues:

But we kept getting stuck nbd kernel threads when there was any kind of network disruption. We're a global public cloud; network disruption happens. Honestly, we could have debugged our way through this. But it was simpler just to spike out an iSCSI implementation, observe that didn't get jammed up when the network hiccuped, and move on.

The explicit framing is worth noting: Fly.io could have debugged the problem, but switching was simpler. The conventional playbook is to fix the dependency and upstream the patch. Fly.io did the opposite: it moved to an alternative that held up under network disruption and left the NBD issue for someone else to fix.

Seen in

  • Making Machines Move (Fly.io, 2024-07-30)

Caveats

  • The post doesn't detail how the NBD kernel threads got stuck — whether it was a protocol-state-machine issue, a timeout configuration problem, a specific bug in the kernel client, or something else.
  • This isn't a blanket indictment of NBD. Other production users (QEMU, distributed filesystems) use NBD successfully. Fly.io's case is specifically a globally distributed deployment, where network disruption is routine.