dm-crypt + LUKS2
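dm-crypt is the Linux kernel's device-mapper target for transparent
block-device encryption, commonly used for full-disk encryption.
LUKS2 (Linux Unified Key Setup, version 2) is the on-disk format that
sits between dm-crypt and the user: its header holds the key slots
(each storing the volume key encrypted under a passphrase-derived key)
and the metadata that describes the encrypted data segment. The
userland bridge between LUKS2 and dm-crypt is cryptsetup.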
LUKS2 header format reference: LUKS2 on-disk format spec.
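As a rough illustration of what that metadata looks like: the LUKS2 header stores its metadata as JSON, and each entry under `segments` records the byte `offset` at which encrypted data begins. The sketch below parses an abbreviated, hand-written sample (not taken from a real volume) and reads that offset. The helper name `data_offset_bytes` is hypothetical. Recent cryptsetup releases can print the real JSON with `cryptsetup luksDump --dump-json-metadata`, though flag availability depends on the installed version.

```python
import json

# Abbreviated, illustrative LUKS2 JSON metadata. Real headers carry more
# fields (digests, config, tokens, full keyslot descriptions).
SAMPLE_METADATA = """
{
  "keyslots": {"0": {"type": "luks2", "key_size": 64}},
  "segments": {
    "0": {
      "type": "crypt",
      "offset": "16777216",
      "size": "dynamic",
      "encryption": "aes-xts-plain64",
      "sector_size": 512
    }
  }
}
"""


def data_offset_bytes(metadata_json: str) -> int:
    """Return the byte offset where encrypted data starts.

    Hypothetical helper: reads the lowest-numbered segment only.
    """
    metadata = json.loads(metadata_json)
    segments = metadata["segments"]
    # Segment keys are decimal strings ("0", "1", ...), so sort numerically.
    first_key = min(segments, key=int)
    # LUKS2 stores 64-bit integers such as the offset as JSON strings.
    return int(segments[first_key]["offset"])


offset = data_offset_bytes(SAMPLE_METADATA)
print(f"data offset: {offset // (1024 * 1024)} MiB")
```

The offset is the value that matters for the header-size problem: it determines how much of the backing device is left for plaintext.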
Role in Fly.io
Fly.io encrypts every Fly Volume with its own per-volume key using
dm-crypt + LUKS2, so that "no one worker has a volume skeleton key."
The LUKS2 header-size problem
In Making Machines Move, Fly discloses a nasty heterogeneous-fleet edge case:
Two different workers, for cursed reasons, might be running different versions of cryptsetup, the userland bridge between LUKS2 and the kernel dm-crypt driver. There are (or were) two different versions of cryptsetup on our network, and they default to different LUKS2 header sizes — 4MiB and 16MiB. Implying two different plaintext volume sizes.
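The LUKS2 header occupies the start of the backing device, so for a
fixed backing size a larger header leaves less room for data. If the
target worker's cryptsetup assumes a different header size than the
source's, the target's decrypted view has a different plaintext size,
dm-clone creates a clone device of the wrong size, and the migration
breaks.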
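The size mismatch is simple arithmetic. The sketch below assumes the plaintext size is the backing device size minus the header size, and ignores sector alignment and any other reserved space. The 1 GiB backing size and the function name `plaintext_size` are made up for illustration. Only the 4MiB and 16MiB header sizes come from the source.

```python
MIB = 1024 * 1024


def plaintext_size(backing_bytes: int, header_bytes: int) -> int:
    """Bytes left for plaintext once the LUKS2 header is carved off.

    Simplifying assumption: plaintext = backing device - header.
    """
    if header_bytes >= backing_bytes:
        raise ValueError("header does not fit on the backing device")
    return backing_bytes - header_bytes


backing = 1024 * MIB  # hypothetical 1 GiB backing device on both workers

source_view = plaintext_size(backing, 4 * MIB)   # cryptsetup with 4MiB default
target_view = plaintext_size(backing, 16 * MIB)  # cryptsetup with 16MiB default

print(f"source plaintext: {source_view // MIB} MiB")
print(f"target plaintext: {target_view // MIB} MiB")
print(f"mismatch:         {(source_view - target_view) // MIB} MiB")
```

Under these assumptions the two views differ by 12 MiB, which is enough for a clone device sized from the target's defaults to disagree with the source volume.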
Fix: add a flyd
FSM RPC that carries the source's LUKS2 header metadata to the
target worker, so the target creates the clone with the correct
plaintext shape. "Not something we expected to have to build,
but, whatever."
This is the canonical instance of concepts/heterogeneous-fleet-config-skew.
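The shape of the fix can be sketched as a small payload that travels from source to target. Everything here is hypothetical: the source does not publish the flyd RPC schema, so `LuksHeaderInfo`, its fields, and `plan_clone` are invented names that only illustrate the idea of sizing the clone from the source's header metadata rather than from the target's local cryptsetup default.

```python
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass(frozen=True)
class LuksHeaderInfo:
    """Hypothetical RPC payload describing the source volume's layout."""

    backing_bytes: int  # size of the encrypted backing device
    header_bytes: int   # LUKS2 header size actually used on the source

    @property
    def plaintext_bytes(self) -> int:
        # Simplifying assumption: plaintext = backing device - header.
        return self.backing_bytes - self.header_bytes


def plan_clone(source: LuksHeaderInfo, local_default_header_bytes: int) -> dict:
    """Target-side planning: trust the source's metadata, not local defaults."""
    return {
        "header_bytes": source.header_bytes,
        "clone_bytes": source.plaintext_bytes,
        # True when the target's own default would have produced a
        # differently sized decrypted view.
        "default_would_mismatch": local_default_header_bytes != source.header_bytes,
    }


source_info = LuksHeaderInfo(backing_bytes=1024 * MIB, header_bytes=4 * MIB)
plan = plan_clone(source_info, local_default_header_bytes=16 * MIB)
print(plan)
```

The design point is that the header size becomes explicit data carried with the migration instead of an implicit property of whichever cryptsetup version a worker happens to run.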
Seen in
- sources/2024-07-30-flyio-making-machines-move — Anchor source; dm-crypt + LUKS2 are the encryption layer that makes migrations harder than they should be.
Related
- systems/cryptsetup — The userland bridge whose version skew causes LUKS2 header-size drift.
- systems/fly-volumes — What gets encrypted.
- systems/dm-clone — What runs on top of the decrypted view during migration.
- patterns/fsm-rpc-for-config-metadata-transfer — Fly.io's fix for the header-size skew.