dm-clone (Linux Device Mapper)¶
dm-clone is a Linux kernel device-mapper target that creates a
block-level asynchronous clone of a source block device. Given a
readable source device, it presents a new device of identical size
where:
- Reads of uninitialised (un-hydrated) blocks fall through to the source device.
- Reads of hydrated blocks are served locally from the clone.
- Writes go only to the clone.
- A background kcopyd thread hydrates blocks from source to clone independently of user I/O.
State is tracked in a separate metadata device (a bitmap of
"is this block hydrated yet?"). Upstream docs:
Documentation/admin-guide/device-mapper/dm-clone.rst.
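The "is this block hydrated yet?" bitmap can be pictured with a small userspace model. This is a minimal sketch, not the kernel's on-disk metadata format: the `hydration_map` type and its helper functions are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model: one bit per region, set once hydrated. */
struct hydration_map {
    unsigned long *words;
    size_t nr_regions;
};

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Allocate an all-zero map (a fresh clone: nothing hydrated yet).
 * Returns false if allocation fails. */
static bool hydration_map_init(struct hydration_map *m, size_t nr_regions)
{
    size_t nr_words = (nr_regions + BITS_PER_WORD - 1) / BITS_PER_WORD;

    m->words = calloc(nr_words ? nr_words : 1, sizeof(unsigned long));
    m->nr_regions = nr_regions;
    return m->words != NULL;
}

/* Record that a region now lives on the clone device. */
static void hydration_map_set(struct hydration_map *m, size_t region)
{
    assert(region < m->nr_regions);
    m->words[region / BITS_PER_WORD] |= 1UL << (region % BITS_PER_WORD);
}

/* True if the region can be served locally. */
static bool hydration_map_test(const struct hydration_map *m, size_t region)
{
    assert(region < m->nr_regions);
    return (m->words[region / BITS_PER_WORD] >> (region % BITS_PER_WORD)) & 1UL;
}

/* Count hydrated regions, e.g. to report hydration progress. */
static size_t hydration_map_count(const struct hydration_map *m)
{
    size_t n = 0;

    for (size_t i = 0; i < m->nr_regions; i++)
        n += hydration_map_test(m, i);
    return n;
}

static void hydration_map_free(struct hydration_map *m)
{
    free(m->words);
    m->words = NULL;
}
```

In this model the clone is fully independent of the source once `hydration_map_count` equals `nr_regions`.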
Map function (sketch)¶
Paraphrasing the kernel source quoted in the 2024-07-30 Fly.io post:
    region_nr = bio_to_region(clone, bio);
    if (dm_clone_is_region_hydrated(clone->cmd, region_nr)) {
        // We have the block locally.
        remap_and_issue(clone, bio);
        return 0;
    } else if (bio_data_dir(bio) == READ) {
        // Read miss; fall through to the source.
        remap_to_source(clone, bio);
        return 1;
    }
    // Write miss; write to the clone, kick hydration.
    remap_to_dest(clone, bio);
    hydrate_bio_region(clone, bio);
    return 0;
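The three-way decision in the sketch above can be isolated as a pure function, which makes it easy to reason about outside the kernel. This is a hedged userspace model: `route_bio`, `sector_to_region`, and the enum names are hypothetical, and mapping a sector to a region with a right shift assumes a power-of-two region size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum io_dir { IO_READ, IO_WRITE };

enum route {
    ROUTE_CLONE,             /* region hydrated: serve from the clone */
    ROUTE_SOURCE,            /* read miss: fall through to the source */
    ROUTE_CLONE_AND_HYDRATE  /* write miss: write to the clone, start hydration */
};

/* Map a 512-byte sector to its region, assuming the region size is
 * (1 << region_shift) sectors. */
static uint64_t sector_to_region(uint64_t sector, unsigned region_shift)
{
    assert(region_shift < 64);
    return sector >> region_shift;
}

/* Decide where an I/O goes, given whether its region is hydrated. */
static enum route route_bio(bool region_hydrated, enum io_dir dir)
{
    if (region_hydrated)
        return ROUTE_CLONE;
    if (dir == IO_READ)
        return ROUTE_SOURCE;
    return ROUTE_CLONE_AND_HYDRATE;
}
```

Only a read of an un-hydrated region ever touches the source, which is why the source device only needs to be readable.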
Role in Fly.io migrations¶
In the Making Machines Move post, dm-clone is the kernel-side half of
Fly's kill → clone → boot migration. The source device is the origin
Volume on the draining worker, attached to the target worker over a
network block protocol (iSCSI, after NBD was tried first). The clone
device is a fresh local volume on the target worker. A new Fly
Machine boots with the clone device attached; reads fall through
to the source over the network until kcopyd has hydrated the
relevant blocks.
"kill, clone, boot is fast; it can be made asymptotically as
fast as stateless migration."
TRIM / DISCARD short-circuit¶
Fly Volumes are typically very sparse. dm-clone supports
short-circuiting the hydration of unused blocks via DISCARD:
"A DISCARD issued on the clone device will get picked up by
dm-clone, which will simply short-circuit the read of the relevant
blocks by marking them as hydrated in the metadata volume."
To drive this: the target worker decrypts the source volume
(requires Fly to work out the LUKS2 plaintext shape — see
systems/dm-crypt-luks2), mounts the filesystem, runs fstrim,
and the filesystem-issued DISCARDs propagate through the
device-mapper stack to dm-clone, which marks those blocks
hydrated without pulling them over the network. Canonical
concepts/trim-discard-integration instance.
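A DISCARD arrives as a sector range, while hydration is tracked per region, so the range has to be converted to whole regions before anything is marked hydrated. The sketch below assumes that only regions fully covered by the discard are marked, since a partially covered region may still hold live data that must be copied. That rule is an assumption of this sketch, not something the post states, and `discard_to_regions` and `region_range` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

struct region_range {
    uint64_t first;  /* first fully covered region */
    uint64_t count;  /* number of fully covered regions (may be 0) */
};

/* Convert a discard of nr_sectors starting at start_sector into the
 * regions it covers completely. region_sectors must be a power of two. */
static struct region_range discard_to_regions(uint64_t start_sector,
                                              uint64_t nr_sectors,
                                              uint64_t region_sectors)
{
    struct region_range r = { 0, 0 };
    uint64_t first, end;

    assert(region_sectors != 0 &&
           (region_sectors & (region_sectors - 1)) == 0);

    /* Round the start up and the end down to region boundaries. */
    first = (start_sector + region_sectors - 1) / region_sectors;
    end = (start_sector + nr_sectors) / region_sectors;

    r.first = first;
    if (end > first)
        r.count = end - first;
    return r;
}
```

With 8-sector regions, a discard of sectors 4 through 23 covers regions 1 and 2 completely, while a discard of sectors 4 through 11 covers no whole region and would mark nothing.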
Seen in¶
- sources/2024-07-30-flyio-making-machines-move — dm-clone powers Fly's fleet-drain migration for stateful Machines. The canonical wiki source for dm-clone.
Caveats¶
- dm-clone needs a metadata device sized for the clone volume's block count; the post does not cover sizing policy.
- Hydration rate and write-performance characteristics on a partially-hydrated clone are not published by Fly in this post.
- The post doesn't discuss hydration-policy tuning (throughput pacing, per-region concurrent hydrations, priority during user I/O bursts).
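Since the post gives no sizing policy, the arithmetic below is only a rough lower bound on metadata size: one bit per region. It is a sketch under stated assumptions: 512-byte sectors, a region size given in sectors, and no allowance for the superblock or other on-disk overhead of the real metadata format. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of regions needed to cover a device, rounding up so a
 * trailing partial region is still tracked. */
static uint64_t nr_regions(uint64_t dev_sectors, uint64_t region_sectors)
{
    assert(region_sectors != 0);
    return (dev_sectors + region_sectors - 1) / region_sectors;
}

/* Bytes needed for a bitmap holding one bit per region. */
static uint64_t bitmap_bytes(uint64_t regions)
{
    return (regions + 7) / 8;
}
```

For example, a 100 GiB volume is 209715200 sectors; with 8-sector (4 KiB) regions that is 26214400 regions, so the bitmap alone needs 3276800 bytes, a little over 3 MiB.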
Related¶
- systems/linux-device-mapper — Parent system.
- systems/iscsi / systems/nbd — Network block protocols that expose the "source device" remotely.
- concepts/block-level-async-clone — The general architectural pattern dm-clone instantiates at the kernel tier.
- concepts/async-clone-hydration — Shape-parallel at the Git/repo layer (Cloudflare Artifacts).
- patterns/async-block-clone-for-stateful-migration — Fly.io's end-to-end migration recipe built on dm-clone.