

Durable chain of custody

Pattern

To defend against data corruption between the customer and the storage service — network flips, buggy middleboxes, memory errors on intermediate hops — compute an integrity checksum at the earliest possible point (inside the client SDK, over the bytes the application hands in) and propagate it alongside the request through the entire pipeline. Every hop verifies; the storage layer only accepts data that matches the checksum end-to-end.

The "chain of custody" framing — borrowed from forensic evidence — emphasizes that the bytes must be continuously accountable from the customer's process memory to the durable on-disk form. Any break in the chain is a silent-corruption hole.
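The discipline can be sketched in a few lines. This is an illustrative model, not the AWS SDK: `client_put`, `hop`, and `durable_store` are hypothetical names, and SHA-256 stands in for whichever checksum the system uses.

```python
import hashlib

def client_put(data: bytes) -> tuple[bytes, str]:
    """Start the chain at the earliest point: checksum the exact bytes
    the application handed in (hypothetical sketch, not the AWS SDK)."""
    return data, hashlib.sha256(data).hexdigest()

def hop(data: bytes, checksum: str) -> tuple[bytes, str]:
    """Every intermediate component re-verifies before forwarding."""
    if hashlib.sha256(data).hexdigest() != checksum:
        raise ValueError("chain of custody broken: checksum mismatch")
    return data, checksum

def durable_store(data: bytes, checksum: str) -> None:
    """The storage layer verifies once more, then persists the checksum
    durably alongside the data."""
    data, checksum = hop(data, checksum)
    # write(data); write(checksum)  # persisted together

payload = b"customer bytes"
durable_store(*hop(*client_put(payload)))  # accepted: checksum matches end-to-end
```

Any hop that observes bytes differing from what the client checksummed raises, halting the write instead of persisting corruption silently.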

The S3 instantiation

From Kozlovski's 2024 explainer (Source: sources/2024-03-06-highscalability-behind-aws-s3s-massive-scale):

"S3 has implemented what they call a durable chain of custody. To solve the edge case where data can become corrupted before it reaches S3, AWS implemented a checksum in the SDK that's added as an HTTP Trailer (preventing the need of scanning the data twice) to the request."

Three design choices are worth calling out:

  1. Checksum at the SDK. As soon as the customer's PUT call hands bytes to the AWS SDK, the SDK computes the checksum over those bytes. Upstream corruption between the application and the SDK is out of scope (it's the customer's memory). Anything after is in scope.
  2. HTTP Trailer, not Header. Trailers are sent after the body — so the client can stream the body while computing the checksum, emitting the checksum at the end. Using a Header would force two passes over the data: one to checksum, one to send. Trailer = one pass + one network trip = cheaper for large uploads.
  3. Verify at every hop, reject on mismatch. Every component on the path to durable storage re-verifies. Any mismatch halts the write. The checksum propagates down to the on-disk layer, where it's written durably alongside the data.
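The one-pass property of choice 2 can be modeled without real HTTP: checksum each chunk as it is sent, and emit the digest after the last chunk, which is exactly the role an HTTP Trailer plays. A minimal sketch (frame tuples stand in for the wire format; the function names are illustrative):

```python
import hashlib

def stream_upload(chunks):
    """One pass over the data: each chunk goes out as soon as it is
    checksummed; the checksum itself is emitted last, trailer-style."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)
        yield ("body", chunk)          # on the wire immediately
    yield ("trailer", h.hexdigest())   # checksum arrives after the body

def receive(frames):
    """Server side: verify the streamed body against the trailing checksum."""
    h = hashlib.sha256()
    for kind, value in frames:
        if kind == "body":
            h.update(value)
        else:  # trailer
            if h.hexdigest() != value:
                raise ValueError("reject: body does not match trailer checksum")
            return value

receive(stream_upload([b"part1", b"part2"]))  # verified in a single pass
```

A header-based checksum would force `stream_upload` to consume all chunks before sending the first byte; the trailer lets both sides do their single pass concurrently.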

Why this is different from "add a checksum"

Most systems have checksums somewhere (TCP, TLS, filesystem, application). The chain-of-custody discipline is that the checksum covers every hop, and that there's no gap where an unchecksummed copy of the data lives. Specifically:

  • TCP checksum is 16 bits and not cryptographic — it misses many bit flips.
  • TLS covers the wire but not the proxy → origin internal link.
  • An application-level checksum written by the storage service over the received bytes can't detect corruption that happened before receipt.

The pattern's insight: compute the checksum at the earliest trust boundary (the SDK), and carry it all the way through. Any middlebox that modifies the bytes mid-flight produces a mismatch at the storage layer.

Where the gap closes

| Gap | Traditional coverage | Chain-of-custody coverage |
| --- | --- | --- |
| Network bit flip (TCP miss) | Rare but real | Detected |
| TLS-termination proxy corrupts | Not covered | Detected |
| Load-balancer memory error | Not covered | Detected |
| Storage-server memory before disk write | Application-level checksum | Detected |
| Bit flip on disk post-write | Filesystem / disk checksum | Detected (separate mechanism) |
| Customer-side memory error before SDK | Not covered | Not covered |

The last row is the remaining gap. AWS SDKs that run in customer-chosen environments can't defend against corrupt customer memory — they can only start the chain as early as possible.

Generalization

The pattern applies to any system where bytes cross a trust boundary and must be durable on arrival:

  • Writes to object storage (S3's case) — SDK checksum → HTTP Trailer → storage verification.
  • Cross-region replication — source checksums each object once, destination verifies on arrival.
  • Event ingestion pipelines — producer computes per-event checksum, consumer verifies (Kafka optional CRC, Kinesis signatures).
  • Upload pipelines for training data / logs — source-side hash accompanies the upload, pipeline rejects mismatches.
  • AI training pipelines — checksum on dataset shards, model training aborts if any shard fails verification.
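The replication and ingestion cases share one shape: checksum once at the source, verify on arrival, abort on mismatch. A hedged sketch of that shape (the function names are invented for illustration, not a real replication API):

```python
import hashlib

def ship(objects: dict[str, bytes]) -> list[tuple[str, bytes, str]]:
    """Source side: checksum each object once and send the checksum
    along with the bytes."""
    return [(key, data, hashlib.sha256(data).hexdigest())
            for key, data in objects.items()]

def apply_at_destination(shipment) -> dict[str, bytes]:
    """Destination side: verify each object on arrival; a mismatch
    aborts that object instead of storing corrupt bytes."""
    stored = {}
    for key, data, checksum in shipment:
        if hashlib.sha256(data).hexdigest() != checksum:
            raise ValueError(f"replication of {key!r} rejected: checksum mismatch")
        stored[key] = data
    return stored

apply_at_destination(ship({"shard-0": b"training data"}))
```

The same verify-or-abort step works whether the payload is an object, an event, or a dataset shard; only the transport changes.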

Pairs with

  • patterns/durability-review — the durable chain of custody is the kind of coarse-grained guardrail (single mechanism that defeats a whole class of risks) that a durability review tends to produce. See concepts/threat-modeling.
  • End-to-end principle (Saltzer, Reed, Clark 1984) — the classical networking argument that reliability-critical checks belong at the endpoints, not in intermediate layers. Durable chain of custody is the storage-specific application.

Caveats

  • Checksum algorithm matters. CRC32C is fast and catches most hardware flips; cryptographic hashes (SHA-256) defend against adversarial tampering but cost CPU. S3 supports multiple; customers choose by workload.
  • The pattern is about data integrity along the path, not authenticity. A chain of custody detects bit flips but doesn't prove who sent the data — that's AWS Signature V4's job.
  • On older SDKs, customers must explicitly opt in to Trailer-based checksums; default behavior has evolved over time.
  • Intermediate components that modify the body (e.g., a caching proxy that re-chunks or decompresses) break the chain. The pattern requires passive forwarding.
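The algorithm-choice caveat comes down to a standard trade-off, shown here with Python's standard library. Note the hedge: `zlib.crc32` computes CRC32, not the CRC32C variant S3 offers (CRC32C needs a third-party package such as `crc32c`).

```python
import hashlib
import zlib

data = b"x" * (1 << 20)  # 1 MiB payload

# Fast, non-cryptographic: catches random hardware bit flips but is
# trivially forgeable by an adversary who controls the bytes.
crc = zlib.crc32(data)  # 32-bit integer

# Cryptographic: also resists deliberate tampering, at higher CPU cost.
sha = hashlib.sha256(data).hexdigest()  # 64 hex characters
```

A hardware-fault-only threat model can take the cheap CRC; anything crossing an untrusted boundary should pay for the cryptographic hash.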
