Skip to content

CONCEPT Cited by 1 source

ClickHouse append-only tamper resistance

Definition

ClickHouse's MergeTree engine is append-friendly by design — it does not support row-level UPDATE or DELETE in the normal path. Deletion happens only via TTL or explicit expensive mutations. This structural property yields a weak tamper resistance for audit-log / telemetry use cases: silent data modification is not a single-row operation, it requires a visible mutation.

For stronger audit guarantees, the canonical pattern layers cryptographic hashing on inserted rows and stores those hashes in a separate immutable store, then monitors for divergence. The ClickHouse append-only model is the substrate; the hash ledger is the attestation layer.

Verbatim from the 2025-12-09 Redpanda IoT pipeline post:

"ClickHouse allows you to build strong tamper-resistance into your audit log or telemetry pipeline by combining its append-only storage model with external integrity controls. Since ClickHouse doesn't support row-level updates or deletes (except via TTL or mutations), it's naturally resistant to silent data changes. For stronger guarantees, you can implement cryptographic hashing on inserted data, store those hashes in a separate immutable store, and monitor for divergence." (Source: sources/2025-12-09-redpanda-streaming-iot-and-event-data-into-snowflake-and-clickhouse)

Two layers

  1. ClickHouse append-only layer (substrate) — MergeTree's no-row-level-mutation model. An attacker modifying existing data has to issue a schema-visible ALTER TABLE ... UPDATE/DELETE mutation, which shows up in the query log and in part-version metadata. Silent single-row tampering is structurally difficult.
  2. External hash-ledger layer (attestation) — each inserted row (or batch) is hashed (SHA-256 / BLAKE3 / Merkle root) and the hash is written to a separate immutable store (S3 Object Lock, an append-only blockchain, a signed transparency log, even a secondary ClickHouse table with restricted access). A background job periodically re-hashes ClickHouse data and compares against the ledger.

Why one layer isn't enough

  • Append-only alone doesn't defend against an operator who runs a mutation — it just makes the attack visible to anyone looking. Without the external hash ledger, silent tampering is still theoretically possible by root on the host.
  • Hash ledger alone (on a mutable store) lets an attacker with DB access modify both the row and the recorded hash. The ledger must live on storage the DB operator cannot write.
  • Both layers together mean an attacker needs to compromise both the mutable data store and the immutable ledger — a meaningfully harder target.

Composes with

Caveats

  • "Natural resistance" is weak — it's a cost increase for the attacker, not a cryptographic guarantee. A privileged operator can run a mutation and delete the query-log evidence.
  • The hash ledger is the real audit substrate. Without it, ClickHouse's append-only property alone is not compliance-grade.
  • Performance cost of per-row hashing is non-trivial at high ingest rates. Batch-level hashing (one hash per ingested batch) is a common trade-off.
  • Re-hash-and-compare background jobs need careful scheduling — they re-read all audited data and can starve online queries.
  • TTL evictions are not tamper events. The comparator must understand the retention policy and ignore rows that have aged out.

Seen in

Last updated · 470 distilled / 1,213 read