

Soft-delete vs hard-delete

Definition

A canonical application-design trade-off for handling row deletion in production databases:

  • Soft-delete — the row is marked as deleted (typically via a deleted_at timestamp or an is_deleted boolean) but remains physically present in the table. Application queries must filter out soft-deleted rows.
  • Hard-delete — the row is physically removed via DELETE FROM ... at the SQL level. Once the transaction commits, it's gone.
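A minimal sketch of the two styles, using Python's sqlite3 in-memory database (the users table and its columns are illustrative, not from the source):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL,
        deleted_at TEXT   -- NULL means the row is live
    )
""")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [("a@example.com",), ("b@example.com",)])

# Soft-delete: mark the row; it stays physically present in the table.
conn.execute("UPDATE users SET deleted_at = datetime('now') "
             "WHERE email = 'a@example.com'")

# Every application read must now filter out soft-deleted rows.
live = conn.execute(
    "SELECT email FROM users WHERE deleted_at IS NULL").fetchall()

# Hard-delete: the row is physically removed; once committed, it's gone.
conn.execute("DELETE FROM users WHERE email = 'b@example.com'")
remaining = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

print(live)       # [('b@example.com',)]
print(remaining)  # 1 -- only the soft-deleted row is left
```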

The trade-off

| Dimension | Soft-delete | Hard-delete |
| --- | --- | --- |
| Recoverable on app bug / user mistake | Yes (flip the flag) | Only via backup |
| Storage cost | Grows unboundedly | Reclaimed |
| Query complexity | Every query filters `WHERE deleted_at IS NULL` | Clean |
| Index cost | Indexes must include deleted rows | Normal |
| GDPR / data-deletion compliance | Problematic (row is still there) | Clean |
| Unique-constraint re-insert | Blocked by the soft-deleted row | Allowed |
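The unique-constraint row of the table is easy to reproduce. A sketch in sqlite3 (the users/email schema is an assumption for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE,
        deleted_at TEXT
    )
""")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")

# Soft-delete hides the row from the app, but it still occupies the
# unique slot for that email.
conn.execute("UPDATE users SET deleted_at = datetime('now') "
             "WHERE email = 'a@example.com'")

blocked = False
try:
    conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
except sqlite3.IntegrityError:
    blocked = True   # UNIQUE constraint failed: users.email

# Hard-delete frees the slot; the re-insert now succeeds.
conn.execute("DELETE FROM users WHERE email = 'a@example.com'")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
print(blocked)  # True
```

A common mitigation is a partial unique index (e.g. PostgreSQL's `CREATE UNIQUE INDEX ... WHERE deleted_at IS NULL`), but that moves the complexity into the schema.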

Backup as the hard-delete escape hatch

A canonical production insight from the 2024-07-30 PlanetScale post (Source: sources/2026-04-21-planetscale-faster-backups-with-sharding): backups provide a hard-delete escape hatch. If your backup system supports point-in-time recovery (PITR) or time-travel restore, you can use hard-delete at the application layer and still recover from mistakes.

The customer case study (Dub):

"A customer on their platform accidentally deleted a bunch of information from their account, which in turn dropped many rows from their PlanetScale database. These changes go to the primary and are propagated to both replicas. The application also did not have a 'soft-delete' feature, meaning that the data was really deleted, rather than just hidden. However, this data still existed in one of the recent backups, and thus was able to be restored."

Structural framing: if you have fast, complete, easily-restorable backups, you don't need soft-delete's recoverability affordance. Backups are the fallback.

This shifts the soft-delete decision from "recoverability" to "query ergonomics + unique-constraint semantics + compliance": dimensions that each have their own answer, and that often point toward hard-delete.

When soft-delete still makes sense

  • Compliance rules require it (some audit / legal scenarios require proof-of-deletion-intent distinct from physical absence).
  • The application surfaces "recently deleted" to users (Gmail Trash, file-system Recycle Bin).
  • Query patterns routinely need both live and deleted rows (analytics, reports).
  • Backups are too slow / expensive to be the recovery primitive (unsharded multi-TB database with 63-h restore time).

When hard-delete + backup is the better shape

  • Fast shard-parallel backups and fast restores. PlanetScale's 32-shard 20 TB / 2 h restore makes backup-as-recovery operationally viable.
  • GDPR / data-deletion laws that require physical removal.
  • Performance-sensitive workloads where the soft-delete WHERE deleted_at IS NULL filter bloats indexes.
  • Simpler app code. Hard-delete is the default SQL semantic; soft-delete is a layer applications bolt on.
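On the index-bloat point: databases with partial indexes (PostgreSQL, SQLite 3.8.0+) can keep soft-deleted rows out of hot indexes, which narrows the performance gap at the cost of extra schema complexity. A sketch, again assuming an illustrative users table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, "
             "email TEXT, deleted_at TEXT)")

# Partial index: only live rows are indexed, so soft-deleted rows
# never bloat it. (Supported by SQLite and PostgreSQL; MySQL/InnoDB
# has no direct equivalent.)
conn.execute("CREATE INDEX idx_users_live_email ON users (email) "
             "WHERE deleted_at IS NULL")

# The planner can use the partial index because the query's WHERE
# clause implies the index's WHERE clause.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM users "
    "WHERE email = 'a@example.com' AND deleted_at IS NULL").fetchall()
print(plan)
```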

Caveats

  • Backup recovery requires operator intervention — a user-deleted row returning "within minutes" is not what backup-restore provides. If the product UX needs undo, soft-delete may still be the right tool at the app layer.
  • Restore is a cluster-scale operation — restoring a backup to recover a single row means operating at the cluster / branch altitude. PlanetScale supports this via branch-and-cherry-pick UX (restore backup to a dev branch, pick rows back over), but this is an engineering operation, not a user-triggered one.
  • Backups have latency. If a row is deleted and you want to recover it, the recovery window is bounded by backup cadence. With PlanetScale's 12-hour cadence, in the worst case up to 12 hours of writes between the latest backup and the target time must be replayed.
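The cadence arithmetic can be made concrete: a point-in-time restore starts from the latest backup taken at or before the target time and replays writes forward, so the replay window is bounded by the cadence. A sketch (the 12-hour cadence is from the source; the timestamps are illustrative):

```python
from datetime import datetime, timedelta

def replay_window(backups, target):
    """The latest backup at or before `target` is the restore base;
    all writes between that backup and `target` must be replayed."""
    base = max(b for b in backups if b <= target)
    return target - base

cadence = timedelta(hours=12)
start = datetime(2026, 4, 20, 0, 0)
backups = [start + i * cadence for i in range(4)]  # every 12 hours

# Worst case: the row was deleted just before the next backup ran.
worst = replay_window(backups, datetime(2026, 4, 20, 11, 59))
best = replay_window(backups, datetime(2026, 4, 20, 12, 1))

print(worst)  # 11:59:00 -- nearly a full cadence of writes to replay
print(best)   # 0:01:00
```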
