PATTERN
Warehouse-unload bridge¶
Warehouse-unload bridge is the pattern for moving OLAP-computed results back into an OLTP-friendly serving store. OLAP warehouses like Snowflake aggregate billions of rows quickly but are not built for millisecond-latency, high-QPS lookups; products that serve those results need an explicit export pipeline, not a direct query path from application code into the warehouse.
Canonical shape (Canva)¶
```
Snowflake (DBT output)
  ↓ scheduled unload worker (SELECT + export)
S3 (durable intermediary storage)
  ↓ S3 event → SQS (durability + fan-out)
Ingester worker (rate-limited)
  ↓
Service RDS ← serving tier for low-latency reads
```
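The S3-event-to-SQS hop means the ingester's first job is parsing S3 event notifications out of SQS message bodies. A minimal sketch of that parsing step, using the documented S3 event JSON shape (the function name and bucket/key values are illustrative, not from the Canva post):

```python
import json

def parse_s3_records(message_body: str) -> list[tuple[str, str]]:
    """Extract (bucket, key) pairs from an S3 event notification
    delivered via SQS. Follows the documented S3 event JSON shape."""
    event = json.loads(message_body)
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Note: S3 URL-encodes object keys in event payloads; a real
        # ingester should unquote them before fetching.
        objects.append((s3["bucket"]["name"], s3["object"]["key"]))
    return objects

# Example payload in the shape S3 publishes on ObjectCreated events.
body = json.dumps({"Records": [{"eventName": "ObjectCreated:Put",
    "s3": {"bucket": {"name": "unload-staging"},
           "object": {"key": "exports/2024-04-29/part-0000.parquet"}}}]})
print(parse_s3_records(body))  # [('unload-staging', 'exports/2024-04-29/part-0000.parquet')]
```

Parsing from the message body (rather than assuming one object per message) matters because a single notification can carry multiple records.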
Each hop earns its keep:
- S3 as an intermediate: survives ingester restarts, keeps history of exports, decouples warehouse load from serving-tier load. (systems/aws-s3)
- SQS for work dispatch: durability + at-least-once delivery, so a crashed ingester resumes instead of dropping data. (systems/aws-sqs)
- Rate-limited ingester: the hard part. Warehouses will cheerfully stream more rows per second than your RDS can absorb; Canva specifically reported RDS CPU spikes when throughput was too high, and had to tune the ingest rate to keep CPU and replication happy without stretching the export window too long. (Source: sources/2024-04-29-canva-scaling-to-count-billions)
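The rate-limited ingester can be sketched as a token bucket capping rows written per second. This is a hypothetical illustration of the technique, not Canva's implementation; `write_rows` stands in for whatever batched insert the serving RDS uses:

```python
import time

class TokenBucket:
    """Token-bucket limiter: caps ingest at `rate` rows/sec so a burst
    of S3 objects cannot spike RDS CPU or replica lag."""
    def __init__(self, rate: float, burst: float, clock=time.monotonic):
        self.rate, self.burst, self.clock = rate, burst, clock
        self.tokens = burst
        self.last = clock()

    def wait_time(self, n: int) -> float:
        """Seconds to sleep before n rows may be written."""
        now = self.clock()
        # Refill tokens for the time elapsed, capped at burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return 0.0
        deficit = n - self.tokens
        self.tokens -= n  # go negative; the caller sleeps off the debt
        return deficit / self.rate

def ingest(batches, bucket, write_rows, sleep=time.sleep):
    """Drain batches into the serving store, throttled by the bucket."""
    for rows in batches:
        delay = bucket.wait_time(len(rows))
        if delay:
            sleep(delay)     # throttle before hammering RDS
        write_rows(rows)     # e.g. one batched INSERT per S3 object chunk
```

The rate cap is the tuning knob Canva describes: set it low enough that RDS CPU and replication stay in budget, high enough that the export window stays acceptable.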
Why not skip the bridge?¶
- Direct warehouse reads from product traffic put a serving workload onto an engine optimised for large scans; neither the latency nor the cost profile matches.
- Warehouse connection limits + concurrency costs mean every product service talking directly to the warehouse is a scaling problem of its own.
- Isolation. Warehouse reruns (end-to-end recompute; see patterns/end-to-end-recompute) shouldn't block serving reads.
Tuning levers¶
- Unload schedule — frequency trades freshness for warehouse compute cost and ingester throughput.
- Batch size — bigger S3 objects are cheaper to transfer but create bigger bursts on the ingester.
- Ingest rate limit — upper bound that keeps RDS CPU + replica lag in budget. Canva explicitly tuned this after seeing CPU spikes.
- Format — columnar / compressed (Parquet) for wire cost; JSON for schema flexibility. Canva extracts DynamoDB JSON into typed columns on ingest.
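The schedule and rate-limit levers constrain each other: the ingest rate cap puts a floor on how long one pass takes, so the export must fit inside the unload interval. A back-of-envelope check, with illustrative numbers (not figures from the Canva post):

```python
def export_window_seconds(rows_per_export: int, rate_rows_per_s: float) -> float:
    """Lower bound on how long one ingest pass takes at a given rate cap."""
    return rows_per_export / rate_rows_per_s

def max_sustainable_rows(schedule_interval_s: float, rate_rows_per_s: float,
                         headroom: float = 0.8) -> int:
    """Largest export that still finishes within one schedule interval,
    leaving headroom for retries and for replica lag to recover."""
    return int(schedule_interval_s * rate_rows_per_s * headroom)

# A 5M-row export capped at 2,000 rows/s takes ~42 minutes of ingest,
# so an hourly unload schedule fits; a 15-minute schedule would not.
print(export_window_seconds(5_000_000, 2_000) / 60)  # ≈ 41.7 (minutes)
print(max_sustainable_rows(3600, 2_000))             # 5760000
```

If the check fails, either raise the rate cap (and watch RDS CPU) or lengthen the schedule (and accept staler data) — that is the trade-off the levers above encode.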
Observability hot-spots¶
Instrument all four stages:
- Warehouse unload: rows unloaded, export duration, warehouse cost.
- S3 staging: object counts, size, age (alert on stale objects).
- Queue: SQS depth, age of oldest message.
- Ingester → RDS: throughput, RDS CPU, replica lag, error rate.
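The S3-staging age alert can be reduced to a small check over a bucket listing: a stale newest object means the unload stopped producing, and old unconsumed objects mean the ingester is behind. A minimal sketch (function name, keys, and threshold are illustrative):

```python
from datetime import datetime, timedelta, timezone

def stale_objects(objects, max_age: timedelta, now=None):
    """Return keys of staged objects older than the alert threshold.

    `objects` is an iterable of (key, last_modified) pairs, e.g. from
    an S3 bucket listing of the staging prefix.
    """
    now = now or datetime.now(timezone.utc)
    return [key for key, last_modified in objects if now - last_modified > max_age]

now = datetime(2024, 4, 29, 12, 0, tzinfo=timezone.utc)
listing = [
    ("exports/part-0000.parquet", now - timedelta(hours=3)),
    ("exports/part-0001.parquet", now - timedelta(minutes=20)),
]
print(stale_objects(listing, max_age=timedelta(hours=2), now=now))
# ['exports/part-0000.parquet']
```

The same shape works for the queue stage by swapping object age for SQS's age-of-oldest-message metric.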
Canva lists observability from day one at every stage as one of the three lessons at the end of the post — "it is hard to predict where a problem can happen".
Seen in¶
- sources/2024-04-29-canva-scaling-to-count-billions — Canva unloads Snowflake DBT outputs to S3, dispatches via SQS, and rate-limit ingests into service RDS; explicit RDS-CPU-spike tuning story.