SYSTEM Cited by 3 sources
Amazon Redshift
Amazon Redshift is AWS's managed cloud data warehouse — a columnar, MPP analytics database for petabyte-class OLAP workloads. In the 2010s it was the flagship AWS analytics-engine offering; today it coexists with systems/amazon-athena (serverless SQL over systems/aws-s3) and third-party warehouses like systems/snowflake running on AWS.
Role for this wiki
Redshift shows up as one of the compute engines in a compute-storage-separated AWS analytics stack: the storage of record is S3 (concepts/compute-storage-separation), and Redshift is one of several engines that can query it (alongside Athena, Spark on EMR, Hive, AWS Glue, etc.).
In Amazon BDT's Spark → Ray migration, Redshift is one of the compute frameworks the Data Reconciliation Service used to verify that real end-user queries against Spark-compacted and Ray-compacted versions of the same table produced equivalent results (alongside systems/apache-spark and systems/amazon-athena).
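The reconciliation idea can be illustrated with a minimal sketch. It assumes each engine's query results have already been fetched into lists of rows; the function names, the `results` layout, and the exact-equality comparison are hypothetical choices for illustration, not details of BDT's actual service. Rows are compared as multisets so that row order, which engines do not guarantee without an `ORDER BY`, does not cause false mismatches.

```python
from collections import Counter
from typing import Iterable, Sequence

Row = Sequence[object]  # one result row; values must be hashable


def results_equivalent(a: Iterable[Row], b: Iterable[Row]) -> bool:
    """Compare two result sets as multisets: order-insensitive, duplicate-sensitive."""
    return Counter(map(tuple, a)) == Counter(map(tuple, b))


def reconcile(
    results: dict[str, tuple[list[Row], list[Row]]]
) -> dict[str, bool]:
    """For each engine, compare the rows returned from the Spark-compacted
    table against the rows returned from the Ray-compacted table.

    `results` maps a (hypothetical) engine label to a pair
    (spark_compacted_rows, ray_compacted_rows).
    """
    return {
        engine: results_equivalent(spark_rows, ray_rows)
        for engine, (spark_rows, ray_rows) in results.items()
    }
```

Exact equality is a simplifying assumption: a real comparison would likely need a tolerance for floating-point aggregates and a policy for engine-specific type differences.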
Seen in
- sources/2024-07-29-aws-amazons-exabyte-scale-migration-from-apache-spark-to-ray-on-ec2 — Redshift as one of the compute frameworks in Amazon's post-Oracle BI stack (alongside Athena / Hive / Glue / Spark / Flink); also one of the three query engines used in BDT's Data Reconciliation Service during the Ray migration (Redshift + Spark + Athena).
- sources/2025-05-27-yelp-revenue-automation-series-testing-an-integration-with-third-party-system — Yelp's production substrate for both Revenue Data Pipeline reporting and monthly integrity-check SQL against the billing-system source of truth (99.99% contract match threshold). Publication path to Redshift via the Redshift Connector has ~10-hour latency, which motivates bypassing Redshift entirely for the daily verification loop (see concepts/redshift-connector-latency).
- sources/2026-02-02-yelp-back-testing-engine-ad-budget-allocation — Redshift as the historical-data source for Yelp's Back-Testing Engine: rows at campaign × date grain are pulled for the simulation date range. This is a fourth Redshift role on the wiki (back-testing / simulation-input substrate), distinct from the Spark→Ray reconciliation and Yelp revenue-reporting roles.
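The 99.99% contract match threshold in the Yelp integrity check above can be made concrete with a small sketch. The source gives only the threshold; how a "match" is defined is not stated, so this assumes a contract matches when the pipeline's amount for a contract ID equals the billing system's amount exactly. All names here (`match_rate`, `passes_integrity_check`, the dict layout) are hypothetical.

```python
from decimal import Decimal

# Threshold from the source: 99.99% of contracts must match.
MATCH_THRESHOLD = Decimal("0.9999")


def match_rate(pipeline: dict[str, Decimal], billing: dict[str, Decimal]) -> Decimal:
    """Fraction of billing-system contracts whose pipeline amount is identical.

    The billing system is treated as the source of truth, so it supplies
    the denominator; contracts missing from the pipeline count as mismatches.
    """
    if not billing:
        raise ValueError("billing system returned no contracts")
    matched = sum(
        1 for contract_id, amount in billing.items()
        if pipeline.get(contract_id) == amount
    )
    return Decimal(matched) / Decimal(len(billing))


def passes_integrity_check(
    pipeline: dict[str, Decimal], billing: dict[str, Decimal]
) -> bool:
    """True when the match rate meets or exceeds the 99.99% threshold."""
    return match_rate(pipeline, billing) >= MATCH_THRESHOLD
```

At this threshold, a month with 10,000 contracts tolerates at most one mismatch. `Decimal` is used so that amounts and the threshold comparison avoid binary floating-point rounding.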
Related
- systems/amazon-athena — serverless SQL over S3.
- systems/aws-s3 — the shared storage substrate.
- systems/snowflake — peer cloud warehouse, also compute-storage-separated.
- concepts/oltp-vs-olap — the workload shape Redshift targets.