SYSTEM Cited by 1 source
Amazon Redshift¶
Amazon Redshift is AWS's managed cloud data warehouse — a columnar, MPP analytics database for petabyte-class OLAP workloads. In the 2010s it was the flagship AWS analytics-engine offering; today it coexists with systems/amazon-athena (serverless SQL over systems/aws-s3) and third-party warehouses like systems/snowflake running on AWS.
Role for this wiki¶
Redshift appears here as one of the compute engines in a compute-storage-separated AWS analytics stack: S3 is the storage of record (concepts/compute-storage-separation), and Redshift is one of several engines that can query it, alongside Athena, Spark on EMR, Hive, and AWS Glue, among others.
In Amazon BDT's Spark → Ray migration, Redshift was one of three compute frameworks (with systems/apache-spark and systems/amazon-athena) that the Data Reconciliation Service used to run real end-user queries against Spark-compacted and Ray-compacted versions of the same table and verify that the results were equivalent.
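The equivalence check described above can be sketched roughly as follows. This is a minimal illustration, not BDT's actual Data Reconciliation Service: the function names `results_equivalent` and `reconcile`, the engine labels, and the sample rows are all hypothetical. It assumes the result sets have already been fetched into memory as rows of hashable values, and it treats two results as equivalent when they contain the same rows with the same multiplicities, regardless of row order.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Sequence, Tuple

Row = Sequence[Hashable]


def results_equivalent(rows_a: Iterable[Row], rows_b: Iterable[Row]) -> bool:
    """Compare two query results as multisets of rows.

    Row order is ignored (SQL gives no ordering guarantee without ORDER BY),
    but duplicate rows still count, so a dropped or doubled row is detected.
    """
    return Counter(tuple(r) for r in rows_a) == Counter(tuple(r) for r in rows_b)


def reconcile(
    results_by_engine: Dict[str, Tuple[Iterable[Row], Iterable[Row]]]
) -> Dict[str, bool]:
    """For each engine, compare the result from the Spark-compacted table
    against the result from the Ray-compacted table (hypothetical layout)."""
    return {
        engine: results_equivalent(spark_rows, ray_rows)
        for engine, (spark_rows, ray_rows) in results_by_engine.items()
    }


# Hypothetical sample data: the same query run on each engine against both
# versions of the table.
sample = {
    "redshift": ([(1, "a"), (2, "b")], [(2, "b"), (1, "a")]),  # same rows, reordered
    "athena": ([(1, "a"), (2, "b")], [(1, "a")]),              # one row missing
}
report = reconcile(sample)
```

A multiset comparison is one reasonable choice for this kind of check because it tolerates engine-specific row ordering while still catching missing or duplicated rows. A production system would likely also need to handle floating-point tolerance and results too large to hold in memory, which this sketch does not attempt.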
Seen in¶
- sources/2024-07-29-aws-amazons-exabyte-scale-migration-from-apache-spark-to-ray-on-ec2 — Redshift as one of the compute frameworks in Amazon's post-Oracle BI stack (alongside Athena / Hive / Glue / Spark / Flink); also one of the three query engines used in BDT's Data Reconciliation Service during the Ray migration (Redshift + Spark + Athena).
Related¶
- systems/amazon-athena — serverless SQL over S3.
- systems/aws-s3 — the shared storage substrate.
- systems/snowflake — peer cloud warehouse, also compute-storage-separated.
- concepts/oltp-vs-olap — the workload shape Redshift targets.