SYSTEM Cited by 2 sources

Amazon EMR

Amazon EMR (Elastic MapReduce) is AWS's managed runtime for the Hadoop ecosystem. It hosts systems/apache-spark, systems/apache-hive, Presto, Flink, HBase, and other open-source big-data engines on systems/aws-ec2 (and more recently on EKS and serverless). It is the canonical "big data cluster as a service" on AWS and the substrate behind much of the post-Hadoop data-lake workload on systems/aws-s3.

Role for this wiki

EMR typically shows up as the thing you were running Spark on before something changed (a scale-out, a cost crunch, a move to a managed warehouse or a specialist engine like systems/ray). In the Amazon BDT Spark → Ray story, the Spark compactor ran on EMR clusters, whereas the Ray clusters run directly on EC2, managed by the serverless job management substrate BDT built on top of systems/dynamodb + systems/aws-sns + systems/aws-sqs + systems/aws-s3.

In Slack's 2026-05-05 SSH-deprecation retrospective, EMR is the substrate underneath an org-wide modernisation: 700+ production jobs across 8 independent data regions had been orchestrated by Airflow SSH-ing into the EMR master node. The 3-quarter migration to a single REST gateway (Quarry) over YARN + YARN Distributed Shell is the wiki's first end-to-end retrospective on eliminating direct SSH access to EMR clusters at industrial scale. It was the unblocker for EMR-on-EKS adoption and for Slack's "Whitecastle" main-account → child-account migration. Two latent failure modes surfaced: concepts/master-node-resource-contention (jobs running on the master rather than distributed across NodeManagers) and concepts/resource-enforcement-bypass-via-ssh (vmem-check failures that had previously been hidden, fixed by setting yarn.nodemanager.vmem-check-enabled to false, as AWS recommends).
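The yarn.nodemanager.vmem-check-enabled setting mentioned above lives in yarn-site, and EMR accepts such overrides as a list of configuration classifications. The following is a minimal sketch of what that override could look like, not Slack's actual configuration: the retrospective does not say how the setting was applied, and the function name yarn_site_overrides is hypothetical.

```python
import json


def yarn_site_overrides(vmem_check_enabled: bool = False) -> list[dict]:
    """Build an EMR configuration list overriding one yarn-site property.

    Hypothetical helper for illustration; only the classification name and
    the property key come from EMR / YARN themselves.
    """
    return [
        {
            "Classification": "yarn-site",
            "Properties": {
                # EMR expects property values as strings, not JSON booleans.
                "yarn.nodemanager.vmem-check-enabled": str(vmem_check_enabled).lower(),
            },
        }
    ]


if __name__ == "__main__":
    # Print the JSON form, as it might be supplied at cluster creation.
    print(json.dumps(yarn_site_overrides(), indent=2))
```

A list in this shape can be passed as the Configurations argument of boto3's EMR run_job_flow call or supplied as JSON when creating a cluster. Note that turning the check off stops NodeManagers from killing containers that exceed their virtual-memory limit; it does not reduce the memory the jobs use.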

Seen in

Last updated · 542 distilled / 1,571 read