Apache Hadoop
Apache Hadoop is the canonical open-source batch-processing framework: HDFS for distributed storage, MapReduce for batch computation, and YARN for resource management and job scheduling, plus a broader ecosystem of related projects (Hive, HBase, etc.).
Stub page. See the 2024-12-12 Stripe source for one canonical operational datum: a Hadoop job on a multi-thousand-node cluster can concentrate an unusual workload, such as reverse-DNS resolution of every IP address in network-activity logs, into a short burst that saturates downstream infrastructure. In Stripe's case, the hourly runs of a log-analysis job pushed the central DNS server cluster into the AWS VPC resolver's limit of 1,024 packets per second per ENI (elastic network interface).
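The arithmetic behind that kind of saturation can be sketched as follows. This is a minimal illustration, not Stripe's tooling: the function names are hypothetical, and it assumes one packet per query and an even spread of queries across the forwarding ENIs. Only the 1,024 packets-per-second figure comes from the source.

```python
import math

# Per-ENI packet cap cited above for the AWS VPC resolver.
VPC_RESOLVER_PPS_PER_ENI = 1024


def per_eni_rate(total_qps: float, forwarder_enis: int) -> float:
    """Queries per second each ENI forwards, assuming an even spread (hypothetical model)."""
    if forwarder_enis <= 0:
        raise ValueError("forwarder_enis must be positive")
    return total_qps / forwarder_enis


def exceeds_cap(total_qps: float, forwarder_enis: int,
                cap: int = VPC_RESOLVER_PPS_PER_ENI) -> bool:
    """True if the per-ENI rate is above the cap, assuming one packet per query."""
    return per_eni_rate(total_qps, forwarder_enis) > cap


def min_enis_needed(total_qps: float, cap: int = VPC_RESOLVER_PPS_PER_ENI) -> int:
    """Smallest number of ENIs that keeps the per-ENI rate at or below the cap."""
    return max(1, math.ceil(total_qps / cap))
```

With made-up numbers, a burst of 20,000 lookups per second funneled through 10 ENIs gives `exceeds_cap(20_000, 10) == True`, and `min_enis_needed(20_000)` returns 20. The point is that a burst sized by the Hadoop cluster, not by the DNS tier, can overrun a fixed per-interface limit.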