CONCEPT
Task and actor model¶
The task and actor model is a low-level distributed-compute programming model that exposes two orthogonal primitives:
- Task — a stateless, serialisable unit of remote function invocation. Inputs are values or object references; the output is a future (a reference to the eventual result). A task can be scheduled anywhere in the cluster that has capacity.
- Actor — a stateful, long-lived remote class instance with its own addressable identity. It holds internal mutable state; methods called on it run serially on the actor's home worker (or with explicit concurrency control).
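The semantics of the two primitives can be sketched with stdlib concurrency alone — this is an illustrative model, not Ray's API; the names `submit_task`, `ActorHandle`, and `Counter` are invented here. A task is a function submitted to a shared pool and answered with a future; an actor is an instance pinned to its own single-threaded worker, so method calls serialise:

```python
from concurrent.futures import ThreadPoolExecutor, Future

# Task: stateless remote function invocation -> future.
# Any worker in the shared pool may run it.
task_pool = ThreadPoolExecutor(max_workers=4)

def submit_task(fn, *args) -> Future:
    return task_pool.submit(fn, *args)

# Actor: stateful instance whose methods run serially on its own
# dedicated "home worker" (a single-threaded executor).
class ActorHandle:
    def __init__(self, cls, *args):
        self._worker = ThreadPoolExecutor(max_workers=1)
        self._instance = cls(*args)

    def call(self, method: str, *args) -> Future:
        return self._worker.submit(getattr(self._instance, method), *args)

class Counter:
    def __init__(self):
        self.n = 0
    def incr(self):
        self.n += 1
        return self.n

square = submit_task(lambda x: x * x, 7)       # stateless task
counter = ActorHandle(Counter)                 # stateful actor
futures = [counter.call("incr") for _ in range(3)]
print(square.result())                         # 49
print([f.result() for f in futures])           # [1, 2, 3] — serial execution, no lost updates
```

The `max_workers=1` executor is the whole trick: because the actor's methods queue on one thread, mutating `self.n` needs no locks, which is exactly the property that makes actors a natural home for caches, coordinators, and checkpoints.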
The model is deliberately closer to a distributed runtime than to a dataflow DAG. It is what systems/ray exposes; it is also the shape Amazon's in-house distributed compute service offered (though with usability and efficiency issues that ruled it out for BDT's compactor workload).
Why "low-level" is worth it¶
Higher-level engines (Spark dataflows, SQL planners, Beam pipelines) compose operators over structured data and give the runtime optimisation freedom. That's a great deal for general-purpose ETL and SQL-shaped work.
But for specialist workloads — large-scale compaction, ML training loops, per-request micro-pipelines, locality-sensitive shuffles — the dataflow abstraction hides the mechanisms you need to hand-craft the distributed algorithm. Tasks + actors give back:
- Direct control over placement (same node, same rack, or anywhere).
- Direct control over object locality and what's shuffled vs referenced in place.
- Direct expression of stateful patterns — caches, checkpoints, coordinators — without trying to squeeze them into a stateless operator graph.
- Direct handling of heterogeneous workers and resource demands (e.g. memory-aware scheduling, GPU affinity).
Amazon BDT's framing: "BDT didn't need a generalist for the compaction problem, but a specialist, and Ray let them narrow their focus down to optimising the specific problem at hand." That's the task-and-actor sell.
(Source: sources/2024-07-29-aws-amazons-exabyte-scale-migration-from-apache-spark-to-ray-on-ec2)
Downsides to name¶
- Application complexity is yours. No free query optimiser; no automatic operator placement; no free retry logic.
- Debugging distributed state is harder than debugging a dataflow DAG — actors have history.
- Portability is worse than SQL. Dataflow engines translate across clusters; a task/actor program is a specific distributed algorithm.
Lineage¶
The pattern predates Ray — Erlang/OTP, Akka, Orleans, Dask, and internal cluster frameworks at the hyperscalers all expose some form of actors or tasks. Ray's articulation — tasks + actors + object store + locality scheduler — is the current canonical framing for general-purpose distributed compute outside the dataflow world.
Related¶
- systems/ray — the canonical implementation.
- concepts/locality-aware-scheduling — the scheduler-side dual that makes fine-grained tasks actually fast.
- concepts/memory-aware-scheduling — resource-dimension hint used by task-level scheduling.
- concepts/stateless-compute — what the task half of the model is; actors are the stateful escape hatch.
Seen in¶
- sources/2024-07-29-aws-amazons-exabyte-scale-migration-from-apache-spark-to-ray-on-ec2 — "distributed stateless functions (tasks) and distributed stateful classes (actors)" named as the programming-model primitives that drew Amazon BDT to Ray; same pair existed in their in-house compute service but couldn't be used due to usability + overhead + no autoscaling.