
PATTERN

Framework-managed executor pool

You hand the framework a block of code; the framework maintains a pool of executor nodes against a declared min/max/concurrency policy and runs your block on one of them. The caller writes the code inline as if it were local, while the framework handles provisioning, placement, shutdown, and (optionally) cluster termination on driver disconnect.
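In FLAME's canonical Elixir form, the call site wraps ordinary code in an anonymous function. A minimal sketch — the pool name `MyApp.FFMpegRunner`, the function name, and the ffmpeg arguments are illustrative, not from the source:

```elixir
def resize_video(video_path, out_path) do
  # The anonymous function executes on a pooled executor node,
  # not in the caller; FLAME.call/2 blocks until it returns.
  # Local variables (video_path, out_path) are closed over and
  # travel with the function to the executor.
  FLAME.call(MyApp.FFMpegRunner, fn ->
    System.cmd("ffmpeg", ["-i", video_path, "-vf", "scale=640:-2", out_path])
  end)
end
```

Note there is no job definition, queue, or HTTP endpoint anywhere: the only artifact is the inline block.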

Shape

  • The call-site API is a single primitive. FLAME's FLAME.call(pool, fn -> ... end) is the canonical example. No decomposition into serverless functions, no explicit RPC layer, no job-queue bookkeeping.
  • The pool is configured declaratively: minimum/maximum instance count, per-instance concurrency, and idle timeout. The framework scales the pool within those bounds based on inbound work.
  • Per-node idle shutdown. Each executor shuts down when it hasn't received new work for the configured idle window.
  • Full teardown on driver disconnect. If the driving process (Livebook runtime, application process, etc.) disconnects, the pool terminates entirely — no orphan capacity.
  • Substrate-pluggable. The pool manager talks to a cloud's compute API. FLAME originally targeted Fly Machines, and Michael Ruoss ported it to Kubernetes Pods; the pool-management logic is identical.
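The declarative bounds above map onto a pool child spec in the application's supervision tree. A sketch using FLAME's documented `FLAME.Pool` options — the pool name and the specific numbers are illustrative:

```elixir
# In MyApp.Application.start/2: the pool scales between min and max
# executors, runs up to max_concurrency calls on each, and shuts an
# executor down after 30s without new work.
children = [
  {FLAME.Pool,
   name: MyApp.FFMpegRunner,
   min: 0,                        # scale to zero when idle
   max: 10,                       # hard upper bound on executors
   max_concurrency: 5,            # concurrent calls per executor
   idle_shutdown_after: 30_000}   # per-node idle window (ms)
]
```

Everything about capacity lives here, at the application level, rather than being scattered across per-function deployment manifests.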

Why it differs from conventional serverless

Dimension                 | Conventional serverless     | Framework-managed executor pool
--------------------------|-----------------------------|--------------------------------
Unit of deployment        | Individual function         | Whole application
Caller-side ergonomics    | Explicit invoke / HTTP call | Inline call { ... }
State sharing with caller | Requires external store     | Same process tree
Configuration location    | Per-function                | Per-pool, at the app level
Cold-start cost           | Per-function                | Pool-level warm-up

The trade-off summarised by Fly.io:

"It's the upside of serverless without committing yourself to blowing your app apart into tiny, intricately connected pieces."

— (Source: sources/2024-09-24-flyio-ai-gpu-clusters-from-your-laptop-with-livebook)

Load-bearing runtime properties

The pattern's ergonomics rely on the host runtime providing:

  • Transparent remote execution. The call { ... } block should behave identically whether it runs locally or on an executor node. BEAM's distributed Erlang plus concepts/transparent-cluster-code-distribution provide this in FLAME's case.
  • Clean per-executor lifecycle. The language/runtime needs to support clean shutdown (and optionally hot spawn) of per-executor processes so that idle-timeout shutdown is safe.
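On the BEAM the first property comes nearly for free: distributed Erlang can ship a closure to another connected node and run it there. A minimal sketch of the underlying mechanism (the node name is hypothetical, and this assumes both nodes run the same code):

```elixir
node = :"executor@10.0.0.7"   # hypothetical executor node
true = Node.connect(node)

frames = 30
# :erpc.call/2 serialises the anonymous function (including the
# captured `frames` variable), executes it on the remote node, and
# returns the result to the caller as if the call were local.
result = :erpc.call(node, fn -> frames * 2 end)
```

FLAME layers pool management, placement, and lifecycle on top of exactly this kind of primitive; runtimes without transparent closure shipping have to fall back on explicit serialisation and an RPC layer, which erodes the pattern's ergonomics.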

Seen in

Use cases in the canonical source

  • Inline ffmpeg video encoding calls (original FLAME motivating example).
  • AI inference across a pool of GPU Fly Machines (Llama summarising video stills).
  • Hyperparameter-tuning fan-out across 64 L40S GPU Fly Machines, each compiling a different BERT variant.