Nx (Elixir AI/ML stack)¶
The Nx stack is how Elixir does native AI/ML — three libraries, each building on the one below:
- Nx — Elixir-native tensor computation with pluggable GPU backends (EXLA / Torchx etc.). Comparable in role to NumPy + JAX/PyTorch for Python.
- Axon — a common interface for ML models built on Nx. Comparable to Keras or a lightweight PyTorch-layer API.
- Bumblebee — a model registry that makes pre-trained models downloadable and usable "from just a couple lines of code" inside any Elixir app. Comparable to Hugging Face's `transformers` library.
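The "couple lines of code" claim can be sketched concretely. A minimal Bumblebee example, assuming the `bumblebee`, `nx`, and `exla` deps are installed; the model name here is an illustrative choice, not one taken from the source:

```elixir
# Download (and cache) a pre-trained model plus its tokenizer from Hugging Face.
{:ok, model_info} = Bumblebee.load_model({:hf, "bert-base-uncased"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "bert-base-uncased"})

# Wrap them in an Nx.Serving, the stack's batching/inference abstraction.
serving = Bumblebee.Text.fill_mask(model_info, tokenizer)

# Run inference; the underlying Axon model graph executes on whatever
# Nx backend is configured (EXLA for GPU, the binary backend otherwise).
Nx.Serving.run(serving, "Elixir runs on the [MASK] virtual machine.")
```

The serving can also be started under a supervision tree and shared across processes, which is the usual shape in a long-running app.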
Key architectural properties¶
- GPU backends inherit Elixir's distribution story. Because Nx tensors live inside a BEAM process, a tensor allocated locally can be transferred to a remote GPU node simply by messaging it — the Fly.io 2024-09-24 post opens with a video demo of transferring a local tensor to a remote GPU via Livebook + FLAME + Nx. (Source: sources/2024-09-24-flyio-ai-gpu-clusters-from-your-laptop-with-livebook)
- Composes with FLAME for elastic GPU compute. The end-to-end pattern in the keynote — send stills to Llama on a pool of L40S Fly Machines, then to Mistral for summary — uses Bumblebee to load each model and FLAME to dispatch per-still/per-video work across the GPU pool.
- BERT compile-and-fine-tune per node. The 64-node hyperparameter-tuning demo compiles a different BERT variant on each node using the Nx compiler chain; Axon expresses the fine-tuning loop; results stream back to Livebook.
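The remote-GPU-transfer property above can be sketched as a pair of snippets. The registered process and node names are hypothetical, and EXLA is assumed as the GPU backend:

```elixir
# On the laptop: build a tensor on the pure-Elixir binary backend and
# message it to a registered process on a GPU node. Tensors are ordinary
# Elixir terms, so they cross nodes like any other message.
t = Nx.iota({4, 4}, backend: Nx.BinaryBackend)
send({:gpu_worker, :"gpu@remote-host"}, {:tensor, self(), t})

# On the GPU node: receive the tensor, move it onto the GPU via the
# EXLA backend, compute there, and reply with the (small) scalar result.
receive do
  {:tensor, from, tensor} ->
    result =
      tensor
      |> Nx.backend_transfer(EXLA.Backend)
      |> Nx.sum()

    send(from, {:result, Nx.to_number(result)})
end
```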
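The keynote's per-still fan-out can be sketched with FLAME. The pool name, model choice, and `classify_frame` shape are illustrative assumptions, not details from the source:

```elixir
# In the application's supervision tree: a FLAME pool whose runners boot
# on GPU machines and scale back to zero when idle.
children = [
  {FLAME.Pool, name: MyApp.GpuPool, min: 0, max: 8, max_concurrency: 2}
]

# Per-still work: each call ships its closure to a runner in the GPU pool,
# which loads a model via Bumblebee and runs inference there.
def classify_frame(frame) do
  FLAME.call(MyApp.GpuPool, fn ->
    {:ok, model_info} = Bumblebee.load_model({:hf, "microsoft/resnet-50"})
    {:ok, featurizer} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})

    serving = Bumblebee.Vision.image_classification(model_info, featurizer)
    Nx.Serving.run(serving, frame)
  end)
end
```

In practice the model load would be cached per runner (e.g. a serving started under the runner's supervision tree) rather than repeated on every call; it is inlined here only to keep the sketch self-contained.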
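The per-node fine-tuning step can be sketched with Axon's training loop. Loss, optimizer, and hyperparameters are illustrative, and the optimizer module (`Polaris.Optimizers` in recent Axon releases) varies by version:

```elixir
# Fine-tune a Bumblebee-loaded model with an Axon training loop,
# JIT-compiled through EXLA. Each node in the tuning sweep would run
# this with a different hyperparameter assignment (here: learning rate)
# and stream its metrics back to the coordinating Livebook.
defmodule Tune do
  def fine_tune(model, train_data, learning_rate) do
    model
    |> Axon.Loop.trainer(
      :categorical_cross_entropy,
      Polaris.Optimizers.adam(learning_rate: learning_rate)
    )
    |> Axon.Loop.metric(:accuracy)
    |> Axon.Loop.run(train_data, %{}, epochs: 3, compiler: EXLA)
  end
end
```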
Seen in¶
- sources/2024-09-24-flyio-ai-gpu-clusters-from-your-laptop-with-livebook — canonical wiki instance; both the Llama/Mistral inference pipeline and the BERT hyperparameter-tuning training workload use the Nx/Axon/Bumblebee stack.
Related¶
- systems/livebook — typical driver for Nx notebooks.
- systems/flame-elixir — the elastic-executor framework used to distribute Nx compute over a GPU cluster.
- systems/erlang-vm — BEAM hosts Nx processes; native message passing is how tensors cross nodes.
- systems/fly-machines — GPU-enabled Machines are the compute substrate in the canonical demo.
- systems/nvidia-l40s — the GPU shape used in the 64-node BERT hyperparameter-tuning demo.
- systems/llama-3-1 — one of the named Bumblebee-loaded models in the video-summariser pipeline.