Nx (Elixir AI/ML stack)¶
The Nx stack is how Elixir does native AI/ML — three libraries, each building on the one below:
- Nx — Elixir-native tensor computation with pluggable GPU backends (EXLA / Torchx etc.). Comparable in role to NumPy + JAX/PyTorch for Python.
- Axon — a common interface for ML models built on Nx. Comparable to Keras or a lightweight PyTorch-layer API.
- Bumblebee — a model registry that makes pre-trained models downloadable and usable "from just a couple lines of code" inside any Elixir app. Comparable to Hugging Face's `transformers` library.
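The "couple lines of code" claim can be sketched concretely. A minimal Bumblebee example, assuming the `bumblebee`, `nx`, and `exla` deps are installed; the model name here is an illustrative choice, not one taken from the source:

```elixir
# Download (and cache) a pre-trained model plus its tokenizer from Hugging Face.
{:ok, model_info} = Bumblebee.load_model({:hf, "bert-base-uncased"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "bert-base-uncased"})

# Wrap them in an Nx.Serving, the stack's batching/inference abstraction.
serving = Bumblebee.Text.fill_mask(model_info, tokenizer)

# Run inference; the underlying Axon model graph executes on whatever
# Nx backend is configured (EXLA for GPU, the binary backend otherwise).
Nx.Serving.run(serving, "Elixir runs on the [MASK] virtual machine.")
```

The serving can also be started under a supervision tree and shared across processes, which is the usual shape in a long-running app.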
Key architectural properties¶
- GPU backends inherit Elixir's distribution story. Because Nx tensors live inside a BEAM process, a tensor allocated locally can be transferred to a remote GPU node simply by messaging it — the Fly.io 2024-09-24 post opens with a video demo of transferring a local tensor to a remote GPU via Livebook + FLAME + Nx. (Source: sources/2024-09-24-flyio-ai-gpu-clusters-from-your-laptop-with-livebook)
- Composes with FLAME for elastic GPU compute. The end-to-end pattern in the keynote — send stills to Llama on a pool of L40S Fly Machines, then to Mistral for summary — uses Bumblebee to load each model and FLAME to dispatch per-still/per-video work across the GPU pool.
- BERT compile-and-fine-tune per node. The 64-node hyperparameter-tuning demo compiles a different BERT variant on each node using the Nx compiler chain; Axon expresses the fine-tuning loop; results stream back to Livebook.
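The remote-GPU-transfer property above can be sketched as a pair of snippets. The registered process and node names are hypothetical, and EXLA is assumed as the GPU backend:

```elixir
# On the laptop: build a tensor on the pure-Elixir binary backend and
# message it to a registered process on a GPU node. Tensors are ordinary
# Elixir terms, so they cross nodes like any other message.
t = Nx.iota({4, 4}, backend: Nx.BinaryBackend)
send({:gpu_worker, :"gpu@remote-host"}, {:tensor, self(), t})

# On the GPU node: receive the tensor, move it onto the GPU via the
# EXLA backend, compute there, and reply with the (small) scalar result.
receive do
  {:tensor, from, tensor} ->
    result =
      tensor
      |> Nx.backend_transfer(EXLA.Backend)
      |> Nx.sum()

    send(from, {:result, Nx.to_number(result)})
end
```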
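The keynote's per-still fan-out can be sketched with FLAME. The pool name, model choice, and `classify_frame` shape are illustrative assumptions, not details from the source:

```elixir
# In the application's supervision tree: a FLAME pool whose runners boot
# on GPU machines and scale back to zero when idle.
children = [
  {FLAME.Pool, name: MyApp.GpuPool, min: 0, max: 8, max_concurrency: 2}
]

# Per-still work: each call ships its closure to a runner in the GPU pool,
# which loads a model via Bumblebee and runs inference there.
def classify_frame(frame) do
  FLAME.call(MyApp.GpuPool, fn ->
    {:ok, model_info} = Bumblebee.load_model({:hf, "microsoft/resnet-50"})
    {:ok, featurizer} = Bumblebee.load_featurizer({:hf, "microsoft/resnet-50"})

    serving = Bumblebee.Vision.image_classification(model_info, featurizer)
    Nx.Serving.run(serving, frame)
  end)
end
```

In practice the model load would be cached per runner (e.g. a serving started under the runner's supervision tree) rather than repeated on every call; it is inlined here only to keep the sketch self-contained.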
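The per-node fine-tuning step can be sketched with Axon's training loop. Loss, optimizer, and hyperparameters are illustrative, and the optimizer module (`Polaris.Optimizers` in recent Axon releases) varies by version:

```elixir
# Fine-tune a Bumblebee-loaded model with an Axon training loop,
# JIT-compiled through EXLA. Each node in the tuning sweep would run
# this with a different hyperparameter assignment (here: learning rate)
# and stream its metrics back to the coordinating Livebook.
defmodule Tune do
  def fine_tune(model, train_data, learning_rate) do
    model
    |> Axon.Loop.trainer(
      :categorical_cross_entropy,
      Polaris.Optimizers.adam(learning_rate: learning_rate)
    )
    |> Axon.Loop.metric(:accuracy)
    |> Axon.Loop.run(train_data, %{}, epochs: 3, compiler: EXLA)
  end
end
```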
Seen in¶
- sources/2024-09-24-flyio-ai-gpu-clusters-from-your-laptop-with-livebook — canonical wiki instance; both the Llama/Mistral inference pipeline and the BERT hyperparameter-tuning training workload use the Nx/Axon/Bumblebee stack.
Related¶
- systems/livebook — typical driver for Nx notebooks.
- systems/flame-elixir — the elastic-executor framework used to distribute Nx compute over a GPU cluster.
- systems/erlang-vm — BEAM hosts Nx processes; native message passing is how tensors cross nodes.
- systems/fly-machines — GPU-enabled Machines are the compute substrate in the canonical demo.
- systems/nvidia-l40s — the GPU shape used in the 64-node BERT hyperparameter-tuning demo.
- systems/llama-3-1 — one of the named Bumblebee-loaded models in the video-summariser pipeline.