
NVIDIA A10

The NVIDIA A10 is an Ampere-generation data-centre GPU (released 2021). Relative to A100/H100 frontier parts it has fewer and slower cores and less memory, but it is capable enough for mainstream inference workloads and ships in a passively-cooled, rack-friendly form factor suitable for general-purpose servers.

Seen in (wiki)

  • Fly.io 2024-08-15: the most popular GPU on Fly.io by a wide margin, ahead of the L40S and both A100 SKUs. Fly.io's framing: "It's the least capable GPU we offer. But that doesn't matter, because it's capable enough. It's solid for random inference tasks, and handles mid-sized generative AI stuff like Mistral Nemo or Stable Diffusion." Supply was the bottleneck: "we can't get new A10s in fast enough for our users." The A10's $1.25/hour price is the anchor the L40S was discounted to match. (Source: sources/2024-08-15-flyio-were-cutting-l40s-prices-in-half)

Why it matters

  • "Capable enough" is an architectural threshold, not a spec line. For transaction-shaped inference (respond to an HTTP request with a model output), an A10 combined with fast instance RAM and local object storage can beat an A100 bolted to slow storage on the latency + $/request axis. The Fly.io datum is the wiki's canonical instance of the inference-vs-training workload-shape distinction.
  • The "cheap inference GPU" price point is a product-shape driver. Fly.io's whole 2024-08-15 L40S price cut is engineered to collapse the A10-vs-something-bigger choice by pricing the L40S at A10 levels — i.e. the A10 is the price anchor customers wouldn't deviate from.