
eBay

eBay Tech (innovation.ebayinc.com) is a Tier 3 source on the sysdesign-wiki. eBay's innovation blog publishes heavily in seller-feature PR, buyer-feature PR, recruiting, executive positioning, corporate policy, and generic GenAI-tooling genres, all of which fall below the Tier-3 ingest bar. Only the distributed-systems / scaling / infra-architecture / production-incident content it occasionally produces qualifies.

Prior to 2025-01-17, this source had zero on-scope ingests across the Dec-2023 → 2024 window (12+ consecutive skips logged across that batch window).

Key systems

  • systems/e-llama — eBay's Llama-3.1-derived 8B + 70B LLM, continued-pretrained on 1 trillion tokens of mixed e-commerce + replay data on a 480-H100 Megatron-LM 3D-parallel training cluster. Gains of ~25% (English) and ~30% (non-English) on e-commerce benchmarks, with a ~1% general-domain regression for the 70B. Sister to the from-scratch LiLiuM family; see the back-of-envelope sketch below.
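
For orientation, here is a back-of-envelope sketch of what the disclosed training scale implies. The post pins down the GPU count (480 H100s), the framework (Megatron-LM 3D parallelism), and the ~1T-token budget; the specific tensor/pipeline/data-parallel split, the 6·N·D FLOPs rule of thumb, the per-GPU peak throughput, and the MFU figure below are illustrative assumptions, not numbers from eBay.

```python
# Back-of-envelope for a 70B continued-pretraining run on 480 H100s.
# Known from the post: 480 GPUs, Megatron-LM 3D parallelism, ~1T tokens, 70B params.
# Everything else (TP/PP split, peak FLOPs, MFU) is an illustrative assumption.

GPUS = 480
PARAMS = 70e9           # 70B-parameter model
TOKENS = 1e12           # ~1 trillion training tokens

# One plausible 3D-parallel decomposition (assumed, not disclosed):
# data-parallel size is whatever remains after tensor and pipeline parallelism.
TP = 8                  # tensor parallel within an NVLink-connected node
PP = 6                  # pipeline parallel across nodes (InfiniBand)
DP = GPUS // (TP * PP)  # -> 10 data-parallel replicas
assert TP * PP * DP == GPUS

# Standard ~6*N*D FLOPs estimate for training a dense transformer.
train_flops = 6 * PARAMS * TOKENS

H100_BF16_PEAK = 989e12   # dense BF16 tensor-core peak FLOPs/s per H100 SXM (approx.)
MFU = 0.40                # assumed model FLOPs utilization

seconds = train_flops / (GPUS * H100_BF16_PEAK * MFU)
print(f"DP x PP x TP = {DP} x {PP} x {TP}")
print(f"~{train_flops:.2e} FLOPs -> ~{seconds / 86400:.0f} days at {MFU:.0%} MFU")
```

The point is only to give a sense of scale (roughly a few weeks of wall-clock time under these assumptions); the actual run time depends on the real parallelism layout, sequence length, and achieved utilization, none of which we should infer beyond what the post states.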

Key patterns / concepts

Recent articles

Recurring themes

  • Hybrid LLM strategy. eBay runs from-scratch (LiLiuM) and adapt-existing-model (e-Llama) LLM tracks simultaneously — build for control, adapt for velocity.
  • Training-infrastructure transparency > serving-infrastructure transparency. The e-Llama post is detailed on training topology (480 H100s / Megatron-LM 3D parallelism / NVLink+InfiniBand / 1T tokens / hyperparameters) but silent on serving — no inference backend, no per-query latency, no QPS, no cost economics, no product-surface integration. Training-infra disclosure is the thing eBay's innovation blog does when it does go technical.

Ingest posture

Apply Tier 3 filter strictly. Skip seller-feature PR, buyer-feature PR, recruiting / competition posts, executive interviews / awards, corporate policy / responsible-AI principles posts, and generic GenAI-tooling announcements that name models but don't describe serving infrastructure.

Ingest posts that clearly cover: distributed-systems internals, scaling trade-offs, infra architecture, production incidents, storage / networking / streaming design, or training-infrastructure deep-dives at frontier scale (per the 2025-01-17 e-Llama post). Model-family name-drops alone are insufficient; posts need architectural depth (parallelism topology, hyperparameter methodology, hardware topology, benchmark methodology, cost numbers) to clear the Tier-3 bar.

Watch for a future serving-infra deep-dive on e-Llama (inference backend, per-query latency, QPS, cost-per-token, product-surface integration); such a post would anchor systems/e-llama as a full serving-plus-training system page and warrant additional concept/pattern pages. Similarly, a LiLiuM from-scratch deep-dive would anchor a sister systems/lilium page.

Last updated · 200 distilled / 1,178 read