SYSTEM
Llama 3.1¶
Llama 3.1 is Meta's 2024 open-weights foundation-model family (8B, 70B, and 405B parameters). Within this wiki it is notable as the base model for domain-adapted enterprise LLMs.
Seen in (wiki)¶
- eBay e-Llama. eBay uses Llama 3.1 8B and 70B as the base for continued pretraining on 1 trillion tokens of mixed e-commerce and general data, producing e-Llama. The continued-pretraining recipe sets the max LR to 10% of the original Llama 3.1 max LR, uses a 1:1 general-to-domain sampling ratio, and includes replay data from curated/public/open-source corpora to resist catastrophic forgetting. Result: ~25% gain on English e-commerce benchmarks and ~30% on non-English ones, with ~1% general-domain regression for the 70B model. (Source: sources/2025-01-17-ebay-scaling-large-language-models-for-e-commerce-the-development)
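The recipe's data-mixture side can be sketched as a batch sampler that draws half of each batch from domain corpora and half from replay corpora, with the max LR scaled to 10% of the base model's. This is a minimal illustration, not eBay's tooling; the corpus names and the base LR value are assumptions.

```python
import random

# Illustrative corpus labels -- NOT eBay's actual datasets.
DOMAIN_CORPORA = ["ecommerce_listings", "ecommerce_queries"]
REPLAY_CORPORA = ["curated_general", "public_web", "open_source"]

def sample_batch(batch_size: int, rng=random) -> list[str]:
    """Return corpus labels for one batch at a 1:1 domain:general ratio.

    The replay half is what resists catastrophic forgetting: the model
    keeps seeing general data while absorbing the domain data.
    """
    half = batch_size // 2
    batch = [rng.choice(DOMAIN_CORPORA) for _ in range(half)]
    batch += [rng.choice(REPLAY_CORPORA) for _ in range(batch_size - half)]
    rng.shuffle(batch)
    return batch

# Max LR is set to 10% of the original Llama 3.1 max LR.
LLAMA31_MAX_LR = 3e-4  # illustrative placeholder, not from the source
CONTINUED_PRETRAIN_MAX_LR = 0.1 * LLAMA31_MAX_LR
```

The design point is that the mixture ratio and the reduced LR work together: a gentler LR limits how far weights drift from the base, and replay data anchors the general-domain behavior that remains.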
Why "adapt" rather than "train from scratch"¶
From eBay's framing:
"Training a large-scale LLM from scratch is a very time- and resource-intensive process. In order to move fast, one could use existing pretrained models, such as Llama 3.1, for their use cases. However, these models typically lack specific knowledge, in our case about the e-commerce domain."
Llama 3.1's role in an enterprise adaptation pipeline is thus: known-capable open base → continued pretraining with domain data + replay → fine-tune + RLHF → deploy. This trades the time cost of a from-scratch build against the ceiling cost of starting from a model you didn't shape. Stub — expand as more adaptation sources cite Llama 3.1 as a base.
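The pipeline stages above can be sketched as successive transformations of a checkpoint record. The `Checkpoint` type, stage names, and model identifier are illustrative assumptions, not eBay's actual infrastructure:

```python
from dataclasses import dataclass, field

@dataclass
class Checkpoint:
    """Tracks which adaptation stages a model weights snapshot has passed."""
    base: str                              # e.g. "llama-3.1-70b" (hypothetical id)
    stages: list = field(default_factory=list)

def continued_pretrain(ckpt: Checkpoint, trillion_tokens: int) -> Checkpoint:
    # Continued pretraining on a 1:1 mix of domain and replay data.
    ckpt.stages.append(f"continued-pretrain:{trillion_tokens}T-tokens")
    return ckpt

def finetune(ckpt: Checkpoint) -> Checkpoint:
    ckpt.stages.append("finetune")
    return ckpt

def rlhf(ckpt: Checkpoint) -> Checkpoint:
    ckpt.stages.append("rlhf")
    return ckpt

# Open base -> continued pretrain -> fine-tune -> RLHF -> deployable model.
model = rlhf(finetune(continued_pretrain(Checkpoint("llama-3.1-70b"), 1)))
```

The ordering matters: domain knowledge is injected during continued pretraining, while fine-tuning and RLHF shape instruction-following and preferences on top of it.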
Related¶
- systems/e-llama — eBay's continued-pretrained derivative.
- concepts/continued-pretraining — the technique by which Llama 3.1 becomes e-Llama.
- concepts/catastrophic-forgetting — the failure mode mitigated by replay data when continued-pretraining from a base like Llama 3.1.
- patterns/continued-pretraining-for-domain-adaptation — the end-to-end recipe.