Qwen¶
Qwen is the open-weight LLM family from Alibaba Cloud, first released in 2023, with subsequent Qwen 1.5 / Qwen 2 / Qwen 2.5 / Qwen 3 iterations spanning model sizes from 0.5B to 72B+ parameters. Variants include base, instruction-tuned, long-context, multi-modal (Qwen-VL), and code-specialised (Qwen-Coder) releases.
Qwen is one of the canonical open-weight base-model families for fine-tuning production LLMs via LoRA adapters or full-parameter SFT; together with the Llama family it is one of the two most commonly named options in Western production retrospectives.
Why it shows up in production retrospectives¶
- Fine-tune-friendly. Base + instruct variants cover both supervised fine-tuning and instruction-following deployment shapes.
- Range of sizes. Qwen ships from 0.5B (edge / single-GPU serving) up to 72B+ (premium quality tier), letting production teams pick the sweet spot for their latency and cost budget.
- Strong English, Chinese, and multilingual performance. Particularly strong at zero-shot Chinese; a common production choice for companies serving CJK markets.
- LoRA / QLoRA ecosystem. Broad support across the open-source LoRA toolchain (peft / axolotl / unsloth).
- Open-source license. Most Qwen releases ship under permissive terms that allow commercial deployment, whereas Llama's community license historically restricted deployment at very large scale (terms eased somewhat with Llama 3.1 / 3.2).
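The LoRA mechanism named above trains a low-rank update instead of the full weight matrix. A minimal sketch of the arithmetic in plain numpy (dimensions, rank, and scaling are illustrative, not any specific Qwen layer):

```python
import numpy as np

# LoRA replaces a full weight update dW with a low-rank product B @ A,
# so only r * (d_in + d_out) parameters are trained instead of d_in * d_out.
rng = np.random.default_rng(0)

d_in, d_out, r, alpha = 64, 64, 8, 16   # rank r and scaling alpha are the usual knobs

W = rng.normal(size=(d_out, d_in))      # frozen base weight (e.g. an attention projection)
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-init so training starts at W

def forward(x):
    # Adapted forward pass: frozen base path plus scaled low-rank path.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B zero-initialised, the adapter is a no-op before any training step.
assert np.allclose(forward(x), W @ x)

print(r * (d_in + d_out), "adapter params vs", d_in * d_out, "full")
# prints: 1024 adapter params vs 4096 full
```

The parameter-count asymmetry is why the peft / axolotl / unsloth toolchain makes Qwen-scale models fine-tunable on a single GPU: only the A and B matrices (plus optimizer state for them) are trained and stored per adapter.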
Seen in¶
- sources/2026-02-26-instacart-our-early-journey-to-transform-discovery-recommendations-with-llms — Instacart's generative recommendations platform: Phase-2 retrieval-keyword generation runs a teacher-student fine-tune in which the student base model was selected via ablations across the Llama and Qwen families. The winning Qwen version, LoRA rank, and data-augmentation recipe are not disclosed. First wiki disclosure of Qwen as a production base-model choice at Instacart.
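The teacher-student shape referenced above can be sketched as a two-stage pipeline: a large teacher labels raw inputs, and the labelled pairs become the SFT corpus for a small student base model. Everything below is illustrative (the keyword heuristic, prompts, and record format are stand-ins, not Instacart's actual system):

```python
# Stage 1: an expensive teacher produces target outputs for raw inputs.
# Stage 2: (teacher input, teacher output) pairs become the student's SFT data.

def teacher_generate_keywords(query: str) -> list[str]:
    # Stand-in for a teacher LLM call that expands a user context
    # into retrieval keywords. (Toy heuristic: keep longer words.)
    return [w for w in query.lower().split() if len(w) > 4]

def build_sft_dataset(queries: list[str]) -> list[dict]:
    # Each record is a (prompt, completion) pair in plain SFT format,
    # ready for LoRA fine-tuning of a small student model.
    return [
        {"prompt": f"Generate retrieval keywords for: {q}",
         "completion": ", ".join(teacher_generate_keywords(q))}
        for q in queries
    ]

dataset = build_sft_dataset([
    "weeknight dinner ideas with chicken",
    "gluten free breakfast options",
])
for record in dataset:
    print(record["prompt"], "->", record["completion"])
```

The design point of the pattern is that teacher quality is paid once, offline, during dataset construction; at serving time only the compact fine-tuned student runs.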
Caveats¶
- Instacart's post names Qwen at the family level, not a specific Qwen version or size.
- No production accuracy, latency, or cost numbers are disclosed for the Qwen variant.
- Relationship to the sibling Intent Engine (which used Llama-3-8B) is not discussed; that is a different task with a different base-model ablation set.
Related¶
- systems/llama-3-1 — sibling open-weight family, jointly ablated at Instacart.
- systems/llama-3 — used by sibling Instacart system (Intent Engine) as its fine-tuned student base.
- systems/instacart-generative-recommendations-platform — canonical consumer.
- concepts/lora-low-rank-adaptation — the fine-tuning mechanism named in the ablation.
- patterns/teacher-student-model-compression — the shape the Qwen student realises.
- companies/instacart