Qwen¶
Qwen is the open-weight LLM family from Alibaba Cloud, first released in 2023, with subsequent Qwen 1.5 / Qwen 2 / Qwen 2.5 / Qwen 3 iterations spanning model sizes from 0.5B to 72B+ parameters. Variants include base, instruction-tuned, long-context, multi-modal (Qwen-VL), and code-specialised (Qwen-Coder) releases.
Qwen is one of the canonical open-weight base-model families for fine-tuning production LLMs via LoRA adapters or full-parameter SFT; together with the Llama family it is one of the two most commonly named options in Western production retrospectives.
Why it shows up in production retrospectives¶
- Fine-tune-friendly. Base + instruct variants cover both supervised fine-tuning and instruction-following deployment shapes.
- Range of sizes. Qwen ships from 0.5B (edge / single-GPU serving) up to 72B+ (premium quality tier), letting production teams pick the sweet spot for their latency and cost budget.
- Strong English, Chinese, and multilingual performance. Particularly strong at zero-shot Chinese; a common production choice for companies serving CJK markets.
- LoRA / QLoRA ecosystem. Broad support across the open-source LoRA toolchain (peft / axolotl / unsloth).
- Open-source license. Most Qwen releases ship under permissive terms that allow commercial deployment, whereas Llama's community license historically restricted deployment at very large scale (terms eased somewhat with Llama 3.1 / 3.2).
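The LoRA mechanism named above trains a low-rank update instead of the full weight matrix. A minimal sketch of the arithmetic in plain numpy (dimensions, rank, and scaling are illustrative, not any specific Qwen layer):

```python
import numpy as np

# LoRA replaces a full weight update dW with a low-rank product B @ A,
# so only r * (d_in + d_out) parameters are trained instead of d_in * d_out.
rng = np.random.default_rng(0)

d_in, d_out, r, alpha = 64, 64, 8, 16   # rank r and scaling alpha are the usual knobs

W = rng.normal(size=(d_out, d_in))      # frozen base weight (e.g. an attention projection)
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-init so training starts at W

def forward(x):
    # Adapted forward pass: frozen base path plus scaled low-rank path.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B zero-initialised, the adapter is a no-op before any training step.
assert np.allclose(forward(x), W @ x)

print(r * (d_in + d_out), "adapter params vs", d_in * d_out, "full")
# prints: 1024 adapter params vs 4096 full
```

The parameter-count asymmetry is why the peft / axolotl / unsloth toolchain makes Qwen-scale models fine-tunable on a single GPU: only the A and B matrices (plus optimizer state for them) are trained and stored per adapter.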
Seen in¶
- sources/2026-02-26-instacart-our-early-journey-to-transform-discovery-recommendations-with-llms — Instacart's generative recommendations platform: Phase-2 retrieval-keyword generation runs a teacher-student fine-tune in which the student base model was selected via ablations across the Llama and Qwen families. The winning Qwen version, LoRA rank, and data-augmentation recipe are not disclosed. First wiki disclosure of Qwen as a production base-model choice at Instacart.
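The teacher-student shape referenced above can be sketched as a two-stage pipeline: a large teacher labels raw inputs, and the labelled pairs become the SFT corpus for a small student base model. Everything below is illustrative (the keyword heuristic, prompts, and record format are stand-ins, not Instacart's actual system):

```python
# Stage 1: an expensive teacher produces target outputs for raw inputs.
# Stage 2: (teacher input, teacher output) pairs become the student's SFT data.

def teacher_generate_keywords(query: str) -> list[str]:
    # Stand-in for a teacher LLM call that expands a user context
    # into retrieval keywords. (Toy heuristic: keep longer words.)
    return [w for w in query.lower().split() if len(w) > 4]

def build_sft_dataset(queries: list[str]) -> list[dict]:
    # Each record is a (prompt, completion) pair in plain SFT format,
    # ready for LoRA fine-tuning of a small student model.
    return [
        {"prompt": f"Generate retrieval keywords for: {q}",
         "completion": ", ".join(teacher_generate_keywords(q))}
        for q in queries
    ]

dataset = build_sft_dataset([
    "weeknight dinner ideas with chicken",
    "gluten free breakfast options",
])
for record in dataset:
    print(record["prompt"], "->", record["completion"])
```

The design point of the pattern is that teacher quality is paid once, offline, during dataset construction; at serving time only the compact fine-tuned student runs.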
Caveats¶
- Instacart's post names Qwen at the family level, not a specific Qwen version or size.
- No production accuracy, latency, or cost numbers are disclosed for the Qwen variant.
- Relationship to the sibling Intent Engine (which used Llama-3-8B) is not discussed; that is a different task with a different base-model ablation set.
Related¶
- systems/llama-3-1 — sibling open-weight family, jointly ablated at Instacart.
- systems/llama-3 — used by sibling Instacart system (Intent Engine) as its fine-tuned student base.
- systems/instacart-generative-recommendations-platform — canonical consumer.
- concepts/lora-low-rank-adaptation — the fine-tuning mechanism named in the ablation.
- patterns/teacher-student-model-compression — the shape the Qwen student realises.
- companies/instacart