GPT-4o¶
Definition¶
GPT-4o ("omni") is OpenAI's multi-modal flagship model announced 2024-05-13. Natively accepts text and image inputs (with audio support added later), produces text output. Positioned as a latency + cost + quality improvement over GPT-4 Turbo for general-purpose multi-modal workloads, and as the teacher model for its smaller fine-tunable sibling GPT-4o-mini.
Wiki anchor¶
The wiki's canonical anchor for GPT-4o is its role as the production VLM backend behind a catalog-attribute copilot, documented in the 2024-09-17 Zalando post (sources/2024-09-17-zalando-content-creation-copilot-ai-assisted-product-onboarding).
Zalando's Content Creation Copilot (systems/zalando-content-creation-copilot) launched on GPT-4 Turbo and migrated to GPT-4o during development. The swap was reported as a net improvement across three axes simultaneously: "The new model not only provided better results but also delivered faster response times and proved to be more cost-effective." Because the copilot was designed as an aggregator with stable contracts on either side, the swap did not require changes to the Content Creation Tool or Article Masterdata.
The post also names GPT-4o's empirical weakness: fine-grained fashion vocabulary. "GPT-4o model tends to suggest general attributes like 'V-necks' or 'round necks' for 'necklines' correctly, but can be less precise when it comes to more fashion-specific ones, like 'deep scoop necks'." This is characteristic of general-purpose VLMs on long-tail domain-specific vocabulary.
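The stable-contract property that made the Turbo → 4o swap a non-event can be sketched as a minimal backend interface. All names below are illustrative, not Zalando's actual code:

```python
from typing import Protocol


class SuggestionBackend(Protocol):
    """Anything that can propose attribute values for an article."""

    def suggest(self, article: dict) -> dict[str, str]: ...


class Gpt4TurboBackend:
    def suggest(self, article: dict) -> dict[str, str]:
        # Stub standing in for a real GPT-4 Turbo call.
        return {"neckline": "round neck"}


class Gpt4oBackend:
    def suggest(self, article: dict) -> dict[str, str]:
        # Stub standing in for a real GPT-4o call.
        return {"neckline": "V-neck"}


class CopilotAggregator:
    """The Content Creation Tool and Article Masterdata only ever see
    this contract; the model behind it is a constructor argument."""

    def __init__(self, backend: SuggestionBackend):
        self.backend = backend

    def suggestions_for(self, article: dict) -> dict[str, str]:
        return self.backend.suggest(article)


# Swapping models is a one-line change with no downstream impact.
copilot = CopilotAggregator(Gpt4oBackend())
```

This is the patterns/model-agnostic-suggestion-aggregator idea in miniature: because both sides of the aggregator are stable, the model is an implementation detail.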
Tradeoffs¶
- Multi-modal inputs cost more per call than text-only. See concepts/multi-modal-attribute-extraction — image tokens are more expensive than text tokens, so the multi-modal path is reserved for inputs where the signal is actually in the image.
- Long-tail domain vocabulary underperforms. For fashion-specific terminology (specific neckline variants, niche assortment classes), accuracy drops. Zalando's response was not to fine-tune GPT-4o but to plan for complementary backends (brand data dumps, fine-tuned models, partner contributions) behind the same copilot contract.
- Balanced vs. unbalanced eval sets give different headline numbers. Zalando explicitly notes that the fine-grained weakness is more visible on balanced eval sets than on the real (unbalanced) production distribution — a trap when comparing model quality across benchmarks.
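The first tradeoff above implies a routing decision per attribute. A minimal sketch, assuming a hand-maintained split of which attributes actually need the image (the split itself is invented for illustration):

```python
# Attributes whose signal typically lives in the product image vs. in
# the textual description. This partition is a hypothetical example,
# not Zalando's actual taxonomy.
VISUAL_ATTRIBUTES = {"neckline", "sleeve_length", "pattern"}
TEXTUAL_ATTRIBUTES = {"material", "care_instructions", "brand"}


def route(attribute: str) -> str:
    """Pick the extraction path for one attribute.

    Image tokens cost more than text tokens, so the multi-modal path
    is reserved for attributes where the signal is in the image."""
    if attribute in VISUAL_ATTRIBUTES:
        return "multimodal"  # image + text tokens billed
    return "text-only"       # cheaper default path
```

Routing at the attribute level rather than the article level keeps the expensive calls proportional to the genuinely visual part of the schema.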
Seen in¶
- sources/2024-09-17-zalando-content-creation-copilot-ai-assisted-product-onboarding — canonical wiki instance; backend for Zalando's Content Creation Copilot after migration from GPT-4 Turbo. Empirically lower cost and latency than Turbo at equivalent-or-better accuracy on catalog-attribute extraction.
- sources/2025-02-19-zalando-llm-powered-migration-of-ui-component-libraries — bulk-code-migration instance. GPT-4o is the transformation backend for Zalando's Component Migration Toolkit (September 2024 onward). Used at temperature=0 for reproducibility, with a static/dynamic prompt partition for cache-hit maximisation. Reported ~90% accuracy on UI-library migration across 15 B2B applications; ~$40 per repository under 2024 pricing; 30–200s per file. Two disclosed failure modes: 4K output-token limit (recovered via "continue" prompt) and "moody" residual run-to-run variance (temperature=0 reduces but does not eliminate).
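The 4K output-token failure mode in the migration instance is recoverable with a simple accumulation loop. A sketch with a mocked model call standing in for the real chat-completion API (function names and the truncation behaviour are illustrative):

```python
def mock_model(prompt: str, continuation: bool = False) -> tuple[str, str]:
    """Stand-in for a chat-completion call at temperature=0.

    Returns (text, finish_reason); finish_reason == "length" signals
    that the 4K output-token cap was hit mid-file."""
    if not continuation:
        return "// first half of migrated file\n", "length"
    return "// second half of migrated file\n", "stop"


def migrate_file(source: str, max_rounds: int = 5) -> str:
    """Keep prompting 'continue' and concatenating output until the
    model reports a clean stop (or a round limit is reached)."""
    text, reason = mock_model(source)
    parts = [text]
    rounds = 0
    while reason == "length" and rounds < max_rounds:
        text, reason = mock_model("continue", continuation=True)
        parts.append(text)
        rounds += 1
    return "".join(parts)
```

The `max_rounds` guard matters in practice: the residual "moody" run-to-run variance the post describes means a continuation can occasionally fail to converge, and the loop should bail out rather than spend tokens indefinitely.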
Related¶
- systems/gpt-4 — predecessor (and Turbo variant that preceded GPT-4o at Zalando)
- systems/gpt-4o-mini — smaller, fine-tunable sibling
- concepts/multi-modal-attribute-extraction — the concept the Zalando use case instantiates
- concepts/llm-cascade — the cost-routing pattern GPT-4o often sits at the top or middle of
- patterns/model-agnostic-suggestion-aggregator — the pattern that let Zalando swap GPT-4 Turbo → GPT-4o with no downstream contract changes