PATTERN
Model-agnostic suggestion aggregator¶
Intent¶
Sit an aggregator service in front of one or more AI / ML / data backends so that the consumer (UI, downstream service) is bound to the aggregator contract, not to any specific backend. New backends can be added, swapped, or cascaded without changing the consumer code. The aggregator collapses heterogeneous suggestion sources (frontier LLMs, fine-tuned models, partner data dumps, rule-based systems, human-curated caches) into a uniform "here is a suggested value; here is its provenance marker" contract.
The pattern is structurally the façade + adapter pair applied to AI backends, with the explicit product commitment that the set of backends is expected to change and the UI must not leak which backend produced which suggestion.
When to use¶
- The UI / downstream consumer should show AI suggestions but must not be coupled to a single model provider — either because you expect model choice to evolve faster than UI (typical in a fast-moving LLM space) or because you expect multiple complementary backends long-term.
- You anticipate mixed backends: a general LLM for wide-coverage attributes, a fine-tuned model for specific hard attributes, partner data dumps for ground-truth attributes, caches for repeated queries.
- You want cost / latency / accuracy tradeoffs to be a backend-layer concern, not a UI concern — cascade routing happens behind the aggregator.
Mechanism¶
┌──────────────────────────────────────────┐
│ UI / downstream consumer │
│ (sees: value + AI-provenance marker) │
└────────────────┬─────────────────────────┘
│ stable aggregator contract
▼
┌──────────────────────────────────────────┐
│ Aggregator service │
│ - receives request (product, attribute) │
│ - picks / cascades backend(s) │
│ - normalises response │
│ - returns with uniform provenance marker│
└─┬──────────┬──────────┬──────────┬───────┘
│ │ │ │
▼ ▼ ▼ ▼
GPT-4o Fine-tuned Brand Partner
(VLM) classifier data catalog
dump feed
Key design properties:
- Stable upstream contract. The consumer-facing API is uniform: (entity, attribute) → (suggested_value, provenance_marker). Changes to the backend set are invisible above this line.
- Uniform provenance marker. Regardless of which backend produced a suggestion, the UI-facing indicator (the purple dot, a banner, etc.) is the same binary: AI vs. human. The consumer does not know whether it was GPT-4o or brand data. Backend identity, if needed for audit, is a separate internal field.
- Backend selection is internal. Cascade logic, retry-on-low-confidence, partner-data-first-fall-back-to-LLM — all live inside the aggregator. The consumer is not asked to reason about which backend to hit.
- Hot-swap supported. Swapping a backend in place (e.g. GPT-4 Turbo → GPT-4o) must not change the consumer. This is the test of the pattern: if the swap leaks upward, the aggregator contract has been broken.
- Additive backends supported. Adding a new backend is a config change, not a consumer change.
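The properties above can be sketched as a minimal aggregator. All names here are illustrative assumptions, not Zalando's actual API (which the post does not disclose); the point is the shape of the contract: consumers bind only to `suggest()`, while registering or swapping a backend is a config-level change.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A backend is any callable mapping (entity, attribute) to a raw value, or None.
Backend = Callable[[str, str], Optional[str]]


@dataclass(frozen=True)
class Suggestion:
    """Uniform upstream contract: a value plus a binary provenance marker."""
    suggested_value: str
    provenance_marker: str = "ai"   # the UI sees only "ai" vs. "human"
    backend_id: str = ""            # internal audit field, never shown in the UI


class Aggregator:
    """Consumers bind to suggest(); the backend set is an internal concern."""

    def __init__(self) -> None:
        self._backends: dict[str, Backend] = {}

    def register(self, backend_id: str, backend: Backend) -> None:
        # Adding or hot-swapping a backend is a registration change,
        # invisible to any consumer of suggest().
        self._backends[backend_id] = backend

    def suggest(self, entity: str, attribute: str) -> Optional[Suggestion]:
        for backend_id, backend in self._backends.items():
            value = backend(entity, attribute)
            if value is not None:
                # Normalise: uniform marker regardless of backend identity.
                return Suggestion(value, "ai", backend_id)
        return None


# Usage: swapping one LLM backend for another changes only the register() call.
agg = Aggregator()
agg.register("brand-data", lambda e, a: {"sku-1:material": "cotton"}.get(f"{e}:{a}"))
agg.register("llm", lambda e, a: "round-neck" if a == "neckline" else None)

print(agg.suggest("sku-1", "material"))  # served by brand-data
print(agg.suggest("sku-1", "neckline"))  # falls through to the LLM backend
```

The hot-swap test from the pattern falls out directly: replacing the `"llm"` registration with a different model leaves every `suggest()` call site untouched.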
Why this is the right frame for copilots¶
Copilots (IDE code-suggestion, email smart-compose, catalog copywriting assistants) share three properties:
- Model choice evolves quickly. New frontier models ship every few months; the right model today is not the right model next quarter.
- Hybrid backends are natural. Some signals are better from LLMs (general knowledge, reasoning); some from classical ML (narrow-task classifiers); some from data (partner feeds, internal ground truth). A coherent copilot stitches all three.
- UI is expensive to change; backends are cheap. UI changes trigger retraining of human users; backend changes don't.
The aggregator is the abstraction that absorbs model churn on the engineering side without forcing UI or user retraining.
Trade-offs / gotchas¶
- Aggregator becomes the contract cliff. The aggregator API itself is now hard to change — every backend and every consumer is bound to it. Versioning strategy (breaking-change SLA, backwards-compatible evolution) is a first-class concern.
- Backend heterogeneity can leak via quality. Even if the contract is uniform, different backends have different accuracy, latency, and failure-mode profiles. Consumers may notice "accuracy got worse Thursday" and the aggregator should expose per-backend health metrics to internal operators even if those metrics don't leak to the UI.
- Uniform provenance marker hides useful detail. Power users (QA reviewers, data scientists) may want to know which backend produced a suggestion; the pattern's "uniform marker to consumer" discipline needs a separate internal-audit channel for that.
- Cascade logic is the hard part. Deciding "if the cheap backend returns low confidence, fall through to the expensive one" requires a confidence primitive — which not all backends provide uniformly. Zalando's post doesn't disclose a confidence primitive, so its aggregator today likely just calls one backend per request; true cascade requires more.
- Debugging is aggregator-first. "The UI shows a wrong suggestion" now requires tracing through the aggregator's backend choice, backend health, normalisation logic — more indirection than a direct-call pipeline.
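The cascade gotcha above can be made concrete with a confidence-gated fallback sketch. This is a hypothetical shape, not anything disclosed in Zalando's post — it assumes every backend can report a confidence score, which, as noted, many cannot.

```python
from typing import Callable, NamedTuple, Optional


class Scored(NamedTuple):
    value: str
    confidence: float  # assumed confidence primitive; many backends lack one


def cascade(
    backends: list[Callable[[str, str], Optional[Scored]]],
    entity: str,
    attribute: str,
    threshold: float = 0.8,
) -> Optional[Scored]:
    """Try backends in order (cheapest first); fall through while confidence
    is below threshold. Keeps the best low-confidence answer so the cascade
    never returns worse than its best attempt."""
    best: Optional[Scored] = None
    for backend in backends:
        result = backend(entity, attribute)
        if result is None:
            continue
        if result.confidence >= threshold:
            return result  # good enough: stop paying for further backends
        if best is None or result.confidence > best.confidence:
            best = result
    return best


# Usage with stub backends: the cheap one is unsure, the expensive one is not.
cheap = lambda e, a: Scored("crew-neck", 0.55)
expensive = lambda e, a: Scored("round-neck", 0.93)
print(cascade([cheap, expensive], "sku-1", "neckline"))  # expensive backend wins
```

Without a uniform confidence primitive across backends, the `threshold` comparison is meaningless, which is exactly why per-request single-backend routing is the likely current state.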
Related patterns¶
- patterns/llm-attribute-extraction-platform — Instacart's PARSE is a more platformised version of the same stance, with self-serve UI for prompt + backend choice baked in. Zalando's copilot is the thinner, single-team instance of the pattern; PARSE is the platform-level instance.
- patterns/pre-select-ai-suggestions-with-visual-disclosure — the UI-side pattern that requires the aggregator's uniform provenance marker to work.
- concepts/llm-cascade — the cost/quality-routing logic inside the aggregator.
- concepts/ai-provenance-ui-indicator — the uniform marker that the aggregator commits to populating correctly regardless of backend.
Seen in¶
- sources/2024-09-17-zalando-content-creation-copilot-ai-assisted-product-onboarding — canonical wiki instance. Zalando's Content Creation Copilot is explicitly framed as an aggregator: "we created an aggregator service - to integrate multiple AI services, leveraging a wider variety of data sources, such as brand data dumps, partner contributions, and images, to improve the accuracy and completeness of the results." Validated once by the GPT-4 Turbo → GPT-4o migration during development — a backend swap that did not require changes to the Content Creation Tool.
Related¶
- systems/zalando-content-creation-copilot — canonical production instance
- systems/zalando-prompt-generator — the component where backend selection / fallback lives
- systems/zalando-content-creation-tool — consumer bound to the stable aggregator contract
- systems/gpt-4 / systems/gpt-4o — backend instances in Zalando's case
- concepts/ai-provenance-ui-indicator
- concepts/opaque-attribute-code-translation-layer
- patterns/llm-attribute-extraction-platform
- patterns/pre-select-ai-suggestions-with-visual-disclosure