CONCEPT Cited by 1 source
Category-attribute relevance mapping¶
Definition¶
A category-attribute relevance mapping is an internal business-rule layer that encodes which product attributes should be shown (or extracted) for which product categories. It is a scope filter applied before any extraction logic runs: irrelevant attributes for this category are removed from consideration entirely, rather than being extracted and then suppressed downstream.
For an LLM-based attribute extraction pipeline, this filter runs between the attribute schema (all attributes this article type could have) and the prompt (only the attributes this article type should have, given its category). The output is a reduced attribute set that feeds the prompt generator.
Why this is quality-through-scope-reduction, not a simple validation step¶
Two distinct effects:
- Guideline enforcement. Some attributes are prohibited for certain article types per internal style guides; e.g. sleeve_length for a pair of shoes makes no sense and should never appear. This is a schema-correctness concern.
- Model-accuracy salvage. Even for attributes that are technically permitted for a category, the LLM's accuracy on that (attribute × category) pair may be empirically poor. Rather than invest in model sophistication or fine-tuning, the mapping removes the attribute from the prompt for that category. This is a quality-through-scope-reduction lever: you can't have a bad prediction for a field you never ask about.
The second effect is the architecturally interesting one: it treats the category-relevance map as a quality knob, not just a schema correctness layer.
Mechanism¶
Inputs:
- article_type (from the article's catalog record).
- attribute_set (all attributes defined for this article type in the Article Masterdata).
Rules table (example shape):
| article_type | attribute | status |
|---|---|---|
| shoes | sleeve_length | excluded (not applicable) |
| shoes | heel_height | included |
| dress | neckline | included |
| dress | fabric_composition | excluded (low model accuracy) |
| ... | ... | ... |
Output:
- Reduced attribute_set' passed to the prompt generator.
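The inputs, rules table, and output above amount to a lookup keyed on (article_type, attribute). A minimal Python sketch follows. The names `Status`, `RULES`, and `filter_attributes` are hypothetical and do not come from the source. The sketch also assumes that a pair with no explicit rule is included by default, which the source does not specify.

```python
from enum import Enum


class Status(Enum):
    INCLUDED = "included"
    EXCLUDED_NOT_APPLICABLE = "excluded (not applicable)"
    EXCLUDED_LOW_ACCURACY = "excluded (low model accuracy)"


# Hypothetical rules table mirroring the example rows above.
RULES = {
    ("shoes", "sleeve_length"): Status.EXCLUDED_NOT_APPLICABLE,
    ("shoes", "heel_height"): Status.INCLUDED,
    ("dress", "neckline"): Status.INCLUDED,
    ("dress", "fabric_composition"): Status.EXCLUDED_LOW_ACCURACY,
}


def filter_attributes(article_type, attribute_set, rules=RULES):
    """Return the reduced attribute list, preserving input order.

    Assumption: a (article_type, attribute) pair with no rule is included.
    """
    return [
        attr
        for attr in attribute_set
        if rules.get((article_type, attr), Status.INCLUDED) is Status.INCLUDED
    ]


reduced = filter_attributes("dress", ["neckline", "fabric_composition", "color"])
print(reduced)  # ['neckline', 'color']
```

Keeping the two exclusion reasons as separate statuses is a design choice in this sketch: it lets later tooling treat guideline exclusions as permanent while treating accuracy exclusions as revisitable.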
The mapping is authored by domain experts (content-creation guidelines owners) and evolves based on production accuracy data — if a given (category, attribute) pair consistently underperforms, it gets demoted to excluded; if a new attribute is onboarded and performs well on a category, it gets promoted.
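The promotion and demotion loop can be sketched as a function that proposes changes for the domain experts to review. Everything here is an assumption for illustration: the function name `propose_status_changes`, the status strings, the threshold values, and the idea that accuracy for currently excluded pairs is available from an offline evaluation. The source states neither thresholds nor how accuracy is measured.

```python
def propose_status_changes(current, accuracy, demote_below=0.80, promote_above=0.90):
    """Suggest map changes from observed accuracy per (category, attribute).

    current:  dict of (category, attribute) -> "included",
              "excluded_low_accuracy", or "excluded_not_applicable"
    accuracy: dict of (category, attribute) -> accuracy in [0, 1]
    The thresholds are illustrative; the gap between them keeps a pair
    from flipping back and forth on small accuracy changes.
    """
    proposals = {}
    for pair, acc in accuracy.items():
        status = current.get(pair)
        if status == "included" and acc < demote_below:
            proposals[pair] = "excluded_low_accuracy"
        elif status == "excluded_low_accuracy" and acc >= promote_above:
            proposals[pair] = "included"
        # Guideline exclusions ("excluded_not_applicable") are never promoted.
    return proposals


current = {
    ("dress", "neckline"): "included",
    ("dress", "fabric_composition"): "excluded_low_accuracy",
    ("shoes", "sleeve_length"): "excluded_not_applicable",
}
accuracy = {
    ("dress", "neckline"): 0.72,
    ("dress", "fabric_composition"): 0.93,
    ("shoes", "sleeve_length"): 0.99,
}
proposals = propose_status_changes(current, accuracy)
```

In this example the sketch proposes demoting (dress, neckline) and promoting (dress, fabric_composition), and leaves the guideline exclusion for (shoes, sleeve_length) untouched regardless of its measured accuracy.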
Tradeoffs¶
- Coverage loss. Attributes excluded for accuracy reasons are not filled at all (by the copilot), so manual enrichment is still needed for those (category, attribute) cells. Scope reduction buys quality by sacrificing coverage — acceptable because the humans are still in the loop anyway, but not a free win.
- Map staleness. Rules encoded once can rot as the model improves; a (category, attribute) pair that was low-accuracy on GPT-4 Turbo may be high-accuracy on GPT-4o. Revisiting the map post-backend-swap is necessary.
- Cell-level granularity vs. category-level. Some attribute × category combinations are mixed — high accuracy on one sub-category, low on another. The map may need hierarchical categories or feature-level conditions to stay accurate without over-excluding.
- Discoverability. Copywriters who see no AI suggestion for an attribute they know exists in the schema may assume a bug. The purple-dot indicator is absent for excluded attributes, which is correct behavior but needs to be documented.
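One possible shape for the hierarchical categories mentioned under the granularity tradeoff is a lookup where the most specific rule along the category path wins. This is a sketch only: the function `resolve_status`, the sub-category names, and the include-by-default fallback are all hypothetical and not described in the source.

```python
def resolve_status(category_path, attribute, rules, default="included"):
    """Return the status from the most specific matching rule.

    category_path: tuple ordered from general to specific,
                   e.g. ("dress", "evening").
    rules: dict of (category_path_prefix, attribute) -> status string.
    """
    # Try the full path first, then progressively shorter prefixes.
    for depth in range(len(category_path), 0, -1):
        key = (category_path[:depth], attribute)
        if key in rules:
            return rules[key]
    return default


# Hypothetical rules: excluded for dresses in general,
# but re-included for one sub-category.
rules = {
    (("dress",), "fabric_composition"): "excluded",
    (("dress", "evening"), "fabric_composition"): "included",
}

evening = resolve_status(("dress", "evening"), "fabric_composition", rules)
casual = resolve_status(("dress", "casual"), "fabric_composition", rules)
print(evening, casual)  # included excluded
```

A sub-category override like this avoids excluding an attribute for a whole category when only part of it performs poorly, at the cost of a larger map to maintain.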
Relation to prompt batching¶
The patterns/multi-attribute-multi-product-prompt-batching pattern (send N attributes in one prompt) interacts with this filter: the post-filter attribute list is what gets batched, so a well-curated relevance map reduces per-call token cost roughly in proportion to the number of attributes removed. Fixed prompt overhead, such as the instructions, is unaffected.
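The effect on token cost can be illustrated with a simple additive model: a fixed prompt overhead plus a per-attribute cost for each attribute that survives the filter. The function name and all token counts below are made up for illustration and are not measurements from the source.

```python
def estimated_prompt_tokens(attributes, tokens_per_attribute, base_tokens):
    """Fixed overhead plus a per-attribute cost for each batched attribute."""
    return base_tokens + sum(tokens_per_attribute[attr] for attr in attributes)


# Made-up token counts, for illustration only.
TOKENS_PER_ATTRIBUTE = {"neckline": 60, "fabric_composition": 90, "color": 50}
BASE_TOKENS = 400

full = estimated_prompt_tokens(
    ["neckline", "fabric_composition", "color"], TOKENS_PER_ATTRIBUTE, BASE_TOKENS
)
filtered = estimated_prompt_tokens(
    ["neckline", "color"], TOKENS_PER_ATTRIBUTE, BASE_TOKENS
)
print(full - filtered)  # 90
```

Under this model the saving equals the cost of the removed attributes only; the base overhead is paid on every call either way.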
Seen in¶
- sources/2024-09-17-zalando-content-creation-copilot-ai-assisted-product-onboarding — canonical wiki instance. Zalando's Content Creation Copilot uses a category-attribute relevance mapping both for guideline enforcement and as an explicit accuracy-salvage mechanism. Post disclosure: "some attributes shouldn't be filled for certain types of articles according to the internal guidelines, and the accuracy of predicted suggestions for these attributes was often poor. To address this, we introduced a mapping layer between product categories and the relevant information that should be shown to the customer."
Related¶
- systems/zalando-prompt-generator — where the map is applied in Zalando's architecture
- systems/zalando-article-masterdata — schema source the map filters over
- systems/zalando-content-creation-copilot
- concepts/opaque-attribute-code-translation-layer — sibling filter (different concern: vocabulary, not scope)
- concepts/input-image-selection-tradeoff — sibling filter (different axis: image set, not attribute set)
- patterns/llm-attribute-extraction-platform — platform pattern this concept lives inside