CONCEPT Cited by 1 source
Neutral vs non-neutral parameter¶
Definition¶
In the context of URL normalisation, a query parameter is:
- Neutral: removing it does not change the content of the page. Safe to strip. Examples (on typical product pages): utm_source, utm_medium, session, sid, ref, click_id, tracking.
- Non-neutral: removing it does change the content of the page. Must be preserved during normalisation. Examples: id (which product), color (which variant), page (pagination), sort (ordering).
The classification is made for each (domain, query-parameter-pattern, parameter)
triple, not for a parameter name globally. See concepts/query-parameter-pattern
for the canonical Pinterest example (ref on a product page versus a compare page)
that illustrates why.
Operational definition¶
Pinterest's MIQPS operationalises the classification as an empirical removal-test (Source: sources/2026-04-20-pinterest-smarter-url-normalization-at-scale-how-miqps-powers-content-deduplication):
"If removing a query parameter changes the content of a page, that parameter is important; if it doesn't, the parameter is noise and can be safely stripped."
Test procedure:
- Sample up to S URLs with distinct values for the parameter under test.
- For each sample, compute the content ID of the original URL and of the URL with the parameter removed.
- If the content IDs differ in ≥T% of samples, classify as non-neutral; else neutral.
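The test procedure above can be sketched in Python. This is a minimal illustration, not Pinterest's implementation: the function names, the `content_id` callable (standing in for whatever rendering and fingerprinting step produces a content ID), and the default values for S (`max_samples`), T (`threshold_pct`) and the minimum sample count (`min_samples`) are all assumptions, since the source does not state them.

```python
from typing import Callable, Iterable, List
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_param(url: str, param: str) -> str:
    """Return the URL with every occurrence of `param` removed from the query."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k != param]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def sample_distinct(urls: Iterable[str], param: str, max_samples: int) -> List[str]:
    """Pick up to `max_samples` URLs whose value for `param` has not been seen yet."""
    seen = set()
    samples = []
    for url in urls:
        values = [v for k, v in parse_qsl(urlsplit(url).query, keep_blank_values=True)
                  if k == param]
        if not values or values[0] in seen:
            continue
        seen.add(values[0])
        samples.append(url)
        if len(samples) >= max_samples:
            break
    return samples


def classify_param(
    urls: Iterable[str],
    param: str,
    content_id: Callable[[str], str],  # hypothetical: URL -> content fingerprint
    max_samples: int = 20,             # "S" in the text; value is an assumption
    threshold_pct: float = 50.0,       # "T" in the text; value is an assumption
    min_samples: int = 3,              # assumed minimum before trusting the result
) -> str:
    samples = sample_distinct(urls, param, max_samples)
    if len(samples) < min_samples:
        # Too little evidence: fall back to the safer label.
        return "non-neutral"
    changed = sum(
        1 for url in samples
        if content_id(url) != content_id(strip_param(url, param))
    )
    pct_changed = 100.0 * changed / len(samples)
    return "non-neutral" if pct_changed >= threshold_pct else "neutral"
```

In practice `content_id` would wrap a page render plus a content fingerprint. In the sketch it is passed in as a callable so the classification logic can be exercised without network access.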
Why the asymmetry matters¶
The two classification-failure modes have asymmetric costs:
- Dropping a non-neutral parameter by mistake — silently merges distinct items (e.g. blue shoes and red shoes collapse to "shoes"), corrupting catalog identity. Catastrophic.
- Keeping a neutral parameter by mistake — wastes a render or a dedup slot. Tolerable.
Every design decision in MIQPS and in the surrounding multi-layer normalisation strategy biases toward the tolerable failure:
- Conservative default: fewer than N samples → treat as non-neutral.
- Anomaly detection (concepts/anomaly-gated-config-update): flipping a parameter from non-neutral → neutral in a new MIQPS run is the only change counted as an anomaly; new non-neutral entries and pattern disappearances are not anomalies.
- Multi-layer OR semantics: a parameter is kept if any layer votes keep, stripped only if all layers agree it's safe.
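Two of the biases listed above, the anomaly rule and the multi-layer OR semantics, are small enough to sketch directly. The code below is an illustrative assumption about how such checks could look, not MIQPS code: the `Key` tuple layout, the label strings, and both function names are hypothetical, and treating "no layer voted" as keep is an extra conservative choice that the source does not specify.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical key: (domain, query-parameter pattern, parameter name).
Key = Tuple[str, str, str]


def find_anomalies(previous: Dict[Key, str], current: Dict[Key, str]) -> List[Key]:
    """Return keys that flipped from non-neutral to neutral between two runs.

    New entries and keys that disappeared from `current` are deliberately
    not reported, matching the rule that only this one flip is an anomaly.
    """
    return [
        key
        for key, old_label in previous.items()
        if old_label == "non-neutral" and current.get(key) == "neutral"
    ]


def keep_parameter(layer_votes: Iterable[bool]) -> bool:
    """Combine per-layer votes, where True means "keep this parameter".

    The parameter is stripped only if every layer voted to strip it.
    With no votes at all, the sketch keeps the parameter (assumption).
    """
    votes = list(layer_votes)
    return not votes or any(votes)
```

Both functions err in the same direction: when information is missing or layers disagree, the parameter is kept, which costs at most a wasted render or dedup slot.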
Distinction from tracking parameter¶
Tracking parameter is a term of art for the specific sub-class of
parameters merchants add for analytics (utm_*, fbclid, gclid,
ref). Tracking parameters are usually neutral, but not all
neutral parameters are tracking parameters (e.g. test=true might be
neutral for a given page without being a tracking tag). MIQPS does not
rely on the distinction: it classifies on behaviour, not on semantic
labels.
Generalisation¶
The neutral / non-neutral distinction generalises to any component of a composite identifier where the question is "does this component affect the underlying thing?":
- HTTP header → response body — do we get the same bytes if we remove this header?
- Request body field → computed response — does this field change the answer?
- Configuration key → runtime behaviour — does toggling this key change observable behaviour?
In every case the same operational test applies: remove the component, re-run the pipeline, compare the output fingerprints.
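As a rough illustration of that generic test, the sketch below removes each component of a dictionary-shaped input in turn, re-runs a caller-supplied pipeline, and compares SHA-256 fingerprints of the output. The names `fingerprint` and `neutral_components`, and the choice of JSON serialisation for fingerprinting, are assumptions made for the example. It also makes a single observation per component, whereas the URL test described earlier aggregates over many samples.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def fingerprint(output: Any) -> str:
    """Stable hash of a JSON-serialisable pipeline output."""
    data = json.dumps(output, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def neutral_components(
    composite: Dict[str, Any],
    pipeline: Callable[[Dict[str, Any]], Any],  # hypothetical: input -> output
) -> Dict[str, bool]:
    """Map each component name to True if removing it leaves the output unchanged."""
    baseline = fingerprint(pipeline(composite))
    result = {}
    for key in composite:
        reduced = {k: v for k, v in composite.items() if k != key}
        result[key] = fingerprint(pipeline(reduced)) == baseline
    return result
```

For example, with a toy pipeline that only reads an `Accept-Language` header, removing that header changes the fingerprint while removing an `X-Request-Id` header does not:

```python
headers = {"Accept-Language": "fr", "X-Request-Id": "abc"}
report = neutral_components(headers, lambda h: {"lang": h.get("Accept-Language", "en")})
```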
Seen in¶
- sources/2026-04-20-pinterest-smarter-url-normalization-at-scale-how-miqps-powers-content-deduplication — canonical wiki introduction via Pinterest's MIQPS.