
CONCEPT Cited by 1 source

Neutral vs non-neutral parameter

Definition

In the context of URL normalisation, a query parameter is:

  • Neutral — removing it does not change the content of the page. Safe to strip. Examples (on typical product pages): utm_source, utm_medium, session, sid, ref, click_id, tracking.
  • Non-neutral — removing it does change the content of the page. Must be preserved during normalisation. Examples: id (which product), color (which variant), page (pagination), sort (ordering).

The classification is made per (domain, query-parameter pattern, parameter) triple, not globally per parameter name. See concepts/query-parameter-pattern for the canonical Pinterest example, in which the same ref parameter behaves differently on a product page and on a compare page.
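As a minimal sketch of what per-triple classification means for a normaliser, the snippet below strips only parameters that a lookup table marks neutral for the exact (domain, pattern, parameter) triple. The table contents, the `shop.example.com` domain, the `normalise` function, and the choice to model a pattern as the sorted tuple of parameter names are all illustrative assumptions, not taken from the source.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical classification table: (domain, pattern, parameter) -> is_neutral.
# Assumption: a "pattern" is the sorted tuple of parameter names in the URL.
NEUTRAL = {
    ("shop.example.com", ("id", "ref", "utm_source"), "utm_source"): True,
    ("shop.example.com", ("id", "ref", "utm_source"), "ref"): True,
    ("shop.example.com", ("id", "ref", "utm_source"), "id"): False,
    ("shop.example.com", ("page", "ref"), "ref"): False,
}


def normalise(url: str) -> str:
    """Strip parameters marked neutral for this URL's (domain, pattern)."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pattern = tuple(sorted({name for name, _ in pairs}))
    # Unknown triples default to "not neutral", so the parameter is kept.
    kept = [
        (name, value)
        for name, value in pairs
        if not NEUTRAL.get((parts.netloc, pattern, name), False)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

In this sketch the same name, ref, is stripped under one pattern and preserved under another, and any triple missing from the table is preserved.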

Operational definition

Pinterest's MIQPS operationalises the classification as an empirical removal-test (Source: sources/2026-04-20-pinterest-smarter-url-normalization-at-scale-how-miqps-powers-content-deduplication):

"If removing a query parameter changes the content of a page, that parameter is important; if it doesn't, the parameter is noise and can be safely stripped."

Test procedure:

  1. Sample up to S URLs with distinct values for the parameter under test.
  2. For each sample, compute the content ID of the original URL and of the URL with the parameter removed.
  3. If the content IDs differ in ≥T% of samples, classify as non-neutral; else neutral.
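The three steps above can be sketched roughly as follows. This is not Pinterest's implementation: `classify`, `without_param`, the injected `content_id` callable, and the default values standing in for S and T are hypothetical, and the empty-sample fallback is an assumption chosen to match the conservative default described later on this page.

```python
from typing import Callable, Iterable
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def without_param(url: str, param: str) -> str:
    """Return url with every occurrence of param removed from the query."""
    parts = urlsplit(url)
    kept = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def classify(
    urls: Iterable[str],
    param: str,
    content_id: Callable[[str], str],  # assumed: fetches/renders and fingerprints
    max_samples: int = 50,             # placeholder for S
    threshold_pct: float = 50.0,       # placeholder for T
) -> str:
    # Step 1: keep one URL per distinct value of param, up to max_samples.
    samples: dict[str, str] = {}
    for url in urls:
        query = parse_qsl(urlsplit(url).query, keep_blank_values=True)
        values = [v for k, v in query if k == param]
        if values and values[0] not in samples:
            samples[values[0]] = url
            if len(samples) == max_samples:
                break

    # Assumption: with no samples there is no evidence, so keep the parameter.
    if not samples:
        return "non-neutral"

    # Step 2: compare content IDs with and without the parameter.
    changed = sum(
        content_id(url) != content_id(without_param(url, param))
        for url in samples.values()
    )

    # Step 3: non-neutral if at least threshold_pct percent of samples changed.
    if changed * 100 >= threshold_pct * len(samples):
        return "non-neutral"
    return "neutral"
```

Passing `content_id` in as a callable keeps the sketch independent of how pages are actually fetched and fingerprinted, which the source does not specify here.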

Why the asymmetry matters

The two classification-failure modes have asymmetric costs:

  • Dropping a non-neutral parameter by mistake — silently merges distinct items (e.g. blue shoes and red shoes collapse to "shoes"), corrupting catalog identity. Catastrophic.
  • Keeping a neutral parameter by mistake — wastes a render or a dedup slot. Tolerable.

Every design decision in MIQPS and in the surrounding multi-layer normalisation strategy biases toward the tolerable failure:

  • Conservative default: fewer than N samples → treat as non-neutral.
  • Anomaly detection (concepts/anomaly-gated-config-update): a parameter flipping from non-neutral to neutral in a new MIQPS run is the only change counted as an anomaly; new non-neutral entries and pattern disappearances are not anomalies.
  • Multi-layer OR semantics: a parameter is kept if any layer votes keep, stripped only if all layers agree it's safe.
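The three safeguards above can be illustrated with a small sketch. The `Verdict` enum and the function names are hypothetical, and treating an empty vote list as "keep" is an added assumption in the same conservative spirit, not something the source states.

```python
from enum import Enum
from typing import Hashable, Mapping, Sequence


class Verdict(Enum):
    NEUTRAL = "neutral"
    NON_NEUTRAL = "non-neutral"


def with_conservative_default(
    verdict: Verdict, n_samples: int, min_samples: int
) -> Verdict:
    """Fewer than min_samples (N) samples: fall back to non-neutral."""
    return verdict if n_samples >= min_samples else Verdict.NON_NEUTRAL


def anomalies(
    old: Mapping[Hashable, Verdict], new: Mapping[Hashable, Verdict]
) -> set:
    """Keys that flipped from non-neutral to neutral between two runs.

    New entries and entries that disappeared are deliberately not reported.
    """
    return {
        key
        for key, verdict in new.items()
        if verdict is Verdict.NEUTRAL and old.get(key) is Verdict.NON_NEUTRAL
    }


def keep_parameter(layer_votes_keep: Sequence[bool]) -> bool:
    """OR semantics: keep if any layer votes keep.

    Assumption: with no votes at all, keep the parameter.
    """
    return not layer_votes_keep or any(layer_votes_keep)
```

Each function errs in the same direction: when evidence is thin or layers disagree, the parameter is kept, which is the tolerable failure.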

Distinction from tracking parameter

Tracking parameter is a term of art for the specific sub-class of parameters that merchants add for analytics (utm_*, fbclid, gclid, ref). Tracking parameters are usually neutral, but not all neutral parameters are tracking parameters (e.g. test=true might be neutral for a given page without being a tracking tag). MIQPS does not rely on the distinction: it classifies on behaviour, not on semantic labels.

Generalisation

The neutral / non-neutral distinction generalises to any component of a composite identifier where the question is "does this component affect the underlying thing?":

  • HTTP header → response body — do we get the same bytes if we remove this header?
  • Request body field → computed response — does this field change the answer?
  • Configuration key → runtime behaviour — does toggling this key change observable behaviour?

In every case the same operational test applies: remove the component, re-run the pipeline, compare the output fingerprints.
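A minimal sketch of that generic test, assuming the composite identifier can be modelled as a mapping and the pipeline as a function returning bytes. The names `fingerprint` and `is_neutral` are hypothetical, and SHA-256 is just one convenient choice of fingerprint.

```python
import hashlib
from typing import Any, Callable, Mapping


def fingerprint(output: bytes) -> str:
    """Fingerprint of a pipeline output (SHA-256 chosen for illustration)."""
    return hashlib.sha256(output).hexdigest()


def is_neutral(
    identifier: Mapping[str, Any],
    component: str,
    pipeline: Callable[[Mapping[str, Any]], bytes],
) -> bool:
    """Remove one component, re-run the pipeline, compare fingerprints."""
    reduced = {k: v for k, v in identifier.items() if k != component}
    return fingerprint(pipeline(identifier)) == fingerprint(pipeline(reduced))


# Usage with a stand-in "server" whose body depends on one header only.
def render(headers: Mapping[str, Any]) -> bytes:
    language = headers.get("Accept-Language", "en")
    return f"hello in {language}".encode()


request_headers = {"Accept-Language": "fr", "X-Request-Id": "abc123"}
request_id_is_neutral = is_neutral(request_headers, "X-Request-Id", render)
language_is_neutral = is_neutral(request_headers, "Accept-Language", render)
```

This checks a single sample; a real classifier would repeat the comparison across many samples and apply a threshold, as in the test procedure above.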

