CONCEPT Cited by 1 source
Query parameter pattern¶
Definition¶
A query parameter pattern is the sorted set of parameter names present in a URL's query string, ignoring values. It is the grouping key that makes parameter classification tractable at scale: parameters can only be judged neutral-or-not in the context of which other parameters sit alongside them.
Example: these URLs share the query parameter pattern {color, id, utm_source}:
https://example.com/shoes?id=42&color=red&utm_source=facebook
https://example.com/shoes?id=99&color=blue&utm_source=twitter
This URL has a different pattern, {category, page, sort}:
https://example.com/shop?category=shoes&page=2&sort=price
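The definition above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the MIQPS implementation; the function name query_pattern and the choice to represent a pattern as a sorted tuple are assumptions made for this example.

```python
from urllib.parse import parse_qsl, urlsplit


def query_pattern(url: str) -> tuple[str, ...]:
    """Return the sorted, de-duplicated parameter names in a URL's query string.

    Hypothetical helper: values are discarded, names are sorted so that
    parameter order in the URL does not matter.
    """
    # keep_blank_values=True so that "?flag=" and "?flag" still count as a name.
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return tuple(sorted({name for name, _ in pairs}))


a = query_pattern("https://example.com/shoes?id=42&color=red&utm_source=facebook")
b = query_pattern("https://example.com/shoes?id=99&color=blue&utm_source=twitter")
c = query_pattern("https://example.com/shop?category=shoes&page=2&sort=price")

print(a)       # ('color', 'id', 'utm_source')
print(a == b)  # True
print(c)       # ('category', 'page', 'sort')
```

A tuple is used rather than a set so that the pattern is hashable and can serve directly as a dictionary key when grouping URLs.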
Why pattern matters — the ref example¶
Pinterest's canonical example of why classifying parameters independently of pattern is wrong (Source: sources/2026-04-20-pinterest-smarter-url-normalization-at-scale-how-miqps-powers-content-deduplication):
"Moreover, the same parameter name can play different roles depending on its context. Consider the parameter ref: on a product page URL like example.com/product?id=42&ref=homepage, ref is purely a tracking parameter and is neutral — removing it doesn't change the product displayed. But on a comparison page URL like example.com/compare?ref=99, the same ref parameter identifies which items to compare and is non-neutral. By grouping URLs by their full parameter pattern, the algorithm evaluates each parameter within its specific context, correctly classifying it as neutral in one pattern and non-neutral in another."
This is why the classification key in MIQPS is (domain, pattern, parameter), not (domain, parameter), and certainly not just parameter.
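To make the three-part key concrete, here is a small sketch of a lookup table keyed by (domain, pattern, parameter). The table contents mirror the ref example quoted above; the names classification and classify, the string labels, and the fallback for unknown keys are all assumptions for illustration, not the MIQPS data model.

```python
NEUTRAL = "neutral"
NON_NEUTRAL = "non-neutral"

# Hypothetical classification table. Patterns are sorted tuples of names.
classification = {
    # example.com/product?id=42&ref=homepage -> ref is tracking only
    ("example.com", ("id", "ref"), "ref"): NEUTRAL,
    # example.com/compare?ref=99 -> ref selects the content
    ("example.com", ("ref",), "ref"): NON_NEUTRAL,
}


def classify(domain, pattern, parameter, default=NON_NEUTRAL):
    """Look up a verdict; unknown keys get the conservative default (kept).

    Treating "unknown" as non-neutral is an assumption of this sketch.
    """
    return classification.get((domain, pattern, parameter), default)


print(classify("example.com", ("id", "ref"), "ref"))       # neutral
print(classify("example.com", ("ref",), "ref"))            # non-neutral
print(classify("example.com", ("page", "ref"), "ref"))     # non-neutral
```

A table keyed only by (domain, parameter) would have to give ref a single verdict for all of example.com, and one of the two pages would be handled incorrectly.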
Role in MIQPS¶
MIQPS groups observed URLs by query parameter pattern, then picks the top K patterns by URL count for analysis, "focusing computational resources on the patterns that matter most." Patterns below the top-K cutoff are not analysed; parameters in those URLs receive the conservative default and are kept.
Within each pattern, each parameter is tested independently by the removal-test.
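The grouping and top-K selection described above might look roughly like the following. This is a sketch under stated assumptions: all URLs are taken to belong to one domain, the helper names (query_pattern, top_k_patterns, without_param) are hypothetical, and the comparison step of the removal test is not shown because this page does not specify it. The sketch only builds the URL with one parameter removed, which is the input such a test would need.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def query_pattern(url):
    """Sorted tuple of parameter names (hypothetical helper)."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return tuple(sorted({name for name, _ in pairs}))


def top_k_patterns(urls, k):
    """Group URLs by pattern and keep the k patterns with the most URLs."""
    groups = defaultdict(list)
    for url in urls:
        groups[query_pattern(url)].append(url)
    # sorted() is stable, so ties keep first-seen order.
    ranked = sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True)
    return dict(ranked[:k])


def without_param(url, name):
    """Return the URL with every occurrence of one parameter removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k != name]
    return urlunsplit(parts._replace(query=urlencode(kept)))


urls = [
    "https://example.com/shoes?id=42&color=red&utm_source=facebook",
    "https://example.com/shoes?id=99&color=blue&utm_source=twitter",
    "https://example.com/shop?category=shoes&page=2&sort=price",
]

top = top_k_patterns(urls, k=1)
for pattern, members in top.items():
    for param in pattern:  # each parameter is tested independently
        candidate = without_param(members[0], param)
        print(pattern, param, candidate)
```

With k=1, only the {color, id, utm_source} pattern is analysed; the single {category, page, sort} URL falls below the cutoff and its parameters would simply be kept.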
Why sorted + names-only¶
- Sorted: so that ?a=1&b=2 and ?b=2&a=1 (logically equivalent) produce the same pattern.
- Names only, not values: because the pattern is a structural grouping, not a content grouping. ?id=42&color=red and ?id=99&color=blue belong to the same pattern; they test what happens when color is removed from URLs of this shape.
Practical considerations¶
- Pattern count per domain — a domain with many page types (product, category, search, checkout) has many patterns. MIQPS caps analysis at top-K.
- Long-tail patterns: patterns that appear only a few times fall below the sample-size floor and receive the conservative default.
- Pattern drift: when a domain changes its URL structure between MIQPS runs, previously observed patterns can vanish. The source notes that "patterns can naturally disappear as a domain's URL structure evolves", and this is explicitly not flagged as an anomaly (concepts/anomaly-gated-config-update).
Generalisation¶
The concept generalises to any classifier-over-composite-objects where context determines semantics:
- Parameter-in-HTTP-header-set: what does an authorisation header mean, given which other headers sit alongside it?
- Feature-in-feature-vector — does feature X predict label Y in the same way regardless of what other features are active?
- Option-in-option-set — does a CLI flag mean the same thing regardless of other flags passed?
In all cases, the insight is the same: don't classify the component in isolation when the composite is the unit of meaning.