CONCEPT Cited by 1 source

Non-Maximum Suppression (NMS)

Definition

Non-Maximum Suppression (NMS) is the classical object-detection post-processing technique that de-duplicates overlapping bounding-box detections: within each cluster of overlapping boxes it keeps the highest-confidence box and discards every other box whose intersection-over-union (IoU) with the kept box exceeds a threshold. NMS has been the default output stage of most mainstream object detectors (the R-CNN family, YOLO, SSD) for over a decade. End-to-end set-prediction detectors such as DETR are a notable exception: they are designed to emit de-duplicated predictions without an NMS stage.
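
The definition depends on IoU (intersection over union): the area two boxes share divided by the area they cover together. As a minimal sketch, assuming axis-aligned boxes in (x1, y1, x2, y2) corner format (the page does not specify a box format, and the name `iou` is illustrative), it could be computed like this:

```python
from typing import Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Width and height of the overlap rectangle; clamp to 0 when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate zero-area boxes.
    return inter / union if union > 0 else 0.0
```

Identical boxes give an IoU of 1.0, disjoint boxes give 0.0, and the threshold used by NMS sits somewhere in between.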

Mechanism

The classical greedy recipe:

  1. Sort all candidate boxes by confidence, highest first.
  2. Pick the highest-confidence box; add it to the output set.
  3. Remove (suppress) every remaining box whose IoU with the picked box exceeds the threshold (typically 0.5).
  4. Repeat from step 2 until the candidate list is empty.

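Result: a de-duplicated set of boxes in which no two kept boxes overlap by more than the IoU threshold. Kept boxes can still overlap each other below that threshold.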
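
The four steps above can be sketched in plain Python. This is an illustrative, unoptimized version under stated assumptions: boxes are (x1, y1, x2, y2) tuples, all boxes belong to one class, and the names `box_iou` and `greedy_nms` are hypothetical rather than taken from any library.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return the indices of the kept boxes, highest confidence first."""
    # Step 1: sort candidate indices by confidence, highest first.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        # Step 2: pick the highest-confidence remaining box.
        best = order.pop(0)
        keep.append(best)
        # Step 3: drop every remaining box that overlaps it too much.
        order = [i for i in order
                 if box_iou(boxes[best], boxes[i]) <= iou_threshold]
        # Step 4: the loop repeats until no candidates remain.
    return keep


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))  # -> [0, 2]
```

The second box overlaps the first with an IoU of about 0.82, so it is suppressed; the third box is far away and survives. Production detectors usually run this per class and in vectorized form, which this sketch leaves out.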

Why it matters (and why it's increasingly contested)

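NMS is the baseline everyone knows. Its virtues: simple, fast in practice, and deterministic. The initial sort is O(N log N); the greedy suppression loop adds up to O(N²) IoU comparisons in the worst case, when few boxes are suppressed. Its weakness: it discards information.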

When multiple detectors (or multiple passes of the same detector) produce correct-but-imprecise boxes for the same object, NMS keeps exactly one box and throws the rest away. If the highest-confidence box has slightly wrong coordinates and a lower-confidence box has better ones, NMS still keeps the highest-confidence box, and the better coordinates are lost.

This is the motivating failure that Weighted Boxes Fusion was designed to fix. Instacart's flyer-digitization team explicitly frames the comparison:

"Unlike traditional Non-Maximum Suppression (NMS), which may discard valuable information by eliminating lower-confidence boxes, WBF combines all overlapping boxes by computing a confidence-weighted average of their coordinates." (Source: sources/2026-02-09-instacart-from-print-to-digital-making-weekly-flyers-shoppable)
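
To make the contrast in the quote concrete, here is a minimal sketch of the two behaviours on a single cluster of overlapping boxes. It illustrates only the confidence-weighted coordinate average that the quote describes; it is not the full WBF algorithm, which also has to group boxes into clusters and assign a fused confidence. The function names and the (x1, y1, x2, y2) box format are assumptions for illustration.

```python
from typing import Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


def nms_pick(boxes: Sequence[Box], scores: Sequence[float]) -> Box:
    """NMS-style choice: keep the highest-confidence box unchanged."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return boxes[best]


def weighted_average(boxes: Sequence[Box], scores: Sequence[float]) -> Box:
    """Confidence-weighted average of each coordinate across the cluster."""
    total = sum(scores)
    x1, y1, x2, y2 = (
        sum(s * b[k] for b, s in zip(boxes, scores)) / total
        for k in range(4)
    )
    return (x1, y1, x2, y2)


# Two detections of the same object (hypothetical values).
cluster = [(10.0, 10.0, 50.0, 50.0), (14.0, 14.0, 54.0, 54.0)]
confidences = [0.75, 0.25]

print(nms_pick(cluster, confidences))          # -> (10.0, 10.0, 50.0, 50.0)
print(weighted_average(cluster, confidences))  # -> (11.0, 11.0, 51.0, 51.0)
```

The NMS-style result ignores the second box entirely, while the weighted average shifts the output toward it in proportion to its confidence.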

When to still use NMS

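  • Single-detector, high-signal regime. If the boxes come from one well-calibrated detector, or one detector clearly dominates the others, NMS is fine: the suppressed low-confidence boxes carry little coordinate information worth preserving.
  • Latency-critical inference. NMS is typically cheaper than WBF, because it only discards boxes and never recomputes coordinates.
  • When genuinely distinct nearby objects might merge. WBF's averaging can incorrectly fuse two real adjacent objects into one blended box; NMS never blends coordinates, so every box it outputs is one the detector actually produced.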

Seen in
