PATTERN Cited by 1 source
Safe midpoint computation¶
Pattern¶
When computing the midpoint of two indices low and high (both
non-negative) in a divide-and-conquer algorithm, do not write
(low + high) / 2 — the sum can overflow the fixed-width integer
type for sufficiently large arrays. Use either of these shapes
instead:
int mid = low + ((high - low) / 2); // Fix A: never forms the sum
int mid = (low + high) >>> 1; // Fix B (Java): logical/unsigned right shift
Both compute the mathematical average of low and high (truncated
down toward zero) for all 0 ≤ low ≤ high ≤ INT_MAX. The naive
(low + high) / 2 fails once low + high > INT_MAX, which occurs
for arrays of roughly 2^30 (≈1 billion) elements or more.
The hazard¶
From Bloch, 2006:
On the face of it, this assertion might appear correct, but it fails for large values of the
intvariableslowandhigh. Specifically, it fails if the sum oflowandhighis greater than the maximum positiveintvalue (2^31 − 1). The sum overflows to a negative value, and the value stays negative when divided by two. In C this causes an array index out of bounds with unpredictable results. In Java, it throwsArrayIndexOutOfBoundsException.
The bug lay dormant in Programming Pearls (Bentley, 1986 / 2000)
for two decades and in java.util.Arrays.binarySearch — which Bloch himself authored — for about nine years, because arrays that large didn't exist when the code was written. See concepts/integer-overflow for the underlying substrate.
Why Fix A works¶
mid = low + ((high - low) / 2):
low ≤ high, sohigh - low ≥ 0— no underflow.high - low ≤ INT_MAX(becausehigh ≤ INT_MAXandlow ≥ 0) — no overflow.(high - low) / 2 ≤ (INT_MAX / 2)— no overflow.low + (something ≤ INT_MAX/2) ≤ INT_MAX— no overflow.
Every intermediate value stays in range. Language-agnostic and portable.
Why Fix B works¶
mid = (low + high) >>> 1 relies on the interaction between two's-complement wraparound and logical right shift:
low + highcan overflow signedINT_MAX; in two's-complement the high bit is set and interpretation as signed gives a negative number.- Interpreted as unsigned, the same bit pattern is the true arithmetic sum
(mod 2^32), which for non-negative inputs summing in[0, 2 · INT_MAX]is exactly the mathematical sum. - Logical right shift by 1 halves the unsigned view → correct midpoint.
Likely one machine instruction — Bloch notes it's "probably faster, and arguably as clear" as Fix A.
The C99 undefined-behaviour footnote¶
Bloch's 2008 update (from Antoine Trux, Nokia Research) warns:
The older C Standard, C89/90, and the C++ Standard are both identical to C99 in this respect. … if you add two signed quantities and get an overflow, the result is undefined.
So the C/C++ shape must cast to unsigned before adding:
A signed-addition-then-logical-shift shape is "works on most compilers" but not guaranteed. See concepts/memory-safety for why "undefined behaviour" is not merely a theoretical concern — C compilers are allowed to optimize based on the assumption that signed overflow does not occur, which can delete surrounding code.
Generalisation: every divide-and-conquer over indices¶
The same shape bug afflicts every algorithm that averages two
indices via (low + high) / 2:
- Binary search — the canonical case.
- Mergesort — recursive midpoint split.
- Quicksort partitioning — median-of-three and midpoint-pivot variants.
- Heap operations — parent/child index arithmetic in some implementations.
- Segment-tree / Fenwick-tree range queries.
- Radix-tree, B-tree, and other index-structure internal nodes with key-range midpoints.
Bloch: "The binary-search bug applies equally to mergesort, and to other divide-and-conquer algorithms. If you have any code that implements one of these algorithms, fix it now before it blows up."
Modern-language variants¶
- Rust — the typed response is
usize::midpoint(a, b)(stable since Rust 1.85) or(a + b) / 2underusize(unsigned — the same modular recovery as Java's>>>).checked_add/wrapping_add/saturating_addmake the invariant a compile-time choice. - Python — integers are arbitrary-precision; the naive formula cannot overflow. The pattern is vacuous here.
- Go —
intis typically 64-bit on modern targets, making overflow astronomically improbable but not impossible; idiomatic code often still useslow + (high-low)/2out of habit. - JavaScript — historically 32-bit-signed bitwise operations on
Number; the>>>operator exists specifically for this class of problem. Post-BigInt the hazard is sharper onNumber, safer onBigInt.
Pattern discipline¶
Use patterns/invariant-driven-programming: the invariant you want is "mid is the mathematical midpoint of low and high, truncated down", and the substrate that threatens it is "fixed-width signed integer arithmetic". Write the invariant; check it against the substrate; pick the idiom (Fix A or Fix B) that makes the invariant hold unconditionally.
Seen in¶
- sources/2025-01-11-google-nearly-all-binary-searches-and-mergesorts-are-broken-2006 — the canonical Bloch reference; both fix shapes introduced; C/C++ portable form nailed down in the 2008 Trux update; generalisation to mergesort and all divide-and-conquer stated explicitly.