SYSTEM Cited by 1 source
OpenJDK java.util.Arrays.binarySearch¶
Definition¶
java.util.Arrays.binarySearch is the standard-library binary-search routine in the Java Development Kit (JDK / OpenJDK), authored by Joshua Bloch. It sits in java.util.Arrays with overloads for each primitive element type (int[], long[], double[], Object[] with a Comparator, etc.). Contract: for a sorted array a and key k, returns the index of k if present, or -(insertionPoint + 1) if absent.
Wiki relevance — why there's a page at all¶
For ~9 years (roughly 1997 to ~2006), the int-overload of this routine shipped with a classic concepts/integer-overflow bug in its midpoint computation:
For arrays of 2^30 elements or more, low + high can exceed INT_MAX = 2^31 − 1; the sum wraps to a negative value; division by 2 preserves the sign; a[mid] throws ArrayIndexOutOfBoundsException.
The bug was user-reported to Sun after it broke production code. Bloch's canonical retrospective (post) uses this exact JDK source as the running example — the wiki page exists specifically because the JDK shipped the bug and Bloch's post is the industry's best-known retelling.
The code, pre- and post-fix¶
Pre-fix (the bug):
public static int binarySearch(int[] a, int key) {
int low = 0;
int high = a.length - 1;
while (low <= high) {
int mid = (low + high) / 2; // ← OVERFLOW FOR a.length ≥ 2^30
int midVal = a[mid];
if (midVal < key) low = mid + 1;
else if (midVal > key) high = mid - 1;
else return mid;
}
return -(low + 1);
}
Post-fix (Bloch's recommended shapes, either works):
int mid = low + ((high - low) / 2); // Fix A — portable
int mid = (low + high) >>> 1; // Fix B — Java's unsigned right shift
See patterns/safe-midpoint-computation for the pattern and concepts/unsigned-right-shift for the language primitive behind Fix B.
Why it took ~9 years to surface¶
Two preconditions had to coincide:
- A sorted
int[]of at least2^30(≈1 billion) elements — essentially unheard-of in the late-'90s JDK deployment context; increasingly common at Google and comparable shops by the mid-2000s. binarySearchcalled withlowandhigheventually close enough toINT_MAX / 2for their sum to overflow.
Until both were true in the same JVM, the bug was invisible. See concepts/program-correctness for the broader correctness-under-tested-inputs ≠ correctness-under-all-inputs framing Bloch uses this case study for.
Related JDK surfaces¶
The same arithmetic shape afflicted — and had to be separately fixed in — every divide-and-conquer algorithm in the JDK that averaged two indices:
java.util.Arrays.sort(mergesort / TimSort internals)java.util.Collections.binarySearch- Primitive sort routines for
long[],double[], etc. - Any user code copying the
(low + high) / 2idiom from Programming Pearls or the JDK prior to the fix.
Bloch's post is a call to audit the whole family — not just to patch binarySearch.
Source provenance note¶
The exact JDK commit / bug-ID / version where the fix landed is not cited in the post; the wiki page does not fabricate it. The fix is in the Programming Pearls 2nd-edition (2000) errata and in JDK versions post-report (~2006-era, JDK 6 timeframe). What the wiki captures from primary evidence is Bloch's narrative: he wrote the broken code; a user hit it; it was fixed.
Seen in¶
- sources/2025-01-11-google-nearly-all-binary-searches-and-mergesorts-are-broken-2006 — canonical: the JDK
Arrays.binarySearchis the running example for the whole post; the bug's ~9-year latency inside Sun's codebase is the central narrative hook.