Binary Search: Introduction and Variations
This document explores the nuances of binary search implementation, moving beyond a single "correct" template to understand why different variations work. The core idea is that binary search operates on a sorted (or monotonically increasing/decreasing) range to find a specific element or boundary.
Common Variations in Binary Search Implementation
Binary search can be written in many seemingly different ways that are all functionally correct. These variations primarily concern the loop condition, how the midpoint is calculated, how the search boundaries (l and r) are updated, and what those boundaries initially represent.
1. Loop Starting Conditions
The while loop condition defines when the search space has been narrowed down to a single point or is empty.
while(l < r): This loop continues as long as the left pointer (l) is strictly less than the right pointer (r). It's often used whenlandrdefine an inclusive-exclusive range[l, r), or when the loop is designed to terminate withl == r.while(r - l > 1)orwhile(l < r - 1): This condition ensures the loop stops when the pointers are adjacent (r - l == 1). This is a particularly robust variant, which is discussed later.while(l <= r): This is perhaps the most common template, often used with a "closed interval"[l, r]. The loop continues as long as the search space is valid (i.e.,lis not greater thanr). If the element isn't found, the loop terminates whenlbecomesr + 1.
2. Mid-point Calculation
The "middle" element divides the search space.
mid = (l + r) / 2: This is the standard "floor" calculation. In integer division, it truncates any decimal, effectively rounding down.- Potential Problem: If
landrare very large positive numbers,l + rcan overflow the maximum value of an integer, leading to an incorrectmidvalue.
- Potential Problem: If
mid = (l + r + 1) / 2or(l + r - 1) / 2 + 1: These are ways to calculate the "ceiling" (rounding up) of the midpoint. This is sometimes needed depending on the loop structure to prevent infinite loops.mid = l + (r - l) / 2: This is the preferred method to avoid the overflow bug. Sincer - lwill always be smaller thanl + r, it calculates the distance fromlto the midpoint and adds it, which is mathematically equivalent to(l + r) / 2but safe from overflow.mid = l / 2 + r / 2: This is another variation that can behave differently due to integer division (e.g.,5/2 + 1/2 = 2 + 0 = 2, whereas(5+1)/2 = 3).
3. Pointer Movement
common binary search problem is finding the first 1 in a sequence of 0s followed by 1s. The pointer movement depends on what l and r represent. (Assuming a 000...1111 predicate)
-
Two variables (
l,r), one stores the boundary:if(true) r = mid, else l = mid + 1: Here, ifmidis a1(true), we setr = mid. This is becausemidcould be the first1, so we include it in the remaining search space[l, mid]. If it's a0(false), we know the first1must be aftermid, so we setl = mid + 1.if(true) r = mid, else l = mid: This is a common source of infinite loops if not paired with the correctmidcalculation (e.g., ceiling) and loop condition (likewhile(l < r)). However, this movement is often safe when used withwhile(r - l > 1).
-
Extra
ansvariable:if(true) r = mid - 1, ans = mid else l = mid + 1: This is used in the "Common Template" (see below). Ifmidis a1(true), we store it as a potential answer (ans = mid) and then exclude it from the future search (r = mid - 1), looking for an even earlier1in the range[l, mid-1]. If it's a0(false), we know the answer must be after it (l = mid + 1).
4. Boundary Definitions.
- (Assuming array indices [0, n-1])
l = -1, r = n: Here,landrare set outside the valid array indices. They act as "sentinel" values. This is the standard setup for thewhile(r - l > 1)loop, whereltracks the last "false" andrtracks the first "true".l = 0, r = n:lis the first valid index, butris one past the last valid index. This defines an "open interval"[l, r).l = 0, r = n - 1: This is the most intuitive setup, wherelandrrepresent the start and end of the "closed interval"[l, r]of valid indices to be searched.
Invariants
An invariant is a property or condition of a loop that is true before the loop starts, at the beginning of every iteration, and at the end of every iteration. It's the "guarantee" that the loop is maintaining.
In binary search, a common invariant is: "The answer, if it exists, is always within the range defined by l and r."
- For example, in the
[l, r]closed interval, the invariant is "the first1is somewhere inA[l...r]". - When we check
A[mid]and moveltomid + 1, we have maintained the invariant because we've proven the answer cannot be inA[l...mid]. - Similarly, when we move
rtomid - 1, we maintain the invariant by excludingA[mid...r](after storingmidas a potential answer).
Understanding your loop's invariant is the key to proving its correctness and avoiding infinite loops.
Common Template Example
This template is robust for finding the first element that satisfies a condition (the first 1). It uses a closed interval [l, r] and an ans variable. (Predicate 0000...1111)
Click to expand solution code
ans = -1 // (Initialize ans to a "not found" state)
while(l <= r) // (Loop as long as the search space [l, r] is valid)
{
mid = l + (r - l) / 2 // (Preferred overflow-safe calculation)
if (condition_is_true_at_mid) // (e.g., A[mid] == 1)
{
ans = mid // (This might be the first 1, so we record it)
r = mid - 1 // (Now, we search before mid to see if an even earlier 1 exists. The new space is [l, mid-1])
}
else // (The condition is false, e.g., A[mid] == 0)
{
l = mid + 1 // (The first 1 must be after mid. The new space is [mid+1, r])
}
}
return ans != -1 ? ans : no_ans; // (Return the last ans we recorded, or a "not found" value if ans never changed)
A Note on the while(r - l > 1) Variant
This loop condition is particularly noteworthy because it avoids the most common source of infinite loops.
- The Problem: Infinite loops in binary search almost always happen when the search range has only two numbers left (e.g.,
l = 5, r = 6). Ifmidis calculated as(5+6)/2 = 5(floor), and the logic for a "false" predicate isl = mid, the range remains[5, 6], and the loop never ends. - The Solution: The condition
while(r - l > 1)quits the loop as soon as the range size becomes 2 (i.e., whenr - l == 1). The loop never runs an iteration with only two elements. - Consequence: Because the loop bypasses the problematic two-element case, it is much more "forgiving." You can often use simpler pointer movements like
l = midorr = mid(instead ofl = mid + 1orr = mid - 1) without risking an infinite loop, as long as your initial boundaries are set correctly (e.g.,l = -1, r = n). The final answer is then found atlorrafter the loop finishes.
Key Concepts for Understanding
The document stresses that relying on a single template is fragile. True understanding comes from grasping the underlying mechanics.
- Pointer Relationship:
landrare not just variables; they are the boundaries of the remaining search space. Every loop iteration must shrink this space. - Mid-point:
midis just a probe. We use it to test a point and decide which half of the remaining search space to discard. - Common Bugs:
- Overflow Bug:
(l + r)can exceed the maximum integer size. Fix:l + (r - l) / 2. - Infinite Loop: This happens when the search space
[l, r]fails to shrink. As discussed, this is most common when the range has two elements. This is why pointer movement must always excludemidfrom one side of the new range (e.g.,l = mid + 1orr = mid - 1) OR you must use a loop condition (likewhile(r - l > 1)) that avoids this case entirely.
- Overflow Bug:
- Learning Approach: The document advocates for a "small to large" approach: start by understanding the simplest cases and invariants, then build up to more complex variations, rather than memorizing a complex template from the start.
Connecting Binary Search to Monotonicity, Functions, and Slope
Binary search is not just an algorithm for arrays; it's a mathematical technique for finding a root or boundary on a function.
1. The Core Prerequisite: Monotonicity
Binary search only works because of a property called monotonicity. This means the function (or array) is consistently non-decreasing (0, 0, 0, 1, 1, 1) or non-increasing (1, 1, 1, 0, 0, 0).
- Predicate Function: Think of your search space as a function \(P(x)\) that returns either "false" (0) or "true" (1). Binary search is designed to find the first \(x\) for which \(P(x)\) is "true."
- Search Space Elimination: Monotonicity is what guarantees we can eliminate half the search space.
- If we test
midand \(P(mid)\) is "false" (0), we know the answer must be in the range(mid, r]because the function is non-decreasing. - If \(P(mid)\) is "true" (1), we know the answer must be in the range
[l, mid](it could bemid, or something earlier).
- If we test
- Without this guarantee, we'd have no idea which half to discard.
2. Visualizing with Math Graphs
- Monotonic Function: A standard binary search on a sorted array
[1, 3, 5, 7, 9]is like finding \(x\) on a graph of a monotonically increasing function \(f(x)\). The predicatef(x) >= 7creates the0, 0, 0, 1, 1pattern. - "Binary Search on the Answer": This is a powerful modification. You use it when you can't search the input directly, but you can search the answer.
- Problem: "Find the minimum speed to travel 100km in 5 hours, given \(n\) road segments with different speed limits."
- Solution: You can't binary search the road segments. But you can binary search the speed.
- Predicate: Create a function
can_I_finish(speed)that returns "true" if a given speed lets you finish in 5 hours, and "false" otherwise. - Graph: This
can_I_finish(speed)function is monotonic. A speed of 10 kph might be "false," but if 50 kph is "true," then 51 kph will also be "true." This creates the0, 0, 0, 1, 1, 1pattern we can search. We are finding the "step" on this predicate graph, which corresponds to the minimum speed.
3. When Monotonicity Fails: Unimodal Functions and Ternary Search
What if the "slope" changes? What if the function is not monotonic but unimodal—it has a single peak (like a hill, \(f(x) = -x^2\)) or a single valley (like \(f(x) = x^2\))?
- Why Binary Search Fails: Imagine you are looking for the peak of the hill. You test a
midpoint. The slope is positive (it's going up). Does this mean the peak is to the right? Or did you land on the left side of the peak, which is also going up? You can't tell. You can't eliminate half the search space. -
The Modification: Ternary Search Instead of one
midpoint, we pick two:m1 = l + (r - l) / 3m2 = r - (r - l) / 3
We compare \(f(m1)\) and \(f(m2)\). Let's assume we're finding a peak (maximum): * If \(f(m1) < f(m2)\): The function is still "going up" between \(m1\) and \(m2\). This means the peak cannot be in the range
[l, m1](it must be to the right of \(m1\)). So, we setl = m1. * If \(f(m1) > f(m2)\): The function is "going down" between \(m1\) and \(m2\). This means the peak must have happened before \(m2\). The peak cannot be in the range[m2, r]. So, we setr = m2.This process eliminates 1/3 of the search space in each step, which is still logarithmic (\(O(\log n)\)). This "modified" search (Ternary Search) is essential for optimizing on unimodal functions where the "slope" changes.