Binary Search: Introduction and Variations

This document explores the nuances of binary search implementation, moving beyond a single "correct" template to understand why different variations work. The core idea is that binary search operates on a sorted (or monotonically increasing/decreasing) range to find a specific element or boundary.

Common Variations in Binary Search Implementation

Binary search can be written in many seemingly different ways that are all functionally correct. These variations primarily concern the loop condition, how the midpoint is calculated, how the search boundaries (l and r) are updated, and what those boundaries initially represent.

1. Loop Starting Conditions

The while loop condition defines when the search space has been narrowed down to a single point or is empty.

while(l < r): This loop continues as long as the left pointer (l) is strictly less than the right pointer (r). It's often used when l and r define an inclusive-exclusive range [l, r), or when the loop is designed to terminate with l == r.
while(r - l > 1) or while(l < r - 1): This condition ensures the loop stops when the pointers are adjacent (r - l == 1). This is a particularly robust variant, which is discussed later.
while(l <= r): This is perhaps the most common template, often used with a "closed interval" [l, r]. The loop continues as long as the search space is valid (i.e., l is not greater than r). If the element isn't found, the loop terminates when l becomes r + 1.

2. Mid-point Calculation

The "middle" element divides the search space.

mid = (l + r) / 2: This is the standard "floor" calculation. In integer division, it truncates any decimal, effectively rounding down.
- Potential Problem: If l and r are very large positive numbers, l + r can overflow the maximum value of an integer, leading to an incorrect mid value.
mid = (l + r + 1) / 2 or (l + r - 1) / 2 + 1: These are ways to calculate the "ceiling" (rounding up) of the midpoint. This is sometimes needed depending on the loop structure to prevent infinite loops.
mid = l + (r - l) / 2: This is the preferred method to avoid the overflow bug. Since r - l will always be smaller than l + r, it calculates the distance from l to the midpoint and adds it, which is mathematically equivalent to (l + r) / 2 but safe from overflow.
mid = l / 2 + r / 2: This is another variation that can behave differently due to integer division (e.g., 5/2 + 1/2 = 2 + 0 = 2, whereas (5+1)/2 = 3).

3. Pointer Movement

common binary search problem is finding the first 1 in a sequence of 0s followed by 1s. The pointer movement depends on what l and r represent. (Assuming a 000...1111 predicate)

Two variables (l, r), one stores the boundary:
- if(true) r = mid, else l = mid + 1: Here, if mid is a 1 (true), we set r = mid. This is because mid could be the first 1, so we include it in the remaining search space [l, mid]. If it's a 0 (false), we know the first 1 must be after mid, so we set l = mid + 1.
- if(true) r = mid, else l = mid: This is a common source of infinite loops if not paired with the correct mid calculation (e.g., ceiling) and loop condition (like while(l < r)). However, this movement is often safe when used with while(r - l > 1).
Extra ans variable:
- if(true) r = mid - 1, ans = mid else l = mid + 1: This is used in the "Common Template" (see below). If mid is a 1 (true), we store it as a potential answer (ans = mid) and then exclude it from the future search (r = mid - 1), looking for an even earlier 1 in the range [l, mid-1]. If it's a 0 (false), we know the answer must be after it (l = mid + 1).

4. Boundary Definitions.

(Assuming array indices [0, n-1])
l = -1, r = n: Here, l and r are set outside the valid array indices. They act as "sentinel" values. This is the standard setup for the while(r - l > 1) loop, where l tracks the last "false" and r tracks the first "true".
l = 0, r = n: l is the first valid index, but r is one past the last valid index. This defines an "open interval" [l, r).
l = 0, r = n - 1: This is the most intuitive setup, where l and r represent the start and end of the "closed interval" [l, r] of valid indices to be searched.

Invariants

An invariant is a property or condition of a loop that is true before the loop starts, at the beginning of every iteration, and at the end of every iteration. It's the "guarantee" that the loop is maintaining.

In binary search, a common invariant is: "The answer, if it exists, is always within the range defined by l and r."

For example, in the [l, r] closed interval, the invariant is "the first 1 is somewhere in A[l...r]".
When we check A[mid] and move l to mid + 1, we have maintained the invariant because we've proven the answer cannot be in A[l...mid].
Similarly, when we move r to mid - 1, we maintain the invariant by excluding A[mid...r] (after storing mid as a potential answer).

Understanding your loop's invariant is the key to proving its correctness and avoiding infinite loops.

Common Template Example

This template is robust for finding the first element that satisfies a condition (the first 1). It uses a closed interval [l, r] and an ans variable. (Predicate 0000...1111)

Click to expand solution code

ans = -1 // (Initialize ans to a "not found" state)

while(l <= r) // (Loop as long as the search space [l, r] is valid)
{
    mid = l + (r - l) / 2 // (Preferred overflow-safe calculation)

    if (condition_is_true_at_mid) // (e.g., A[mid] == 1)
    {
        ans = mid     // (This might be the first 1, so we record it)
        r = mid - 1 // (Now, we search before mid to see if an even earlier 1 exists. The new space is [l, mid-1])
    }
    else // (The condition is false, e.g., A[mid] == 0)
    {
        l = mid + 1 // (The first 1 must be after mid. The new space is [mid+1, r])
    }
}
return ans != -1 ? ans : no_ans; // (Return the last ans we recorded, or a "not found" value if ans never changed)

A Note on the `while(r - l > 1)` Variant

This loop condition is particularly noteworthy because it avoids the most common source of infinite loops.

The Problem: Infinite loops in binary search almost always happen when the search range has only two numbers left (e.g., l = 5, r = 6). If mid is calculated as (5+6)/2 = 5 (floor), and the logic for a "false" predicate is l = mid, the range remains [5, 6], and the loop never ends.
The Solution: The condition while(r - l > 1) quits the loop as soon as the range size becomes 2 (i.e., when r - l == 1). The loop never runs an iteration with only two elements.
Consequence: Because the loop bypasses the problematic two-element case, it is much more "forgiving." You can often use simpler pointer movements like l = mid or r = mid (instead of l = mid + 1 or r = mid - 1) without risking an infinite loop, as long as your initial boundaries are set correctly (e.g., l = -1, r = n). The final answer is then found at l or r after the loop finishes.

Key Concepts for Understanding

The document stresses that relying on a single template is fragile. True understanding comes from grasping the underlying mechanics.

Pointer Relationship: l and r are not just variables; they are the boundaries of the remaining search space. Every loop iteration must shrink this space.
Mid-point: mid is just a probe. We use it to test a point and decide which half of the remaining search space to discard.
Common Bugs:
1. Overflow Bug: (l + r) can exceed the maximum integer size. Fix: l + (r - l) / 2.
2. Infinite Loop: This happens when the search space [l, r] fails to shrink. As discussed, this is most common when the range has two elements. This is why pointer movement must always exclude mid from one side of the new range (e.g., l = mid + 1 or r = mid - 1) OR you must use a loop condition (like while(r - l > 1)) that avoids this case entirely.
Learning Approach: The document advocates for a "small to large" approach: start by understanding the simplest cases and invariants, then build up to more complex variations, rather than memorizing a complex template from the start.

Connecting Binary Search to Monotonicity, Functions, and Slope

Binary search is not just an algorithm for arrays; it's a mathematical technique for finding a root or boundary on a function.

1. The Core Prerequisite: Monotonicity

Binary search only works because of a property called monotonicity. This means the function (or array) is consistently non-decreasing (0, 0, 0, 1, 1, 1) or non-increasing (1, 1, 1, 0, 0, 0).

Predicate Function: Think of your search space as a function \(P(x)\) that returns either "false" (0) or "true" (1). Binary search is designed to find the first \(x\) for which \(P(x)\) is "true."
Search Space Elimination: Monotonicity is what guarantees we can eliminate half the search space.
- If we test mid and \(P(mid)\) is "false" (0), we know the answer must be in the range (mid, r] because the function is non-decreasing.
- If \(P(mid)\) is "true" (1), we know the answer must be in the range [l, mid] (it could be mid, or something earlier).
Without this guarantee, we'd have no idea which half to discard.

2. Visualizing with Math Graphs

Monotonic Function: A standard binary search on a sorted array [1, 3, 5, 7, 9] is like finding \(x\) on a graph of a monotonically increasing function \(f(x)\). The predicate f(x) >= 7 creates the 0, 0, 0, 1, 1 pattern.
"Binary Search on the Answer": This is a powerful modification. You use it when you can't search the input directly, but you can search the answer.
- Problem: "Find the minimum speed to travel 100km in 5 hours, given \(n\) road segments with different speed limits."
- Solution: You can't binary search the road segments. But you can binary search the speed.
- Predicate: Create a function can_I_finish(speed) that returns "true" if a given speed lets you finish in 5 hours, and "false" otherwise.
- Graph: This can_I_finish(speed) function is monotonic. A speed of 10 kph might be "false," but if 50 kph is "true," then 51 kph will also be "true." This creates the 0, 0, 0, 1, 1, 1 pattern we can search. We are finding the "step" on this predicate graph, which corresponds to the minimum speed.

3. When Monotonicity Fails: Unimodal Functions and Ternary Search

What if the "slope" changes? What if the function is not monotonic but unimodal—it has a single peak (like a hill, \(f(x) = -x^2\)) or a single valley (like \(f(x) = x^2\))?

Why Binary Search Fails: Imagine you are looking for the peak of the hill. You test a mid point. The slope is positive (it's going up). Does this mean the peak is to the right? Or did you land on the left side of the peak, which is also going up? You can't tell. You can't eliminate half the search space.
The Modification: Ternary Search Instead of one mid point, we pick two:
- m1 = l + (r - l) / 3
- m2 = r - (r - l) / 3
We compare \(f(m1)\) and \(f(m2)\). Let's assume we're finding a peak (maximum): * If \(f(m1) < f(m2)\): The function is still "going up" between \(m1\) and \(m2\). This means the peak cannot be in the range [l, m1] (it must be to the right of \(m1\)). So, we set l = m1. * If \(f(m1) > f(m2)\): The function is "going down" between \(m1\) and \(m2\). This means the peak must have happened before \(m2\). The peak cannot be in the range [m2, r]. So, we set r = m2.

This process eliminates 1/3 of the search space in each step, which is still logarithmic (\(O(\log n)\)). This "modified" search (Ternary Search) is essential for optimizing on unimodal functions where the "slope" changes.