Skip to content

Arrays Are Curves

Pillar: SHAPE — "Slope = difference of consecutive elements. Shape = direction of those differences."


Slope = Difference of Consecutive Elements

Given an array \(a_0, a_1, \ldots, a_{n-1}\), define the difference array \(d\) where:

\[d_i = a_{i+1} - a_i \quad \text{for } i = 0, 1, \ldots, n-2\]

If you think of \(a\) as a function \(f(i) = a_i\) plotted on integer points, then \(d_i\) is the discrete derivative — the slope between consecutive points.

Is the difference array really the same as a derivative?

Almost. The continuous derivative \(f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}\) uses infinitesimal steps. The difference array uses step size \(h = 1\), so it's the forward finite difference \(\Delta f(i) = f(i+1) - f(i)\). The analogy is exact: summing the difference array recovers the original (up to a constant), just like integrating a derivative. Second differences are the discrete second derivative. All the calculus intuition transfers.

Compute it for \(a = [2, 5, 9, 14]\):

\(i\) \(a_i\) \(d_i = a_{i+1} - a_i\)
0 2 \(3\)
1 5 \(4\)
2 9 \(5\)
3 14

Differences: \([3, 4, 5]\). All positive — the curve goes up. The differences are increasing (\(3 < 4 < 5\)) — the curve is getting steeper, meaning it's convex.

Now \(a = [10, 7, 5, 4]\):

Differences: \([-3, -2, -1]\). All negative — the curve goes down.

Wait — the differences are increasing (-3 < -2 < -1), so is this convex even though it's going down?

Yes. Convexity is about the slopes increasing, not the values increasing. The slopes \(-3, -2, -1\) are getting less negative — they're increasing. So this is a convex decreasing curve (like the right half of a bowl). A concave decreasing curve would have slopes that get more negative.


The Four Shapes

Shape Condition on \(d\) Condition on second differences
Strictly increasing All \(d_i > 0\)
Strictly decreasing All \(d_i < 0\)
Convex (slopes increase) \(d_{i+1} > d_i\) for all \(i\) Second derivative \(> 0\)
Concave (slopes decrease) \(d_{i+1} < d_i\) for all \(i\) Second derivative \(< 0\)

A unimodal array increases then decreases — it has a single peak. The slopes start positive and end negative.


Why Shape Determines Algorithm

This is the punch line. The shape of your data tells you which algorithm works:

Monotonic \(\rightarrow\) binary search. If \(a\) is sorted, you can binary search for any target in \(O(\log n)\).

Unimodal \(\rightarrow\) peak finding in \(O(\log n)\). Compare \(a[\text{mid}]\) with \(a[\text{mid}+1]\). If the slope at mid is positive, the peak is to the right. If negative, the peak is to the left.

Convex \(\rightarrow\) ternary search or convex hull trick. If the cost function is convex, ternary search finds the minimum in \(O(\log n)\) evaluations.

Concave \(\rightarrow\) WQS Binary Search (Alien Trick).

Can binary search work on a non-monotonic function?

No — binary search relies on the guarantee that if \(f(\text{mid})\) is too small, the answer must be to the right. A non-monotonic function breaks this guarantee: the answer could be on either side. Similarly, ternary search on a non-convex function can miss the global minimum because it might eliminate a region containing it. Detecting shape first is not optional — it's the prerequisite.

Here's a concrete example. The stock profit problem: you can make at most \(k\) transactions. Plot max profit as a function of \(k\):

\(k\) Max profit Marginal gain (slope)
0 0
1 50 50
2 80 30
3 95 15
4 100 5

Slopes: \([50, 30, 15, 5]\) — strictly decreasing. That's a concave curve. Each extra transaction gives diminishing returns. Once you see this concavity, you know WQS Binary Search can optimize the DP.


Slope Normalization — When Slopes Are Fractions

In arrays, the \(x\)-gap between consecutive elements is always 1, so slope = \(\Delta y\). But when you work with 2D coordinates, slopes become fractions: \(\Delta y / \Delta x\).

The problem: floating-point comparison is unreliable. Is 0.333333 equal to 1/3? Maybe. Maybe not.

Can't we just compare with a small epsilon, like abs(a - b) < 1e-9?

For single comparisons, yes. But when you're hashing slopes (like in "Max Points on a Line"), epsilon comparison breaks hash maps entirely — two "equal" slopes can land in different buckets. The only reliable approach for exact comparison is to normalize to a canonical integer representation.

The fix: reduce \((\Delta y, \Delta x)\) by their GCD and normalize the sign so the denominator is always positive.

pair<int, int> normalizeSlope(int deltaY, int deltaX) {
    if (deltaX == 0) return {1, 0};
    if (deltaY == 0) return {0, 1};
    int sign = (deltaX < 0) ? -1 : 1;
    int divisor = __gcd(abs(deltaY), abs(deltaX));
    return {sign * deltaY / divisor, sign * deltaX / divisor};
}

This is exactly how "Max Points on a Line" works: for each point, compute normalized slopes to every other point, group by slope in a hash map, and the largest group is the answer.


The Predicate Form — Binary Search Without a Sorted Array

Binary search doesn't need a physical sorted array. It needs a monotonic predicate.

Define $f(x) = $ "is it possible to achieve answer \(\leq x\)?" If this function is monotonic — false, false, ..., false, true, true, ..., true — then binary search finds the exact threshold.

How do I know if the predicate is monotonic without proving it formally?

Ask: "if I can achieve the goal with budget \(x\), can I also achieve it with budget \(x + 1\)?" If the answer is always yes (more budget never hurts), the predicate is monotonic. This works for "minimize the maximum" (more slack = easier), "maximize the minimum" (larger threshold = harder, so flip the logic), and many similar problems.

Example: split an array into \(k\) subarrays such that the maximum subarray sum is minimized. Let $f(x) = $ "can we split into \(\leq k\) subarrays where every sum is \(\leq x\)?"

Candidate max \(x\) Feasible?
10 false
15 false
18 true
20 true
32 true

The pattern: false, false, true, true, true. Monotonic. Binary search on \(x\) finds the minimum feasible maximum.


Function Addition Rules

When you add two functions, the shape of the sum depends on the shapes of the parts:

Sum Result
Convex + convex Convex
Concave + concave Concave
Convex + concave Unknown
Linear + anything Preserves the other's shape
Why does convex + convex = convex? Can we see this without calculus?

Think about slopes. If function \(f\) has increasing slopes and function \(g\) has increasing slopes, then \(f + g\) has slopes that are the sum of \(f\)'s slopes and \(g\)'s slopes. The sum of two increasing sequences is increasing. So the slopes of \(f + g\) are increasing — which is the definition of convex. Same logic works for concave (decreasing + decreasing = decreasing). But increasing + decreasing could go either way.

Why this matters: in DP, your total cost is often the sum of component costs. If every component is convex, the total is convex — and you can apply convex hull trick or slope trick. If one component is concave and another is convex, those optimizations break.


Optimization Signals

Shape What it enables
Monotonic Binary search on sorted data or on the answer
Unimodal Peak finding in \(O(\log n)\)
Convex Ternary search, Convex Hull Trick, Slope Trick
Concave WQS Binary Search (Alien Trick), Divide & Conquer optimization

These are advanced topics covered later, but you should know the signals now. When you encounter a DP recurrence and the cost function is convex, that's a flag that says "this can be optimized."


Shape Detection Code

string detectShape(vector<int>& values) {
    int length = (int)values.size();
    if (length <= 1) return "trivial";

    bool allPositive = true, allNegative = true;
    bool slopesIncrease = true, slopesDecrease = true;

    vector<int> differences(length - 1);
    for (int i = 0; i < length - 1; i++) {
        differences[i] = values[i + 1] - values[i];
        if (differences[i] <= 0) allPositive = false;
        if (differences[i] >= 0) allNegative = false;
    }

    for (int i = 0; i < (int)differences.size() - 1; i++) {
        if (differences[i + 1] <= differences[i]) slopesIncrease = false;
        if (differences[i + 1] >= differences[i]) slopesDecrease = false;
    }

    if (allPositive) return "strictly increasing";
    if (allNegative) return "strictly decreasing";
    if (slopesIncrease) return "convex";
    if (slopesDecrease) return "concave";
    return "none of the standard shapes";
}

Quick Recap

  • The difference array is the discrete derivative. Its signs tell you the shape; shape tells you the algorithm.
  • Monotonic = binary search. Unimodal = peak finding. Convex = ternary search / CHT. Concave = WQS / D&C optimization. Wrong shape assumption = wrong algorithm.
  • Slopes as fractions need GCD normalization to avoid floating-point errors.
  • Binary search on the answer works whenever the feasibility predicate is monotonic — no sorted array required.
  • Convex + convex = convex. Concave + concave = concave. Mixed = unknown. These addition rules determine which DP optimizations are legal.

Problems

Problem Link Difficulty Technique
LC 162 — Find Peak Element leetcode.com/problems/find-peak-element Medium Unimodal shape detection
LC 149 — Max Points on a Line leetcode.com/problems/max-points-on-a-line Hard Slope normalization + hash map grouping
LC 410 — Split Array Largest Sum leetcode.com/problems/split-array-largest-sum Hard Binary search on the answer