Four Moves, Every Problem

Pillar: All Four — This lesson introduces Fix, Set, Shape, and Chain — the four thinking moves that frame every lesson in this course.

The Bottleneck Isn't Math — It's Panic

You open a problem. There's a formula. Maybe a summation, maybe a modular equation, maybe a recurrence. Your brain does that thing where it tries to see the whole solution at once, fails, and freezes.

This isn't a knowledge gap. You probably know modular arithmetic, prime factorization, and basic combinatorics. The gap is strategic: you don't have a systematic first move when a problem looks unfamiliar.

This course fixes that. Every mathematical technique you'll learn here — sieve, modular inverse, CRT, matrix exponentiation, generating functions — is taught through one of four thinking moves. The move tells you what question to ask. The question tells you what to try. The technique is just the answer to that question.

Four moves. Every problem. No freezing.

Why Four?

These aren't arbitrary categories. They correspond to four fundamentally different things you can do when you're stuck:

Reduce variables (Fix) — pin something down so there's less to solve
Change representation (Set) — see the same object as a collection, which unlocks new operations
Exploit geometry (Shape) — use the function's structure to eliminate large chunks of the search space
Build incrementally (Chain) — compute the answer for size \(n\) from the answer for size \(n-1\)

Every algorithm you'll learn in this course does one of these four things. Sometimes two at once. The moves give you a vocabulary for what you're actually doing when you solve a problem — and more importantly, what to try when you're not solving one.

Move 1: Fix

The question: "If I hold \(X\) still, what must \(Y\) be?"

Fixing is the art of eliminating degrees of freedom. When a problem has multiple unknowns, you pick one, nail it to a specific value, and watch the rest of the system collapse into something solvable. You're not guessing — you're turning a multi-variable search into a single-variable lookup.

This is the move behind modular inverses ("fix the modulus, solve for \(x\)"), Bezout's identity ("fix \(\gcd(a,b)\), express it as a linear combination"), and Chinese Remainder Theorem ("fix one congruence at a time"). Anywhere you see "for each \(X\), compute the required \(Y\)" — that's Fix.

Example: Two Sum

Given an array and a target \(T\), find two numbers that sum to \(T\).

The brute force checks every pair — \(O(N^2)\). But watch what happens when you fix one variable.

At index \(i\), look at value \(x\). Hold it still. If \(x\) is one of the two numbers, its partner must be:

\[y = T - x\]

No search needed — the partner is forced. You just need to check if you've seen it.

Trace it on \([2, 7, 11, 15]\) with \(T = 9\):

Step	\(x\)	Partner \(= T - x\)	Seen so far	Found?
1	2	7	\(\{\}\)	No. Add 2.
2	7	2	\(\{2\}\)	Yes. Answer: \((2, 7)\).

Two steps. One pass. The key was fixing \(x\) and deducing \(y\).

unordered_set<int> seen;
for (int i = 0; i < numbers.size(); i++) {
    int partner = target - numbers[i];
    if (seen.count(partner)) {
        cout << partner << " " << numbers[i] << "\n";
    }
    seen.insert(numbers[i]);
}

The hash set isn't a trick you memorize — it's the answer to question 3 ("What remembers what I've seen?"). The entire solution is a consequence of fixing one variable and calculating what the other must be.

The Four Questions of Fix

When you're stuck, run through these in order:

1. What can I fix? Look at the variables in the problem — an index, an element, a remainder, an edge. Which one, if held still, simplifies things? In Two Sum, you fix the current element. In a tree problem, you might fix an edge. In a modular arithmetic problem, you fix the modulus. The right choice usually jumps out: it's the variable that appears in both sides of the constraint.

2. What does fixing it force? Once you fix one thing, what must the remaining variable(s) be? Can you write an equation? In Two Sum, fixing \(x\) forces \(y = T - x\). In Bezout's identity, fixing \(\gcd(a,b) = d\) forces \(d = ax + by\) for some integers \(x, y\). The forcing is what turns a search into a calculation.

3. What remembers what I've seen? You've fixed something, now you need to check or count something. What data structure lets you do that fast? A hash map for \(O(1)\) lookups. A sorted set for \(O(\log N)\) range queries. A prefix array for \(O(1)\) range sums. The data structure isn't the insight — it's the bookkeeping that makes the insight efficient.

4. What's the complexity now? Fixing typically turns an \(O(N^2)\) "check all pairs" into \(O(N)\) or \(O(N \log N)\) "for each element, do a fast lookup." If you fixed something and the complexity didn't drop, you probably fixed the wrong variable. Try a different one.

These four questions are your starting protocol for Fix problems. Not "what algorithm should I use?" but "what should I fix?"

Fix Applies Beyond Pairs

Two Sum fixes a value and deduces another value. But that's just one flavor. Here's the broader picture:

What you fix	What you deduce	Example
A value \(x\)	Its required partner \(T - x\)	Two Sum
A prefix sum's remainder	Whether a valid subarray exists	Subarray Sum Divisible by \(K\)
An edge in a tree	How many paths cross it	Tree path counting
A connected component	How many nodes are outside it	Count Unreachable Pairs

The full examples come in later chapters. The thinking pattern is identical every time: fix something, deduce the rest.

The table is worth memorizing — not the specific problems, but the variety of things you can fix. Most students only think of fixing array elements. The real power is fixing structural properties: a remainder, an edge, a component. In this course specifically:

Ch2-3 (Modular Arithmetic, GCD): Fix the modulus, fix \(\gcd(a,b)\), fix one congruence at a time
Ch5 (Inclusion-Exclusion): Fix a subset of constraints and count what satisfies them
Ch8 (Recurrences): Fix the transition matrix and exponentiate

Same move, wildly different settings.

Move 2: Set

The question: "What am I collecting? What operations combine them?"

That question — "What am I collecting?" — is your entry point. The moment you can answer it, the problem transforms from abstract to mechanical.

Set thinking means representing objects as collections — multisets, bitmasks, bags of prime factors — and defining operations on those collections. Once you see a number as a set of its prime factors, problems like GCD, LCM, and divisibility become set intersection, union, and containment.

This is the move behind inclusion-exclusion ("add and subtract overlapping sets"), bitmask DP ("the state is which elements you've used"), and Euler's totient ("count integers whose prime-factor set doesn't overlap with \(n\)'s").

Example: GCD via Prime Factorization

\(60 = 2^2 \times 3 \times 5\), so its prime-factor multiset is \(\{2^2, 3^1, 5^1\}\).

\(48 = 2^4 \times 3\), so its multiset is \(\{2^4, 3^1\}\).

\(\gcd(60, 48)\) = intersect the multisets, taking the minimum exponent of each prime:

\[\{2^{\min(2,4)}, 3^{\min(1,1)}\} = \{2^2, 3^1\} = 12\]

GCD is set intersection. LCM is set union (take the max exponent). Once you see it this way, problems involving divisors and multiples become mechanical.

Why does this matter? Because set operations compose. Need \(\gcd(a, b, c)\)? Intersect all three multisets. Need to check if \(a\) divides \(b\)? Check if \(a\)'s multiset is a subset of \(b\)'s (every exponent in \(a\) is \(\leq\) the corresponding exponent in \(b\)). Need the number of divisors of \(n\)? It's the product \((e_1 + 1)(e_2 + 1) \cdots (e_k + 1)\) where \(e_i\) are the exponents — you're counting how many sub-multisets exist. The set lens turns all of these into the same operation.

The Connection to DP

Here's something worth noting early: in bitmask DP, the "state" is literally what you've collected so far. A bitmask \(\texttt{0b10110}\) means "I've used elements 1, 2, and 4." The transition asks: "If I add element \(j\) to this collection, what's the new cost?"

That's Set thinking applied to state design. The state is a set. The transition is a set operation. The full treatment comes in L2, but keep this in mind — whenever you see "which subset of items have I used?", you're in Set territory.

Move 3: Shape

The question: "Is this monotonic? Convex? Staircase?"

Shape thinking means looking at a function's geometry before touching its algebra. The geometry tells you which algorithm works. No shape analysis, no algorithm selection.

The Four Shapes

Every function you'll encounter in this course falls into one of four geometric categories:

Increasing — \(f(x)\) goes up as \(x\) grows. Binary search works.
Decreasing — \(f(x)\) goes down as \(x\) grows. Binary search works (just flip the comparison).
Convex (bowl) — \(f(x)\) decreases then increases. Has a unique minimum. Ternary search works.
Concave (hill) — \(f(x)\) increases then decreases. Has a unique maximum. Ternary search works.

Shape tells you which algorithm works — the full treatment comes in L4.

Example: Binary Search Works Because the Function Is Monotonic

"Is there a value \(\geq x\) in this sorted array?" The answer function looks like:

\[f(x) = \text{false, false, false, true, true, true}\]

It flips from false to true exactly once — it's monotonic. That single property is why binary search works. No monotonicity, no binary search. Every time you apply binary search, you're implicitly proving a shape property.

This extends to "binary search on the answer" problems. If the feasibility function — "can I achieve cost \(\leq C\)?" — is monotonic in \(C\), you can binary search on \(C\). If the cost function is convex in \(C\), you can ternary search. The algorithm follows from the shape.

There's also the staircase shape: \(\lfloor n/k \rfloor\) as \(k\) varies. The function isn't smooth — it drops in steps, and it takes only \(O(\sqrt{n})\) distinct values. That single shape observation is why harmonic sum problems (\(\sum_{k=1}^{n} \lfloor n/k \rfloor\)) don't need \(O(n)\) time. You process each "step" of the staircase at once.

The discipline of Shape is: name the geometry before you touch the algebra. If you can't say "this function is increasing" or "this function is convex," you're not ready to pick an algorithm.

Move 4: Chain

The question: "What depends on what? What's the transition?"

Chain thinking means finding the dependency graph: state \(S_n\) depends on states \(S_{n-1}\), \(S_{n-2}\), etc. Once you write the transition, the rest is mechanical — iterate, memoize, or exponentiate.

This is the move behind recurrences ("\(F(n) = F(n-1) + F(n-2)\)"), matrix exponentiation ("turn a linear recurrence into a matrix and raise it to the \(n\)-th power"), DP on trees ("each node depends on its children"), and Sprague-Grundy theory ("each game state depends on the states it can reach").

Example: Fibonacci

\(F(n) = F(n-1) + F(n-2)\), with \(F(0) = 0\), \(F(1) = 1\).

Every value depends on the two before it. That's a chain. Write the dependency, fill in order:

vector<long long> fib(limit + 1);
fib[0] = 0;
fib[1] = 1;
for (int i = 2; i <= limit; i++) {
    fib[i] = fib[i - 1] + fib[i - 2];
}

Trace it:

\(i\)	\(fib[i-2]\)	\(fib[i-1]\)	\(fib[i]\)
2	0	1	1
3	1	1	2
4	1	2	3
5	2	3	5

Each link needs only the two previous links — not the entire history. The chain carries forward just enough information for the next step. That compression is why DP is efficient.

The Four Chain Types

Not all chains are the same. The arrow between links means different things in different contexts:

Logical chains — "if \(X\) then \(Y\)." One decision forces the next. Color a graph node red, and every neighbor must be blue. Every neighbor of those must be red. One choice cascades through the entire connected component. If a contradiction appears (a node forced to be both colors), the structure is invalid. Bipartite coloring is the classic example, but the pattern shows up anywhere: 2-SAT, difference constraints, and even Two Sum (fixing one value logically forces the partner).

Structural chains — elements maintain an ordering. A monotonic stack is a decreasing chain of values. When a new element arrives that's larger than the top, it "breaks" the chain — you pop elements until the ordering is restored. Every element gets pushed once and popped once: \(O(N)\) total. Sorted arrays, Cartesian trees, and convex hull maintenance are all structural chains. The key property: the structure has an invariant, and you restore it whenever it's violated.

Transition chains — the next state is computed from previous states. This is DP. \(F(n) = F(n-1) + F(n-2)\) is a transition chain. The critical feature: each link needs only a small, fixed number of previous links — not the entire history. That's why DP is efficient: the chain compresses all history into a few numbers. Matrix exponentiation is what you reach for when the chain is linear and \(n\) is huge.

Dependency chains — you can't process \(B\) until \(A\) is done. Course prerequisites, build systems, topological sort. DP itself follows a dependency chain under the hood: you process states in an order where every state's prerequisites are already computed. In 1D DP, that order is left-to-right. In tree DP, it's leaves-first (post-order). In interval DP, it's short intervals before long ones. The processing order always respects dependencies.

Signal Recognition

How do you know which chain type you're dealing with? Look for these signals:

Signal	Chain Type
"If \(X\) then \(Y\)" / forced values	Logical
Elements maintain ordering	Structural
"Answer for \(N\) depends on \(N-1\)"	Transition
"Can't do \(B\) until \(A\) done"	Dependency

The Unified View

Once you see chains everywhere, algorithms stop looking like separate topics. They're all chain-construction techniques:

Algorithm	What it's really doing
BFS	Building chains layer by layer
DP	Chains where each state depends on previous
Greedy	Building one chain, betting it's best
Monotonic stack	Maintaining structural chain, resolving when disturbed
Topological sort	Finding right order for dependency chain

These aren't five separate topics. They're five ways to build chains. When you see a new problem, ask: "What depends on what?" The answer tells you the chain type, and the chain type tells you the algorithm.

Often a problem involves multiple chain types layered together. DP on a tree uses structural chains (the tree edges) with transition chains (the recurrence) on top. Dijkstra's algorithm is a dependency chain (process nodes in distance order) with logical chains (relaxing an edge forces a bound on a neighbor). You don't need to classify perfectly — you need to recognize the chain structure so you know what to build.

The Decision Process

When you open a new problem, here's the order to think in:

Step 1: Read the constraint. What are the variables? What's the relationship between them?

Step 2: Try Fix first. Can you hold one variable still and deduce the rest? If yes, you probably have an \(O(N)\) or \(O(N \log N)\) solution with a hash map or binary search. Most interview problems live here.

Step 3: Look for Set structure. Are you collecting things? Counting subsets? Working with prime factors, divisors, or bitmasks? If the problem is about what you have, it's Set.

Step 4: Check for Shape. Is there a function you're optimizing or searching over? What's its geometry? Monotonic means binary search. Convex means ternary search. Staircase means block processing.

Step 5: Find the Chain. What depends on what? If you can write \(f(n)\) in terms of \(f(n-1)\), you have a transition chain (DP). If you can't process \(B\) until \(A\) is done, you have a dependency chain (topological sort). If one decision forces the next, you have a logical chain.

Most problems use one primary move. Some combine two. Very few need all four. The pillar connection box at the top of each lesson tells you which move drives that lesson.

How to Use This Course

Every lesson names its pillar in a box at the very top. When you read:

Pillar: Fix — "We're fixing the modulus and solving for the unknown coefficient."

...that's telling you which thinking move drives that lesson. Over time, you'll build a reflex: you see a problem, you identify the move, and you know what question to ask before you know the answer.

Each pillar gets progressively deeper treatment throughout the course:

Fix gets its first deep application in Ch3 (GCD/Bezout) and Ch4 (modular arithmetic)
Set drives Ch2 (prime factorization as multisets) and Ch6 (inclusion-exclusion)
Shape underlies Ch2 L2 (sieve analysis) and Ch9 L4 (floor staircase)
Chain powers Ch7 (expected value DP) and Ch10 (matrix exponentiation)

By the end, these four questions will be reflexive.

Quick Recap

Fix: Hold one variable still, the rest collapses. Ask "If I fix \(X\), what must \(Y\) be?" Use the Four Questions (What can I fix? What does it force? What remembers? What's the complexity?).
Set: Represent objects as collections, define operations on them. Ask "What am I collecting?" Sets connect to DP — the state is what you've collected.
Shape: Identify the geometry of the function before solving. Four shapes: increasing, decreasing, convex (bowl), concave (hill). The shape picks the algorithm.
Chain: Find the dependency graph, write the transition. Four chain types: logical, structural, transition, dependency. Ask "What depends on what?"
Every lesson in this course maps to one or more of these moves. The pillar connection box at the top tells you which.
Decision order: Try Fix first (most problems yield to it), then Set, then Shape, then Chain. If none feels right, re-read the constraints — you probably missed a variable you can fix or a structure you can exploit.

Try It — Classify These Problems

For each mini-problem below, decide which pillar is the primary thinking move. Some have a clear single answer; others could go two ways — pick the one you'd reach for first.

Given a sorted array, find the smallest element \(\geq k\).
You need \(\gcd(a, b, c)\). You know \(\gcd(a, b) = 6\).
Given \(a + b = 100\), and you want to maximize \(a \times b\).
Count the number of integers in \([1, n]\) not divisible by 2 or 3.
Compute the \(n\)-th term of the sequence \(a_n = 3a_{n-1} - a_{n-2}\).
You color node 1 of a graph red. What colors are forced on its neighbors?
Find the next greater element for each value in an array.
You have courses with prerequisites. Find a valid order to take them all.

Answers

Shape — the array is sorted (monotonic), so binary search applies.
Set — GCD is multiset intersection. \(\gcd(6, c)\) intersects the prime factors of \(6\) with those of \(c\).
Fix — fix \(a\), then \(b = 100 - a\). The product \(a(100 - a)\) is a single-variable function you can maximize. (You could also say Shape — the product is a concave parabola. Both valid.)
Set — inclusion-exclusion on the sets "divisible by 2" and "divisible by 3."
Chain — each term depends on the two before it. Write the recurrence, iterate. This is a transition chain.
Chain (logical) — one decision forces the neighbors' colors. That's a logical chain cascading through the graph.
Chain (structural) — maintain a decreasing monotonic stack. When a larger element arrives, it resolves everything smaller. That's a structural chain.
Chain (dependency) — each course depends on its prerequisites. Process them in topological order. That's a dependency chain.