Defining States: The Art of DP

The hardest part of dynamic programming isn't the code. It's not the recurrence. It's not even figuring out the base case. The hardest part is answering one question:

What is my state?

Get the state right, and the transitions usually write themselves. Get it wrong, and no amount of cleverness will save you. You'll stare at the problem for an hour, try three different DP tables, and none of them work — because the state was broken from the start.

This page teaches you how to think about states. Not by showing you five DP problems and saying "here's the state for each." That teaches you to memorize, not to think. Instead, we'll build a systematic approach: extract the nouns, map them to parameters, and test whether your state captures enough information for a valid transition.

States Are Just Quantities

Here's the simplest way to think about a DP state: it's a snapshot of everything you need to know at a decision point, distilled into a few numbers.

Not "everything that happened so far." Not "the full history of choices." Just the numbers that determine what happens next. If two different histories lead to the same set of numbers, they're the same state — and you only need to solve it once. That's where the speedup comes from.

Let's make this concrete.

The Cheapest Flights Problem

You're in city \(A\). You want to reach city \(B\). Flights have prices. You're allowed at most \(K\) stops. Find the cheapest route.

What quantities exist in this problem? Read the statement carefully and pull out every noun:

City — where you currently are
Cost — how much you've spent so far
Stops — how many stops you've used

That gives you a state: (city, stops_used) → minimum cost to reach this city using this many stops.

Written as a DP table: dp[city][stops] = cheapest way to arrive at city having made exactly stops intermediate stops.

The transition: for each flight from prev_city to city with price \(p\), check if dp[prev_city][stops - 1] + p improves dp[city][stops].

Notice what happened. We didn't start with "this looks like a shortest path problem, let me use Dijkstra." We didn't pattern-match to a template. We just listed the quantities in the problem statement and asked: which of these do I need to track to know what I can do next?

The Noun Extraction Technique

This is the method. Read the problem. Pull out every noun — every quantity, every constraint, every thing that's being counted or measured or limited. Then ask:

Which of these change as I make decisions? Those are candidate state parameters.
Which of these determine what I can do next? If knowing a quantity affects my future options, it belongs in the state.
Which am I trying to optimize? That's the value stored in the DP table, not a parameter of the state.

For cheapest flights: city changes (I move), stops change (I use one each time), and cost is what I'm minimizing. So city and stops are state parameters. Cost is the DP value.

Most beginners skip this step. They jump straight to "what DP have I seen that looks like this?" and try to retrofit a memorized pattern. That works on easy problems. It falls apart on anything unfamiliar.

Parameters vs. Values: What Goes Where

A common source of confusion: should a quantity be a parameter of the state (part of the table indices) or the value stored at each state?

Rule of thumb: the thing you're optimizing (minimizing or maximizing) is the value. Everything else that matters is a parameter.

Problem	Parameters (state indices)	Value (stored)
Cheapest flights with K stops	city, stops_used	min cost
0/1 Knapsack	item_index, remaining_capacity	max profit
Edit Distance	position in string A, position in string B	min operations
Coin Change	amount remaining	min coins
LIS (basic)	index in array	max length ending here

Sometimes you can swap: instead of dp[city][stops] = min_cost, you could define dp[city][cost] = min_stops. Both are valid. One might be more efficient depending on the constraints. The point is that you choose which quantities are parameters and which is the value, based on what makes the transitions cleanest.

How Many Parameters Is Too Many?

Each parameter you add multiplies the state space. Two parameters with \(N\) values each give \(N^2\) states. Three give \(N^3\). At some point, you can't afford it.

Practical limits for competitive programming:

2 parameters, each up to \(10^3\)–\(10^4\): fine. \(10^6\)–\(10^8\) states, doable.
3 parameters: usually at least one needs to be small (say, \(\le 20\) or \(\le 100\)).
A set of up to 20 elements as a parameter: that's bitmask DP. \(2^{20} \approx 10^6\) states — feasible if the per-state work is small.

If your state has too many parameters, look for ways to reduce:

Can two parameters be combined? (e.g., "row and column" might be encodable as a single index)
Is one parameter derivable from the others? (e.g., if items are processed in order, the current item index might determine remaining capacity bounds)
Can you drop a parameter and still make valid transitions?

The LIS Problem: States as Chains

The Longest Increasing Subsequence is one of the most instructive DP problems, because the obvious state works but is slow, and the clever state requires you to rethink what you're tracking.

The Obvious State

dp[i] = length of the longest increasing subsequence ending at index \(i\).

Transition: for each \(j < i\) where \(A[j] < A[i]\):

\[dp[i] = \max(dp[i], \, dp[j] + 1)\]

For each \(i\), you check all previous indices \(j\). That's \(O(N^2)\).

This works fine for \(N \le 5000\). But for \(N = 10^5\) or larger, it's too slow.

Thinking in Chains

Let's change how we think about what we're building. An increasing subsequence is a chain — each element links to the next, and the next must be larger. dp[i] stores the length of the chain ending at position \(i\).

Now here's the greedy insight. Suppose two chains have the same length — say, both have length 3. One ends with the value 7, the other ends with 5. Which chain is "better" for future extensions?

The one ending with 5. Because any element that can extend the chain ending at 7 can also extend the chain ending at 5 — but not vice versa. A value of 6 would extend the 5-chain but not the 7-chain.

So if two chains have the same length, only keep the one with the smaller ending element. It strictly dominates.

The Smart Storage

Build an array tails[] where tails[k] = smallest possible ending element of any increasing subsequence of length \(k + 1\).

As you process each element \(x\) of the input:

If \(x\) is larger than everything in tails, append it. You've found a longer chain.
Otherwise, find the first element in tails that's \(\ge x\) and replace it with \(x\). You're saying: "I found a better (smaller) way to end a chain of this length."

Here's the crucial observation: tails is always sorted. Why? Because if you have a chain of length 5 ending at value 10, the chain of length 4 (which is a prefix of that chain, minus the last element) must end at something less than 10. Shorter chains always end with smaller values.

And since tails is sorted, you can binary search it. Finding the position to replace takes \(O(\log N)\).

The Implementation

vector<int> tails;
for (int x : A) {
    auto it = lower_bound(tails.begin(), tails.end(), x);
    if (it == tails.end())
        tails.push_back(x);   // extend the longest chain
    else
        *it = x;               // improve an existing chain
}
// answer is tails.size()

Total: \(O(N \log N)\).

Let's trace through A = [3, 1, 4, 1, 5, 9, 2, 6]:

Element	Action	`tails` after
3	Append (empty)	`[3]`
1	Replace 3 (1 < 3)	`[1]`
4	Append (4 > 1)	`[1, 4]`
1	Replace 1 (no change, 1 = 1)	`[1, 4]`
5	Append (5 > 4)	`[1, 4, 5]`
9	Append (9 > 5)	`[1, 4, 5, 9]`
2	Replace 4 (first \(\ge\) 2 is 4)	`[1, 2, 5, 9]`
6	Replace 9 (first \(\ge\) 6 is 9)	`[1, 2, 5, 6]`

Answer: 4. The longest increasing subsequence has length 4.

Important warning: tails is NOT an actual subsequence. After processing, tails = [1, 2, 5, 6], but the actual LIS might be [1, 4, 5, 9] or [1, 4, 5, 6]. The array tails just tracks the best possible ending values for each chain length. If you need the actual subsequence, you'll need to track predecessors separately.

What Made This Work?

Two things:

Redefining the state. Instead of "length of LIS ending at index \(i\)," we track "smallest ending value for each chain length." Different question, same underlying information, much faster to update.
Exploiting monotonicity. The tails array is always sorted, which means binary search applies. The Seeing Numbers as Curves page explained why sorted structure unlocks binary search — here's a direct application.

This is the general pattern for state optimization: find a smarter representation that maintains some structural property (sorted order, convexity, monotonicity), then exploit that property for speed.

Experimenting With States

Here's the habit that separates good problem-solvers from great ones: they try multiple state definitions before committing.

Most students find one state that seems reasonable and go all-in. If it works, great. If it doesn't, they're stuck — they've already spent 30 minutes building on a foundation that can't support the solution.

Better approach: spend 5 minutes brainstorming several possible states. For each one, ask:

Can I write the transition? If you define the state and can't figure out how to move from one state to another, the state is probably wrong — or at least incomplete.
Is the state space small enough? Count the number of possible values for each parameter. Multiply them. If the product exceeds \(\sim 10^8\), you need fewer parameters or smaller ranges.
Does the state capture enough to make decisions? If you're at state \((x, y)\) and you realize you also need to know \(z\) to decide your next move, then \(z\) belongs in the state.

Example: Revisiting Cheapest Flights

We defined dp[city][stops] = min_cost. But could we use dp[city][cost] = min_stops? Let's check:

Transition: from (city, cost), take a flight to next_city with price \(p\), arriving at state (next_city, cost + p).
State space: cities \(\times\) possible costs. If costs go up to \(10^7\), this is too large.
Decision capability: at each state we know where we are and how much we've spent, so we can determine valid next moves. ✓

The transition works, but the state space is too big. So dp[city][stops] = min_cost is better — stops is bounded by \(K\) (usually small), while cost can be huge.

This is the kind of reasoning you should practice. Both states are "correct" in the sense that they model the problem. One is just more practical.

The Connection to Sets

Go back to the vocabulary from Containers, Sets, and Operations. A DP state is a container — it holds information about a partial solution. The transition is an operation on that container — you're either adding one element (extend by a single choice) or merging two containers (combine sub-solutions).

When you define a state, you're really answering: "what does my container hold, and what's the cheapest summary I can get away with?"

For LIS, the "container" at each step is a set of chains. The summary is the tails array — the smallest ending value per chain length. That summary is lossy (you can't reconstruct which elements are in each chain), but it preserves exactly the information needed for future transitions.

For knapsack, the "container" is a set of items you've considered. The summary is the best profit for each possible weight used. Again, lossy — you don't know which items you picked — but sufficient for transitions.

The art of DP is finding the minimal sufficient summary of your container. Track too much and the state space explodes. Track too little and you can't write valid transitions. The sweet spot is what makes a problem solvable.

Practice Problems

These problems are ordered by how much "state design" thinking they require. The first few have obvious states; the later ones force you to be creative.

Warm-Up: Clear States

1. Climbing Stairs (LC 70)

State: dp[i] = number of ways to reach step \(i\).

Transition: dp[i] = dp[i-1] + dp[i-2]. You either came from one step below or two steps below.

Key lesson: The simplest possible state. One parameter (step number), one value (count of ways). Start here to verify you understand the framework.

2. Coin Change (LC 322)

State: dp[amount] = minimum coins needed to make amount.

Transition: For each coin \(c\), dp[amount] = min(dp[amount], dp[amount - c] + 1).

Key lesson: The "noun" is the amount remaining. That's the only thing that determines your options going forward. You don't need to know which coins you used — just how much is left.

3. 0/1 Knapsack

State: dp[i][w] = max profit using items \(1..i\) with capacity \(w\).

Transition: Either skip item \(i\) (dp[i-1][w]) or take it (dp[i-1][w - weight_i] + profit_i).

Key lesson: Two parameters because two constraints interact: which items you've considered and how much weight you've used. Neither alone is sufficient.

Core: State Design Matters

4. Cheapest Flights Within K Stops (LC 787)

State: dp[city][stops] = min cost to reach city using at most stops stops.

Transition: For each flight (u, v, price): dp[v][stops+1] = min(dp[v][stops+1], dp[u][stops] + price).

Key lesson: This is the example from the main text. The temptation is to use Dijkstra, but the stop limit makes it a DP problem. The "stops" parameter is what makes it different from plain shortest path.

Implementation note: Process by number of stops (layer by layer), like BFS. This is essentially Bellman-Ford limited to \(K\) iterations.

5. Longest Increasing Subsequence (LC 300)

State (basic): dp[i] = length of LIS ending at index \(i\). \(O(N^2)\).

State (optimized): tails[k] = smallest ending value for a chain of length \(k+1\). Binary search gives \(O(N \log N)\).

Key lesson: Same problem, two different state definitions, dramatically different performance. The optimized state exploits the fact that tails is always sorted.

See the LIS section above for the full walkthrough.

6. Edit Distance (LC 72)

State: dp[i][j] = min operations to convert the first \(i\) characters of word1 into the first \(j\) characters of word2.

Transition: Three choices at each step — insert, delete, or replace — each corresponding to moving in a different direction in the 2D table.

Key lesson: The "nouns" are positions in each string. Both positions change independently based on your operation choice. This is a classic 2D state that can't be reduced to 1D.

Harder: Creative State Design

7. N-Queens II (LC 52)

State: Which rows have been filled, and which columns/diagonals are under attack.

Approach: Process row by row. For each row, try placing a queen in each column that isn't attacked. The "state" is the set of attacked columns, attacked diagonals (\(r - c\)), and attacked anti-diagonals (\(r + c\)). With bitmask DP (board size \(\le 15\)), these sets fit in integers.

Key lesson: The state isn't about "where queens are" — it's about "what's still available." The constraints (attacked columns/diagonals) determine your future options, so that's what belongs in the state.

8. Reconstruct Itinerary (LC 332)

State: Current airport + set of remaining tickets.

Approach: This is a graph problem (Euler path), but if you think of it as a state problem: your state is where you are and which tickets you haven't used yet. The transition is using a ticket to fly somewhere.

Key lesson: The "set of remaining tickets" is a container. For small inputs, you could bitmask it. For this problem, a greedy DFS with Hierholzer's algorithm is more efficient — but the state-based thinking still clarifies what you're tracking.

9. Best Time to Buy and Sell Stock with Transaction Fee (LC 714)

State: dp[i][holding] where holding is 0 or 1.

dp[i][0] = max profit at day \(i\) without holding stock. dp[i][1] = max profit at day \(i\) while holding stock.

Transition: - dp[i][0] = max(dp[i-1][0], dp[i-1][1] + price[i] - fee) — either do nothing or sell. - dp[i][1] = max(dp[i-1][1], dp[i-1][0] - price[i]) — either do nothing or buy.

Key lesson: The "holding" flag is the insight. Without it, you don't know whether you can sell on the next day. One boolean parameter doubles the state space but makes transitions clean. This is a common pattern: a small flag that captures your "mode" or "phase."

What's Next

State definition is the core skill. But defining the state is only half the battle — you also need to transition between states efficiently. Many of the optimization techniques covered in When Shapes Survive: Function Addition — convex hull trick, slope trick, WQS binary search — are tools for speeding up transitions when the cost function has a nice shape.

For now, the practice is: every time you see a DP problem, don't jump to code. Spend 5 minutes listing nouns, sketching possible states, and checking whether each one supports valid transitions. That habit will serve you on every DP problem you ever encounter.