Implicit State Space Search
From Brute Force to Dijkstra on Combinatorics
The core insight is a shift in perspective: from enumerating sets to defining transitions between them. Some problems ask for the \(K\)-th best answer from an exponentially large set, and you can find it without looking at most of the set.
Define an implicit graph where the best solution is the start node, neighbors are "slightly worse" solutions, and a priority queue extracts them in order. This is Dijkstra's algorithm applied to combinatorial optimization — and it's deeply greedy.
The Problem: K Smallest M-Element Subset Sums
Given: \(N\) integers and two parameters \(M\) and \(K\). Find: The \(K\) smallest sums among all subsets of exactly \(M\) elements.
The Greedy Choice: Define an implicit graph where the best subset is the start node. Each transition slides one element right or activates a left neighbor. A min-heap extracts the \(K\) smallest in order.
With \(N\) up to \(200{,}000\) and \(K\) up to \(200{,}000\), we cannot enumerate all \(\binom{N}{M}\) subsets. We need \(O(N \log N + K \log K)\).
Instead of asking "which subsets have the smallest sums?", ask: "starting from the smallest subset, what's the cheapest single change I can make?"

Why Brute Force Fails
A naive approach: generate all \(\binom{N}{M}\) subsets, compute their sums, sort, and take the first \(K\). For \(N = 200{,}000\) and \(M = 100{,}000\), \(\binom{N}{M}\) is astronomically large. Simply pruning the enumeration doesn't scale either — you have no way to know which subsets are small without examining them.
The mental model shift: stop thinking about the set of all subsets. Start thinking about a graph of transitions between subsets. Each node is a subset, each edge is a single modification (swap one element for another), and the edge weight is the change in sum. If every transition increases the sum, Dijkstra finds the \(K\) smallest in order.
The Implicit Graph
Base State: The Smallest M-Element Subset
Sort the array. The smallest \(M\)-element subset is \(\{a[0], a[1], \ldots, a[M-1]\}\) — the first \(M\) elements. Its sum is our starting point.
State Representation
A state is a struct with four fields:
| Field | Name | Meaning |
|---|---|---|
| currentSum | The subset's sum | What we're minimizing |
| originalIndex | Which element in the initial prefix we're currently "sliding" | The active slider |
| currentIndex | Where that element actually sits now (may have slid right) | Its current position in the sorted array |
| rightBoundary | The upper limit this element cannot cross | Prevents elements from jumping over each other |
The initial state is {initialSum, M-1, M-1, N} — we start by potentially sliding the rightmost element of the prefix.
The "Lanes" Mechanic
The rightBoundary is the secret to avoiding duplicate subsets. Think of each element in the initial prefix as occupying a lane — a designated territory \([\text{originalIndex}, \text{rightBoundary})\) within which it can slide. When we activate a new slider to the left, its lane ends where the current slider started. Lanes never overlap. Since elements slide independently within non-overlapping lanes, no two sequences of transitions produce the same subset.
Two Transitions
From any state, we generate at most two children:
Transition 1 — Slide Further Right: Move the current element one position to the right.
The sum increases by the difference between adjacent sorted elements. The boundary stays the same.
Transition 2 — Activate Left Neighbor: Once the element at originalIndex has actually shifted (i.e., currentIndex != originalIndex), its original position is now vacant. We can start sliding the element to its left into that vacancy.
The new slider's rightBoundary is currentIndex (it cannot cross the element that already shifted). This transition is only valid when the current element has moved at least once — otherwise there's no vacancy to slide into.

Trace: \(a = [1, 2, 3, 7, 9]\), \(M = 3\), \(K = 4\)
Sorted array: \([1, 2, 3, 7, 9]\). Base subset: \(\{1, 2, 3\}\), sum \(= 6\).
Initial state pushed to min-heap: {6, 2, 2, 5}.
Extraction 1: Pop {6, 2, 2, 5}
Output: 6 (subset \(\{a[0], a[1], a[2]\} = \{1, 2, 3\}\))
| Transition | Condition | New State | Meaning |
|---|---|---|---|
| Slide right | \(\text{pos}+1=3 < \text{bound}=5\) | \(\{10, 2, 3, 5\}\) | Replace 3 with 7: \(\{1,2,7\}\) |
| Activate left | idx\(=2 > 0\) but idx\(=\)pos → NO | — | Can't activate: element hasn't shifted yet |
Heap: [{10, 2, 3, 5}]
Extraction 2: Pop {10, 2, 3, 5}
Output: 10 (subset \(\{1, 2, 7\}\))
| Transition | Condition | New State | Meaning |
|---|---|---|---|
| Slide right | \(\text{pos}+1=4 < \text{bound}=5\) | \(\{12, 2, 4, 5\}\) | Replace 7 with 9: \(\{1,2,9\}\) |
| Activate left | idx\(=2 > 0\) and idx \(\neq\) pos\(=3\) | \(\{11, 1, 2, 3\}\) | Also slide element 1: \(\{1,3,7\}\) |
Heap: [{11, 1, 2, 3}, {12, 2, 4, 5}]
Extraction 3: Pop {11, 1, 2, 3}
Output: 11 (subset \(\{1, 3, 7\}\))
| Transition | Condition | New State | Meaning |
|---|---|---|---|
| Slide right | \(\text{pos}+1=3 < \text{bound}=3\) → NO | — | Element 1 can't cross into element 2's lane |
| Activate left | idx\(=1 > 0\) and idx \(\neq\) pos\(=2\) | \(\{12, 0, 1, 2\}\) | Also slide element 0: \(\{2,3,7\}\) |
Heap: [{12, 2, 4, 5}, {12, 0, 1, 2}]
Extraction 4: Pop {12, ...} (tie-break arbitrarily)
Output: 12
Final output: 6 10 11 12
Manual Verification
All 3-element subsets of \([1, 2, 3, 7, 9]\):
| Subset | Sum |
|---|---|
| \(\{1, 2, 3\}\) | 6 |
| \(\{1, 2, 7\}\) | 10 |
| \(\{1, 3, 7\}\) | 11 |
| \(\{1, 2, 9\}\) | 12 |
| \(\{2, 3, 7\}\) | 12 |
| \(\{1, 3, 9\}\) | 13 |
| \(\{2, 3, 9\}\) | 14 |
| \(\{1, 7, 9\}\) | 17 |
| \(\{2, 7, 9\}\) | 18 |
| \(\{3, 7, 9\}\) | 19 |
Sorted: 6, 10, 11, 12, 12, 13, 14, 17, 18, 19. The first 4 are 6, 10, 11, 12. Correct.
Complete Solution
#include <bits/stdc++.h>
using namespace std;
struct SubsetState {
long long currentSum;
int originalIndex;
int currentIndex;
int rightBoundary;
bool operator>(const SubsetState& other) const {
return currentSum > other.currentSum;
}
};
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int numElements, subsetSize, targetK;
cin >> numElements >> subsetSize >> targetK;
vector<long long> elements(numElements);
for (int i = 0; i < numElements; i++) {
cin >> elements[i];
}
sort(elements.begin(), elements.end());
long long initialSum = 0;
for (int i = 0; i < subsetSize; i++) {
initialSum += elements[i];
}
priority_queue<SubsetState, vector<SubsetState>, greater<SubsetState>> minHeap;
minHeap.push({initialSum, subsetSize - 1, subsetSize - 1, numElements});
while (targetK--) {
SubsetState top = minHeap.top();
minHeap.pop();
cout << top.currentSum << (targetK == 0 ? "" : " ");
if (top.currentIndex + 1 < top.rightBoundary) {
minHeap.push({
top.currentSum + elements[top.currentIndex + 1] - elements[top.currentIndex],
top.originalIndex,
top.currentIndex + 1,
top.rightBoundary
});
}
if (top.originalIndex > 0 && top.currentIndex != top.originalIndex) {
minHeap.push({
top.currentSum + elements[top.originalIndex] - elements[top.originalIndex - 1],
top.originalIndex - 1,
top.originalIndex,
top.currentIndex
});
}
}
cout << "\n";
return 0;
}
The code is remarkably short for the conceptual depth involved. The entire implicit graph machinery fits in a single while loop with two push operations.
Complexity Analysis
| Component | Cost |
|---|---|
| Sort the array | \(O(N \log N)\) |
| Compute base sum | \(O(M)\) |
| Extract \(K\) states from heap | \(O(K \log K)\) |
| Each extraction: at most 2 pushes | \(O(\log K)\) each |
| Total | \(O(N \log N + K \log K)\) |
The heap never holds more than \(O(K)\) elements (we push at most 2 per extraction), so each heap operation is \(O(\log K)\).
Space: \(O(N + K)\) for the array and heap.
Why This Is Greedy
The implicit state search is greedy in the purest sense:
-
Always pick the best available option: The min-heap guarantees we extract subset sums in increasing order. We never revisit or reconsider a choice.
-
No backtracking: Each extracted state produces the next output. Once extracted, it's done.
-
Monotonic progress: Each transition increases the sum (since the array is sorted, \(a[\text{pos}+1] \geq a[\text{pos}]\) and \(a[\text{idx}] \geq a[\text{idx}-1]\)). Child states are always \(\geq\) parent states. This is why Dijkstra works — edge weights are non-negative.
-
Local transitions, global order: We only look at two local moves (slide right, activate left neighbor), yet the priority queue ensures we discover the globally \(K\)-th smallest sum.
This is Dijkstra's algorithm applied to a combinatorial state space rather than a graph with explicit edges. The "graph" has exponentially many nodes (one per subset), but we only visit \(K\) of them.
Common Pitfalls
1. Forgetting the currentIndex != originalIndex guard on Transition 2. If the element hasn't shifted yet, there's no vacancy to its left. Without this guard, you generate invalid states and count subsets multiple times.
2. Using int instead of long long. With \(N\) up to \(200{,}000\) and values up to \(10^9\), subset sums overflow 32-bit integers.
3. Confusing the role of rightBoundary. The right boundary is NOT the array size for all states. It's inherited from the parent state and shrinks as you activate left neighbors. Mixing this up causes duplicate subsets.
Variations and Extensions
K-th Smallest Subset Sum (Any Size)
If \(M\) is not fixed (find the \(K\) smallest subset sums of any size), the implicit graph simplifies: - Start with the empty set (sum = 0) or the smallest single element - Transitions: "add the next element" or "replace the last element with the next one"
The fixed-size version is harder because the "activate left neighbor" transition is needed to navigate between \(M\)-element subsets without gaps.
K Largest Instead of K Smallest
Reverse the sort order (descending) and use the same algorithm. Or equivalently, use a max-heap and flip the transition signs.
Try It
Work through this example by hand before coding:
Array: \([2, 5, 8, 11, 14]\), \(M = 2\), \(K = 5\)
- What is the base subset and its sum?
- Draw the first 3 levels of the implicit graph (start node + its children + their children).
- In what order does the min-heap extract these states?
- Verify against brute force: list all \(\binom{5}{2} = 10\) pairs and their sums, sort them, and check that your first 5 match.
Expected output: 7 10 13 13 16
Predict before running: For array \([1, 1, 1, 1, 100]\), \(M = 3\), \(K = 4\), what does the algorithm output? How many distinct sums appear?
Challenge: Modify the algorithm to output the \(K\) largest \(M\)-element subset sums instead. What changes in the state definition and transitions?
Problems
| Problem | Link | Difficulty |
|---|---|---|
| CSES — K-Subset Sum 2 | cses.fi | Hard |
| K-th Smallest Subset Sum (any size) | Various OJs | Medium-Hard |