Counting Orderings

A Deceptively Hard Question
How many valid topological orderings does a DAG have? For a chain 1 → 2 → 3, the answer is 1. For three disconnected nodes, it's 3! = 6. For anything in between, the answer depends on the exact structure of the DAG.
This problem is #P-hard in general -- meaning there's no known polynomial-time algorithm. But when N is small (say, N ≤ 20), bitmask DP solves it in O(2^N * N).
The Bitmask DP Idea
Think of building the topological ordering one node at a time, left to right. At each step, you've placed some subset of nodes. The next node you place must have all its predecessors already placed.
State: A bitmask mask representing which nodes have been placed so far.
Transition: For each node i not yet in mask, if all predecessors of i are in mask, then i is a valid next choice.
Recurrence:
for each valid i.Base: dp[0] = 1 (empty arrangement, one way to have placed nothing).
Answer: dp[(1 << N) - 1] (all nodes placed).
💡 Key insight. This is the same bitmask DP pattern used for TSP and Hamiltonian paths. The "which items have been placed" bitmask is one of the most versatile DP states in competitive programming.
Predecessor Bitmask
To check "all predecessors of i are in mask" efficiently, precompute a bitmask pred[i] where bit j is set if j must come before i.
int pred[20] = {};
for (int i = 0; i < m; i++) {
int u, v;
cin >> u >> v;
u--; v--; // 0-indexed
pred[v] |= (1 << u);
}
Now the check becomes: (pred[i] & mask) == pred[i].
This is a subset check. It asks: "Is pred[i] a subset of mask?" If yes, every predecessor of i has been placed.
The Full Algorithm
long long dp[1 << 20];
memset(dp, 0, sizeof(dp));
dp[0] = 1;
for (int mask = 0; mask < (1 << n); mask++) {
if (dp[mask] == 0) continue;
for (int i = 0; i < n; i++) {
if (mask & (1 << i)) continue; // i already placed
if ((pred[i] & mask) == pred[i]) // all preds placed
dp[mask | (1 << i)] += dp[mask];
}
}
cout << dp[(1 << n) - 1] << endl;
Trace on a Small Example
N = 3, constraints: 0 → 1, 0 → 2 (node 0 must come before both 1 and 2).
Predecessors: pred[0] = 000, pred[1] = 001, pred[2] = 001
| mask (binary) | mask (set) | dp[mask] | Valid next nodes | Transitions |
|---|---|---|---|---|
| 000 | {} | 1 | 0 (pred=000 ⊆ {}) | dp[001] += 1 |
| 001 | {0} | 1 | 1 (pred=001 ⊆ {0}), 2 (pred=001 ⊆ {0}) | dp[011] += 1, dp[101] += 1 |
| 010 | {1} | 0 | -- | skip |
| 011 | {0,1} | 1 | 2 (pred=001 ⊆ {0,1}) | dp[111] += 1 |
| 100 | {2} | 0 | -- | skip |
| 101 | {0,2} | 1 | 1 (pred=001 ⊆ {0,2}) | dp[111] += 1 |
| 110 | {1,2} | 0 | -- | skip |
| 111 | {0,1,2} | 2 | -- | done |
Answer: 2 orderings. They are: [0, 1, 2] and [0, 2, 1].
Makes sense: 0 must go first, then 1 and 2 can go in either order.
AtCoder abc041_d
Problem: N items (N ≤ 16) with M precedence constraints. Count valid orderings.
This is the exact problem our bitmask DP solves:
#include <bits/stdc++.h>
using namespace std;
int main() {
int n, m;
cin >> n >> m;
int pred[16] = {};
for (int i = 0; i < m; i++) {
int u, v;
cin >> u >> v;
u--; v--;
pred[v] |= (1 << u);
}
vector<long long> dp(1 << n, 0);
dp[0] = 1;
for (int mask = 0; mask < (1 << n); mask++) {
if (dp[mask] == 0) continue;
for (int i = 0; i < n; i++) {
if (mask & (1 << i)) continue;
if ((pred[i] & mask) == pred[i])
dp[mask | (1 << i)] += dp[mask];
}
}
cout << dp[(1 << n) - 1] << endl;
}
For N = 16, we have 2^16 * 16 = 1,048,576 transitions. Very fast.
Why This Works But Polynomial Algorithms Don't
The number of topological orderings can be exponential. A DAG with no edges on N nodes has N! orderings. Any algorithm that counts them must at least keep track of which subsets have been "completed," and there are 2^N subsets.
The bitmask DP is optimal in the sense that it visits each (mask, node) pair at most once. You can't do much better without exploiting special DAG structure.
⚠ Gotcha. For N > 20, this approach is infeasible. 2^20 * 20 is about 20 million -- fine. 2^25 * 25 is about 800 million -- too slow. For large N, counting topological orderings requires specialized techniques (or the problem has extra structure to exploit).
Larger Trace (N = 4)
N = 4, constraints: 0 → 2, 1 → 3
Predecessors: pred[0] = 0000, pred[1] = 0000, pred[2] = 0001, pred[3] = 0010
| mask | Set | dp | Valid next | Key transitions |
|---|---|---|---|---|
| 0000 | {} | 1 | 0, 1 | dp[0001]+=1, dp[0010]+=1 |
| 0001 | {0} | 1 | 1, 2 | dp[0011]+=1, dp[0101]+=1 |
| 0010 | {1} | 1 | 0, 3 | dp[0011]+=1, dp[1010]+=1 |
| 0011 | {0,1} | 2 | 2, 3 | dp[0111]+=2, dp[1011]+=2 |
| 0101 | {0,2} | 1 | 1 | dp[0111]+=1 |
| 1010 | {1,3} | 1 | 0 | dp[1011]+=1 |
| 0111 | {0,1,2} | 3 | 3 | dp[1111]+=3 |
| 1011 | {0,1,3} | 3 | 2 | dp[1111]+=3 |
| 1111 | all | 6 | -- | done |
(Masks not shown have dp = 0.)
Answer: 6. The constraints 0→2 and 1→3 are weak. Without them, 4! = 24. Each constraint roughly halves the count.
Connection to the DP Course
If you've studied bitmask DP before (e.g., TSP), the pattern is identical:
| Problem | State | Transition condition |
|---|---|---|
| TSP | mask = visited cities | Last city has edge to next city |
| Topo orderings | mask = placed nodes | All predecessors of next node in mask |
| Assignment problem | mask = assigned tasks | Task i available for person |
The bitmask represents "what's been done." The inner loop asks "what can be done next." This pattern appears everywhere in competitive programming when N ≤ 20.
Try It
The starter code has the structure. Fill in the DP loop.
Test case 1 -- chain:
Test case 2 -- no constraints:
Test case 3 -- partial constraints:
Challenge: Modify the code to also print all valid topological orderings, not just count them. Hint: backtrack from dp[(1<<n)-1], choosing which node was placed last.
Problems
| Problem | Link | Difficulty |
|---|---|---|
| AtCoder abc041_d — 切り絵 | atcoder.jp/contests/abc041/tasks/abc041_d | Medium |
| LC 1857 — Largest Color Value in a Directed Graph | leetcode.com/problems/largest-color-value-in-a-directed-graph | Hard |