Computing nCr Under Mod
Pillar: Chain — "Precomputing factorials is a chain: each \(\text{fact}[i] = \text{fact}[i-1] \times i\). Unwinding inverse factorials is the reverse chain."
The Setup
You know how to compute \(\binom{n}{k}\) — multiply and divide factorials. But the instant \(n\) exceeds about \(20\), the numbers overflow a 64-bit integer. The standard fix is to work modulo a prime \(m\) (usually \(10^9 + 7\)). Division becomes multiplication by a modular inverse.
The question: how do you organize this so that each \(\binom{n}{k}\) query costs \(O(1)\) after a single \(O(n)\) precomputation?
Three methods exist. Each wins in a different regime.
Method 1: Pascal's Triangle DP
The Idea
Pascal's identity:
No division, no inverses. Just addition modulo \(m\). Fill a 2D table bottom-up.
The Table (\(n\) up to \(5\))
| \(k=0\) | \(k=1\) | \(k=2\) | \(k=3\) | \(k=4\) | \(k=5\) | |
|---|---|---|---|---|---|---|
| \(n=0\) | \(1\) | |||||
| \(n=1\) | \(1\) | \(1\) | ||||
| \(n=2\) | \(1\) | \(2\) | \(1\) | |||
| \(n=3\) | \(1\) | \(3\) | \(3\) | \(1\) | ||
| \(n=4\) | \(1\) | \(4\) | \(6\) | \(4\) | \(1\) | |
| \(n=5\) | \(1\) | \(5\) | \(10\) | \(10\) | \(5\) | \(1\) |
Each cell is the sum of the cell directly above and the cell above-left.
Code
vector<vector<long long>> pascalTriangle(int maxN, long long mod) {
vector<vector<long long>> choose(maxN + 1, vector<long long>(maxN + 1, 0));
for (int totalItems = 0; totalItems <= maxN; totalItems++) {
choose[totalItems][0] = 1;
for (int chosenItems = 1; chosenItems <= totalItems; chosenItems++) {
choose[totalItems][chosenItems] = (choose[totalItems - 1][chosenItems - 1] + choose[totalItems - 1][chosenItems]) % mod;
}
}
return choose;
}
\(O(n^2)\) time, \(O(n^2)\) space.
When to Use
This is the right choice when \(n \leq 2000\) and \(m\) might not be prime. No modular inverse needed — Pascal's identity uses only addition. If the problem gives a composite modulus like \(10^6\) and \(n\) is small, this is your method.
The ceiling is hard: \(n = 2000\) means a table with \(4 \times 10^6\) entries. Push past \(n = 5000\) and you're burning too much memory and time.
Method 2: Precomputed Factorials + Inverse Factorials
This is THE standard method. Every competitive programmer uses it by default.
The Key Idea
The formula \(\binom{n}{k} = \frac{n!}{k!(n-k)!}\) becomes, under mod \(m\):
If you precompute \(\text{fact}[i] = i! \bmod m\) and \(\text{invFact}[i] = (i!)^{-1} \bmod m\) for all \(i\) from \(0\) to \(n\), then each query is three multiplications — \(O(1)\).
The forward chain builds factorials:
The backward chain builds inverse factorials. Here's the trick: you only need one modular exponentiation. Compute \(\text{invFact}[n] = \text{fact}[n]^{m-2} \bmod m\) using Fermat's Little Theorem (from Ch4 L3). Then unwind:
Why does this work? If \(\text{invFact}[i+1] = ((i+1)!)^{-1}\), then multiplying by \((i+1)\) gives:
That's the reverse chain. Each inverse factorial depends on the next one — you fill the array from right to left.
Full Trace: \(\binom{5}{2} \bmod 13\)
Large primes like \(10^9 + 7\) produce intermediate values that are impossible to verify by hand. To see the algorithm work end-to-end, use a small prime. Take \(m = 13\) and \(n = 5\):
Forward chain (\(m = 13\)):
| \(i\) | \(\text{fact}[i]\) |
|---|---|
| \(0\) | \(1\) |
| \(1\) | \(1\) |
| \(2\) | \(2\) |
| \(3\) | \(6\) |
| \(4\) | \(24 \bmod 13 = 11\) |
| \(5\) | \(11 \times 5 \bmod 13 = 55 \bmod 13 = 3\) |
Backward chain (\(m = 13\)):
\(\text{invFact}[5] = 3^{13-2} \bmod 13 = 3^{11} \bmod 13\).
\(3^2 = 9\), \(3^4 = 81 \bmod 13 = 3\), \(3^8 = 3^2 \bmod 13 = 9\), \(3^{11} = 3^8 \times 3^2 \times 3^1 = 9 \times 9 \times 3 = 243 \bmod 13 = 9\).
Check: \(3 \times 9 = 27 \bmod 13 = 1\). Confirmed: \(\text{invFact}[5] = 9\).
| \(i\) | \(\text{invFact}[i] = \text{invFact}[i+1] \times (i+1) \bmod 13\) |
|---|---|
| \(5\) | \(9\) |
| \(4\) | \(9 \times 5 = 45 \bmod 13 = 6\) |
| \(3\) | \(6 \times 4 = 24 \bmod 13 = 11\) |
| \(2\) | \(11 \times 3 = 33 \bmod 13 = 7\) |
| \(1\) | \(7 \times 2 = 14 \bmod 13 = 1\) |
| \(0\) | \(1 \times 1 = 1\) |
Verify each: \(\text{fact}[i] \times \text{invFact}[i] \bmod 13\) should be \(1\).
| \(i\) | \(\text{fact}[i]\) | \(\text{invFact}[i]\) | Product \(\bmod 13\) |
|---|---|---|---|
| \(0\) | \(1\) | \(1\) | \(1\) |
| \(1\) | \(1\) | \(1\) | \(1\) |
| \(2\) | \(2\) | \(7\) | \(14 \bmod 13 = 1\) |
| \(3\) | \(6\) | \(11\) | \(66 \bmod 13 = 1\) |
| \(4\) | \(11\) | \(6\) | \(66 \bmod 13 = 1\) |
| \(5\) | \(3\) | \(9\) | \(27 \bmod 13 = 1\) |
All check out. Now query:
And \(\binom{5}{2} = 10\). Confirmed.
The Code
Using the modpow function from Ch4 L2 and the precomputeFactorials function from Ch4 L4:
long long modpow(long long base, long long exponent, long long mod) {
long long result = 1;
base %= mod;
while (exponent > 0) {
if (exponent & 1) result = result * base % mod;
base = base * base % mod;
exponent >>= 1;
}
return result;
}
struct NCR {
vector<long long> factorial, inverseFactorial;
long long mod;
NCR(int limit, long long mod) : mod(mod) {
assert(limit < mod); // If limit >= mod, factorial[mod] = 0 and ALL inverse factorials become 0. Use Lucas' Theorem instead.
factorial.resize(limit + 1);
inverseFactorial.resize(limit + 1);
factorial[0] = 1;
for (int i = 1; i <= limit; i++) {
factorial[i] = factorial[i - 1] * i % mod;
}
inverseFactorial[limit] = modpow(factorial[limit], mod - 2, mod);
for (int i = limit - 1; i >= 0; i--) {
inverseFactorial[i] = inverseFactorial[i + 1] * (i + 1) % mod;
}
}
long long query(int totalItems, int chosenItems) {
if (chosenItems < 0 || chosenItems > totalItems) return 0;
return factorial[totalItems] % mod * inverseFactorial[chosenItems] % mod * inverseFactorial[totalItems - chosenItems] % mod;
}
};
\(O(n)\) precomputation, \(O(1)\) per query. Space: \(O(n)\) for the two arrays.
The single modpow call costs \(O(\log m)\), but it happens once. Everything else is \(O(n)\) linear work.
The Silent Array Death Trap: limit ≥ mod
If limit >= mod (e.g., limit = 20 with mod = 13), then factorial[13] = 13! mod 13 = 0. The forward loop propagates that zero: factorial[14], ..., factorial[20] are all 0. The inverse seed modpow(0, mod - 2, mod) = 0, and the backward loop propagates zero everywhere. Every nCr query returns 0.
This fails silently — no crash, no assertion, just wrong answers. The assert(limit < mod) in the constructor catches this. If your problem requires \(n \ge p\), you must use Lucas' Theorem (below) instead of the precomputation method.
Method 3: Lucas' Theorem
When You Need It
Two scenarios: the prime \(p\) is small (say \(p = 7\) or \(p = 23\)), or \(n\) is astronomically large (say \(10^{18}\)). In both cases, precomputing \(\text{fact}[0..n]\) is impossible — either \(n > p\) (so \(n! \equiv 0 \pmod{p}\) and the factorial method breaks) or \(n\) is simply too large to store.
The Statement
Lucas' Theorem: For a prime \(p\), write \(n\) and \(k\) in base \(p\):
Then:
Each \(\binom{n_i}{k_i}\) involves numbers less than \(p\), so you can compute them with a small Pascal table or directly. If any \(k_i > n_i\), the entire product is \(0\).
Worked Example: \(\binom{12}{5} \bmod 3\)
Write \(12\) and \(5\) in base \(3\):
- \(12 = 1 \times 9 + 1 \times 3 + 0 = (110)_3\)
- \(5 = 0 \times 9 + 1 \times 3 + 2 = (012)_3\)
Apply Lucas:
\(\binom{0}{2} = 0\) (can't choose \(2\) from \(0\)), so \(\binom{12}{5} \equiv 0 \pmod{3}\).
Check: \(\binom{12}{5} = 792 = 264 \times 3\). Indeed divisible by \(3\).
Another Example: \(\binom{10}{3} \bmod 7\)
- \(10 = 1 \times 7 + 3 = (13)_7\)
- \(3 = 0 \times 7 + 3 = (03)_7\)
Check: \(\binom{10}{3} = 120 = 17 \times 7 + 1\). So \(120 \bmod 7 = 1\). Confirmed.
Code
long long lucasNCR(long long totalItems, long long chosenItems, long long prime) {
long long result = 1;
while (totalItems > 0 || chosenItems > 0) {
long long totalDigit = totalItems % prime;
long long chosenDigit = chosenItems % prime;
if (chosenDigit > totalDigit) return 0;
long long numerator = 1, denominator = 1;
for (long long i = 0; i < chosenDigit; i++) {
numerator = numerator * (totalDigit - i) % prime;
denominator = denominator * (i + 1) % prime;
}
result = result % prime * (numerator % prime * modpow(denominator, prime - 2, prime) % prime) % prime;
totalItems /= prime;
chosenItems /= prime;
}
return result;
}
\(O(\log_p(n) \times p)\) time per query. No precomputation needed.
Decision Table
| Scenario | Method | Precompute | Query | Constraints |
|---|---|---|---|---|
| \(n \leq 2000\), any \(m\) | Pascal DP | \(O(n^2)\) | \(O(1)\) | Works with composite \(m\) |
| \(n \leq 10^7\), \(m\) prime | Fact + InvFact | \(O(n)\) | \(O(1)\) | \(m\) must be prime |
| \(n > m\) or \(m\) is small prime | Lucas | None | \(O(\log_p n \cdot p)\) | \(m\) must be prime |
| \(n \leq 10^7\), \(m\) composite, need exact \(\binom{n}{k} \bmod m\) | CRT + Lucas or Pascal | Varies | Varies | Factor \(m\) into prime powers |
The second row — precomputed factorials + inverse factorials — covers \(95\%\) of contest problems. When you see "answer modulo \(10^9 + 7\)," reach for this immediately.
Multinomial Coefficients
The multinomial coefficient counts the number of ways to partition \(n\) distinct items into groups of sizes \(k_1, k_2, \ldots, k_r\) where \(k_1 + k_2 + \cdots + k_r = n\):
Under mod, this is the same precomputed-factorial trick:
Application: CSES "Creating Strings II"
The problem: given a string of length \(n\), count the number of distinct permutations modulo \(10^9+7\).
If character \(c\) appears \(f_c\) times, the answer is:
This is a direct multinomial coefficient. Precompute factorials and inverse factorials up to \(n = 10^6\), then multiply.
Code
long long multinomial(int totalItems, vector<int>& groupSizes, vector<long long>& factorial, vector<long long>& inverseFactorial, long long mod) {
long long result = factorial[totalItems];
for (int groupSize : groupSizes) {
result = result % mod * inverseFactorial[groupSize] % mod;
}
return result;
}
\(O(r)\) per query, where \(r\) is the number of groups. The precomputation is the same \(O(n)\) NCR struct from above — just use its factorial and inverseFactorial arrays directly.
The Code (Complete)
All methods together. The NCR struct is the workhorse — use it by default.
#include <bits/stdc++.h>
using namespace std;
long long modpow(long long base, long long exponent, long long mod) {
long long result = 1;
base %= mod;
while (exponent > 0) {
if (exponent & 1) result = result * base % mod;
base = base * base % mod;
exponent >>= 1;
}
return result;
}
struct NCR {
vector<long long> factorial, inverseFactorial;
long long mod;
NCR(int limit, long long mod) : mod(mod) {
factorial.resize(limit + 1);
inverseFactorial.resize(limit + 1);
factorial[0] = 1;
for (int i = 1; i <= limit; i++) {
factorial[i] = factorial[i - 1] * i % mod;
}
inverseFactorial[limit] = modpow(factorial[limit], mod - 2, mod);
for (int i = limit - 1; i >= 0; i--) {
inverseFactorial[i] = inverseFactorial[i + 1] * (i + 1) % mod;
}
}
long long query(int totalItems, int chosenItems) {
if (chosenItems < 0 || chosenItems > totalItems) return 0;
return factorial[totalItems] % mod * inverseFactorial[chosenItems] % mod * inverseFactorial[totalItems - chosenItems] % mod;
}
};
long long multinomial(int totalItems, vector<int>& groupSizes, NCR& ncr) {
long long result = ncr.factorial[totalItems];
for (int groupSize : groupSizes) {
result = result % ncr.mod * ncr.inverseFactorial[groupSize] % ncr.mod;
}
return result;
}
long long lucasNCR(long long totalItems, long long chosenItems, long long prime) {
long long result = 1;
while (totalItems > 0 || chosenItems > 0) {
long long totalDigit = totalItems % prime;
long long chosenDigit = chosenItems % prime;
if (chosenDigit > totalDigit) return 0;
long long numerator = 1, denominator = 1;
for (long long i = 0; i < chosenDigit; i++) {
numerator = numerator * (totalDigit - i) % prime;
denominator = denominator * (i + 1) % prime;
}
result = result % prime * (numerator % prime * modpow(denominator, prime - 2, prime) % prime) % prime;
totalItems /= prime;
chosenItems /= prime;
}
return result;
}
vector<vector<long long>> pascalTriangle(int maxN, long long mod) {
vector<vector<long long>> choose(maxN + 1, vector<long long>(maxN + 1, 0));
for (int totalItems = 0; totalItems <= maxN; totalItems++) {
choose[totalItems][0] = 1;
for (int chosenItems = 1; chosenItems <= totalItems; chosenItems++) {
choose[totalItems][chosenItems] = (choose[totalItems - 1][chosenItems - 1] + choose[totalItems - 1][chosenItems]) % mod;
}
}
return choose;
}
Common Mistakes
-
Not handling edge cases in the query. \(\binom{n}{k}\) where \(k < 0\) or \(k > n\) should return \(0\). Forgetting the bounds check leads to out-of-bounds array access. The
if (chosenItems < 0 || chosenItems > totalItems) return 0;guard is essential. -
Precomputing up to \(n\) but querying \(\binom{n}{k}\) with \(n\) equal to the limit. Off-by-one: if you allocate
fact[0..limit]and the problem has \(n\) up to \(2 \times 10^5\), passlimit = 200000. Passinglimit = 199999and queryingquery(200000, k)reads past the array. -
Using the factorial method when \(n \geq m\). If \(n \geq m\) (where \(m\) is the prime modulus), then \(n! \equiv 0 \pmod{m}\) because one of the factors in the product is \(m\) itself. The inverse of \(0\) doesn't exist. For \(n \geq m\), use Lucas' theorem instead.
Quick Recap
- Pascal DP computes \(\binom{n}{k} \bmod m\) in \(O(n^2)\) with no inverses. Use it for \(n \leq 2000\) or composite \(m\).
- Precomputed factorials + inverse factorials is the standard: \(O(n)\) setup, \(O(1)\) query. One
modpowcall to seed \(\text{invFact}[n]\), then the backward chain fills the rest. - Lucas' theorem decomposes \(\binom{n}{k}\) into base-\(p\) digits. Use it when \(n\) is huge or \(p\) is small.
- Multinomial coefficients \(n! / (k_1! \cdots k_r!)\) use the same factorial/inverse-factorial arrays with \(r\) extra multiplications.
- The backward chain \(\text{invFact}[i] = \text{invFact}[i+1] \times (i+1)\) is the core trick — it avoids \(n\) separate
modpowcalls.
Problems
| Problem | Link | Difficulty | Focus |
|---|---|---|---|
| CSES — Creating Strings II | cses.fi/problemset/task/1715 | Medium | Multinomial coefficient with precomputed factorials |
| LC 1916 — Count Ways to Build Rooms | leetcode.com/problems/count-ways-to-build-rooms-in-an-ant-colony | Hard | Tree DP + multinomial nCr queries |
| CF 1288C — Two Arrays | codeforces.com/problemset/problem/1288/C | Medium | nCr under mod with stars-and-bars reduction |
| CSES — Binomial Coefficients | cses.fi/problemset/task/1079 | Easy | Direct application of precomputed fact + invFact |
| CC — SANDWICH | codechef.com/problems/SANDWICH | Hard | Stars & bars → nCr with \(N\) up to \(10^{18}\); composite modulus requires Lucas + Generalized Lucas + CRT. Tests the full pipeline: reduce to nCr, factor the modulus, apply Lucas per prime power, reassemble with CRT |
| CC — LUCASTH | codechef.com/problems/LUCASTH | Hard | Count \(k\) where \(\binom{n}{k} \not\equiv 0 \pmod{p}\) for \(n\) up to \(10^{501}\); the answer is \(\prod(d_i + 1)\) over the base-\(p\) digits of \(n\) — a direct consequence of Lucas' Theorem. Reveals that Lucas isn't just a computation tool but a structural theorem about which binomial coefficients survive mod \(p\) |
Next lesson: Stars and Bars — counting ways to distribute identical objects into distinct bins.