The Rules of Modular World
Pillar: Fix — "Fix the modulus. Then addition, subtraction, and multiplication all stay within \([0, \text{mod})\). Division doesn't — that's the next lesson."
The Setup
A problem says: "Return the answer modulo \(10^9 + 7\)." You've computed a sum of \(3 \times 10^{18}\). You slap % (1e9 + 7) on the result and submit. Wrong answer. The sum overflowed long long three intermediate steps ago, and by the time you took the mod, the damage was done.
Modular arithmetic fixes this. You don't compute the giant number and then reduce — you reduce at every step. Addition, subtraction, multiplication all have clean rules that let you work entirely within \([0, \text{mod})\), never exceeding the modulus squared. But you need to know exactly which operations play nice with mod, which one doesn't, and where the implementation traps hide.
What Mod Means
Start with a concrete example. \(17 \div 5 = 3\) remainder \(2\). That remainder is the mod: \(17 \bmod 5 = 2\).
The formal way to say this: \(17 \equiv 2 \pmod{5}\), read "\(17\) is congruent to \(2\) modulo \(5\)." Two numbers are congruent mod \(m\) when their difference is divisible by \(m\).
Congruence means: \(a \equiv b \pmod{m}\) if and only if \(m \mid (a - b)\).
Check: \(17 - 2 = 15\), and \(5 \mid 15\). So \(17 \equiv 2 \pmod{5}\).
Another example: \(-3 \equiv 4 \pmod{7}\) because \(7 \mid (-3 - 4) = -7\). Negative numbers are congruent to positive ones — the mod operation wraps around like a clock.
The intuition: modular arithmetic is a world where numbers wrap around after reaching \(m\). The number line becomes a circle of circumference \(m\). Every integer maps to exactly one value in \(\{0, 1, 2, \ldots, m-1\}\).
The Four Rules
These are the operations that survive modular reduction. Each one lets you reduce operands before combining them.
Addition
You can reduce each operand first, add the reduced values, then reduce once more. The outer mod is needed because two values in \([0, m)\) can sum to at most \(2m - 2\).
Subtraction
Same idea, but the \(+m\) is critical. Without it, the difference can go negative. If \(a \bmod m = 1\) and \(b \bmod m = 4\) with \(m = 5\), then \(1 - 4 = -3\). Adding \(m\) gives \(-3 + 5 = 2\), and \(2 \bmod 5 = 2\). Correct.
Multiplication
Reduce each factor first, multiply, then reduce once more. The product of two values in \([0, m)\) is at most \((m-1)^2\), which fits in long long when \(m < 3 \times 10^9\).
Proof for multiplication. Write \(a = q \cdot m + r_a\) and \(b = q' \cdot m + r_b\), where \(r_a = a \bmod m\) and \(r_b = b \bmod m\). Then:
The first three terms all contain \(m\) as a factor, so they vanish under mod \(m\). What remains is \(r_a \cdot r_b \bmod m\). That's \((a \bmod m) \times (b \bmod m) \bmod m\). Done.
The proofs for addition and subtraction follow the same decomposition — write each operand as \(qm + r\), expand, and observe that every term containing \(m\) vanishes.
Division: Does NOT Work
Here's the trap. \(6 / 2 = 3\). But with \(m = 5\): \((6 \bmod 5) / (2 \bmod 5) = 1 / 2\). That's not an integer. Even if you try \(\lfloor 1/2 \rfloor = 0\), that's not \(3 \bmod 5 = 3\).
Division under mod requires a modular inverse — a different operation entirely. You can't just divide remainders. Ch4 L3 covers the fix.
The Negative Mod Trap
This is the single most common bug in modular arithmetic code.
In C++ and Java, the % operator can return negative values. Specifically, (-1) % 7 evaluates to -1, not 6. The language defines % as the remainder with the sign of the dividend, not the mathematical modulo.
Consider a problem where you compute (a - b) % mod. If a < b, the result is negative. Your answer is off by exactly mod, and you get Wrong Answer.
The universal fix:
Concrete example. You need \((3 - 10) \bmod 7\). Direct computation: (3 - 10) % 7 = (-7) % 7 = 0 — this one happens to work by luck. But try \((2 - 10) \bmod 7\): (2 - 10) % 7 = (-8) % 7 = -1. The correct answer is \(6\). Applying the fix: ((-1) % 7 + 7) % 7 = (-1 + 7) % 7 = 6. Correct.
Python does not have this problem — Python's % always returns a non-negative result when the divisor is positive. But in C++ and Java, you must always guard against it. Build the +mod into your helper functions so you never forget.
Overflow Prevention
When \(a\) and \(b\) are both near \(10^9\) and \(m = 10^9 + 7\), the product \(a \times b\) can reach \(\sim 10^{18}\). A long long holds up to about \(9.2 \times 10^{18}\), so a single multiplication is safe. But if you chain operations carelessly — multiplying three values without reducing between each step — you overflow.
The rule: reduce after every operation. Never let intermediate results exceed the range of your data type.
Two situations need extra care:
When \(m\) fits in 32 bits but products don't fit in 32 bits. Cast to long long before multiplying: (long long)a * b % mod. Without the cast, a * b overflows int before the mod even happens.
When \(m > 3 \times 10^9\). Now \((m-1)^2 > 9 \times 10^{18}\), which exceeds long long range. Two options in C++ (GCC):
- Use
__int128:(__int128)a * b % mod. This is a GCC extension, not standard C++, but every major competitive programming judge supports it. - Implement modular multiplication via repeated addition with doubling (analogous to binary exponentiation, but for multiplication). Rarely needed in practice.
For \(m = 10^9 + 7\) — the standard modulus — none of this matters. \((10^9 + 6)^2 \approx 10^{18}\), well within long long. But know the limits.
Why \(10^9 + 7\)?
Almost every competitive programming problem that asks for a modular answer uses \(10^9 + 7\). This isn't arbitrary. Three properties make it the standard:
It's prime. A prime modulus guarantees that every nonzero element has a modular inverse. This comes from Fermat's Little Theorem (Ch4 L3): if \(p\) is prime and \(\gcd(a, p) = 1\), then \(a^{p-2} \equiv a^{-1} \pmod{p}\). With a composite modulus, some elements have no inverse, and division becomes impossible. Primes avoid this entirely.
It fits in a 32-bit integer. \(10^9 + 7 = 1\,000\,000\,007 < 2^{31} - 1 = 2\,147\,483\,647\). You can store modular values in int if needed, though long long is safer for intermediate products.
Its square fits in a 64-bit integer. \((10^9 + 7)^2 \approx 10^{18} < 9.2 \times 10^{18} = 2^{63} - 1\). This means you can multiply two reduced values and take the mod without any risk of long long overflow. No __int128 needed.
Any prime under roughly \(3 \times 10^9\) has properties two and three. But \(10^9 + 7\) is easy to remember and has become the universal convention.
Congruence Notation
Two ways to express the same fact:
- Code notation: \(a \bmod m = b \bmod m\). Emphasizes the computed remainder.
- Congruence notation: \(a \equiv b \pmod{m}\). Emphasizes the relationship between \(a\) and \(b\).
Use % in code and \(\equiv\) in proofs. Congruence notation is cleaner for reasoning because it treats mod as a relation, not an operation. When you write \(a \equiv b \pmod{m}\), you're saying "\(a\) and \(b\) behave identically in the modular world." The four rules above become:
If \(a \equiv a' \pmod{m}\) and \(b \equiv b' \pmod{m}\), then:
- \(a + b \equiv a' + b' \pmod{m}\)
- \(a - b \equiv a' - b' \pmod{m}\)
- \(a \times b \equiv a' \times b' \pmod{m}\)
These are the compatibility properties of congruence. They say: you can substitute any congruent value and the result stays congruent. This is what makes modular arithmetic a consistent algebraic system, not just a computational trick.
Modular Cycles: Addition vs Multiplication
What happens when you keep adding the same number modulo \(m\)? Try \(a = 2\), \(m = 6\):
It cycles through \(\{0, 2, 4\}\) and never hits \(1, 3, 5\). The cycle size is \(m / \gcd(a, m) = 6 / 2 = 3\).
What if gcd(a, m) = 1 — does the additive cycle hit every number?
Yes. Try \(a = 2\), \(m = 5\): the sequence is \(2, 4, 1, 3, 0, 2, \ldots\) — all 5 values appear. When \(\gcd(a, m) = 1\), the "wrap-around shift" after each lap ensures every position gets visited before the cycle repeats. When \(\gcd(a, m) = g > 1\), you're locked to multiples of \(g\) — you can never escape that sub-grid.
Now try multiplication: powers of \(2\) modulo \(9\):
The cycle is \(\{2, 4, 8, 7, 5, 1\}\) — size \(6\). It never hits \(0, 3, 6\) (multiples of \(3\)).
The additive cycle of 2 mod 9 has size 9 (hits everything), but the multiplicative cycle has size 6. Why the difference?
Multiplication can only reach numbers coprime to \(m\). For \(m = 9\), the numbers coprime to \(9\) are \(\{1, 2, 4, 5, 7, 8\}\) — exactly \(\phi(9) = 6\) of them. You can never multiply your way into a multiple of \(3\) starting from \(2\). The multiplicative cycle size always divides \(\phi(m)\), not \(m\). This is why Euler's Totient Theorem uses \(\phi(m)\) as the exponent, not \(m - 1\).
This distinction — additive cycles governed by \(m / \gcd(a, m)\), multiplicative cycles governed by \(\phi(m)\) — is the foundation for everything in Ch4 L3 (Fermat) and Ch4 L5 (Euler's totient).
BigInt String Arithmetic
This section connects to a different flavor of modular thinking — one that shows up in FAANG interviews rather than competitive programming.
Problems like "Multiply Strings" (LeetCode 43) and "Add Two Numbers" (LeetCode 2) ask you to do arithmetic on numbers represented as strings or linked lists. No built-in big integer support allowed.
The core mechanics are modular arithmetic in disguise. Each digit position works mod 10. When you multiply two digits and get \(42\), the digit you write is \(42 \bmod 10 = 2\), and the carry is \(\lfloor 42 / 10 \rfloor = 4\). The carry propagates to the next position.
Trace it on \(123 \times 456\). Focus on position \((i=2, j=2)\), the ones digits: \(3 \times 6 = 18\). Write \(18 \bmod 10 = 8\), carry \(\lfloor 18/10 \rfloor = 1\). Position \((i=2, j=1)\): \(3 \times 5 = 15\), plus the carry from earlier positions. Each step is the same: compute, take mod 10, carry the quotient.
This is the physical, interview-friendly version of modular arithmetic. The modulus is \(10\), the carry is the quotient, and you process positions right to left. The algorithm is \(O(n \times m)\) where \(n\) and \(m\) are the lengths of the two strings — grade-school multiplication, but implemented precisely.
The Code
Modular Helpers
long long modAdd(long long first, long long second, long long mod) {
return ((first % mod) + (second % mod)) % mod;
}
long long modSub(long long first, long long second, long long mod) {
return ((first % mod) - (second % mod) + mod) % mod;
}
long long modMul(long long first, long long second, long long mod) {
return ((first % mod) * (second % mod)) % mod;
}
\(O(1)\) time, \(O(1)\) space for all three.
The modSub function includes the +mod to prevent the negative mod trap. The modMul function assumes \((m-1)^2\) fits in long long — true for \(m \leq 3 \times 10^9\).
String Multiplication
string multiplyStrings(string& first, string& second) {
int lengthFirst = first.size(), lengthSecond = second.size();
vector<int> product(lengthFirst + lengthSecond, 0);
for (int i = lengthFirst - 1; i >= 0; i--) {
for (int j = lengthSecond - 1; j >= 0; j--) {
int digitProduct = (first[i] - '0') * (second[j] - '0');
int position = i + j + 1;
int total = digitProduct + product[position];
product[position] = total % 10;
product[position - 1] += total / 10;
}
}
string result;
for (int digit : product) {
if (!(result.empty() && digit == 0)) {
result += to_string(digit);
}
}
return result.empty() ? "0" : result;
}
\(O(n \times m)\) time, \(O(n + m)\) space.
The key line is total % 10 for the current digit and total / 10 for the carry — mod and integer division, the two faces of the same operation. The leading-zero strip at the end handles cases like \(0 \times 12345\).
Common Mistakes
-
Forgetting
+modin subtraction.(a % mod - b % mod) % modgives negative results in C++/Java whena % mod < b % mod. Always addmodbefore the final%. This is the most common source of Wrong Answer in modular arithmetic problems. -
Overflowing before reducing. Writing
a * b * c % modinstead of(a % mod) * (b % mod) % mod * (c % mod) % mod. The intermediate producta * bmay overflow even if the final result fits. Reduce after every multiplication. -
Applying mod to division.
(a / b) % modis not the same as(a % mod) * modInverse(b, mod) % mod. Division does not distribute over mod. You need a modular inverse (Ch4 L3).
Quick Recap
- \(a \equiv b \pmod{m}\) means \(m \mid (a - b)\). Two numbers are congruent when they have the same remainder after dividing by \(m\).
- Addition, subtraction, and multiplication are compatible with mod — reduce operands first, combine, reduce again. Division is not — it requires a modular inverse.
- The negative mod trap: C++
%can return negative values. Fix with((x % m) + m) % m. Build this into your subtraction helper. - \(10^9 + 7\) is prime, fits in 32 bits, and its square fits in 64 bits — making it the standard competitive programming modulus.
- BigInt string arithmetic is mod 10 with carry forwarding — the same modular thinking applied digit by digit.
Problems
| Problem | Link | Difficulty | Focus |
|---|---|---|---|
| CSES — Exponentiation | cses.fi/problemset/task/1095 | Easy | Modular exponentiation warm-up (sets up Ch4 L2) |
| LeetCode 1498 — Number of Subsequences That Satisfy the Given Sum Condition | leetcode.com/problems/number-of-subsequences-that-satisfy-the-given-sum-condition | Medium | Modular counting with large powers of 2 |
| LeetCode 43 — Multiply Strings | leetcode.com/problems/multiply-strings | Medium | BigInt multiplication — carry mechanics as mod 10 |
Next lesson: Binary Exponentiation — computing \(a^n \bmod m\) in \(O(\log n)\) using repeated squaring.