GCD and LCM as Set Operations
Pillar: Set — "GCD = intersection of prime multisets (take min exponents). LCM = union (take max exponents). This makes GCD and LCM dual operations."
The Setup
You have two numbers: \(12\) and \(18\). What's the largest number that divides both of them? You could list out divisors of each, find the overlap, and pick the biggest. That works, but it tells you nothing about why the answer is what it is.
Write those numbers as prime multisets instead. \(12 = 2^2 \times 3^1\). \(18 = 2^1 \times 3^2\). Now the question becomes: which prime factors do they share, and how many copies of each? That's a set intersection — and the answer drops out mechanically.
This lesson builds the full picture: GCD and LCM as set operations on prime factorizations, the identity that links them, and then the Euclidean algorithm — the \(O(\log n)\) machine that computes GCD without ever factoring anything.
GCD via Prime Factorization
In Ch2 L1 you saw that every integer has a unique prime multiset. Two numbers share a common divisor if and only if their multisets overlap. The greatest common divisor is the largest overlap — for each prime, take the minimum exponent from the two factorizations.
Trace it on \(12\) and \(18\):
For prime \(2\): \(\min(2, 1) = 1\). For prime \(3\): \(\min(1, 2) = 1\).
The GCD is \(6\). It's the intersection of the two prime multisets — the primes and exponents they have in common.
Think about why min works. A common divisor of \(12\) and \(18\) must divide both. For the prime \(2\), \(12\) has two copies and \(18\) has one. A common divisor can use at most one copy of \(2\) — the bottleneck. That bottleneck is the minimum. Same logic for every prime.
LCM via Prime Factorization
The least common multiple goes the other way: for each prime, take the maximum exponent.
The LCM is \(36\). It's the union of the two prime multisets — for each prime, keep as many copies as either number needs. A common multiple must be divisible by both \(12\) and \(18\). For the prime \(2\), \(12\) demands two copies and \(18\) demands one. You need at least two to satisfy \(12\) — the maximum.
GCD is intersection (min). LCM is union (max). Dual operations on the same multisets.
The Key Identity
Here's a fact that connects GCD and LCM without needing factorizations at all:
The proof is one line. For each prime \(p\) with exponents \(\alpha\) in \(a\) and \(\beta\) in \(b\):
This holds for every pair of non-negative integers. The exponent of \(p\) in \(\gcd \times \text{lcm}\) equals \(\min + \max = \alpha + \beta\), which is the exponent of \(p\) in \(a \times b\). True for every prime, so the products are equal.
Verify: \(\gcd(12, 18) \times \text{lcm}(12, 18) = 6 \times 36 = 216 = 12 \times 18\). Done.
This identity is more than a curiosity. It means you can compute LCM from GCD:
And since GCD is cheap to compute (you're about to see how), LCM is also cheap — no factorization needed.
The Euclidean Algorithm
The factorization approach to GCD is elegant for understanding but expensive for computing. Factoring a number takes \(O(\sqrt{n})\) time. The Euclidean algorithm computes GCD in \(O(\log \min(a, b))\) — exponentially faster — without touching prime factors at all.
The core recurrence:
Repeat until the second argument hits \(0\). The first argument at that point is the GCD.
Why It Works
Any common divisor of \(a\) and \(b\) also divides \(a \bmod b\). Here's why. Write \(a = q \times b + r\) where \(r = a \bmod b\). Then \(r = a - q \times b\). If \(d\) divides both \(a\) and \(b\), then \(d\) divides \(a - q \times b = r\). So \(d\) divides \(r\) too.
The argument runs in reverse as well: any common divisor of \(b\) and \(r\) also divides \(a = q \times b + r\). So the set of common divisors of \((a, b)\) is exactly the set of common divisors of \((b, r)\). Same set of divisors means the same greatest common divisor.
Trace Table: \(\gcd(252, 105)\)
Follow each step. At each iteration, replace \((a, b)\) with \((b, a \bmod b)\).
| Step | \(a\) | \(b\) | \(a \bmod b\) | Action |
|---|---|---|---|---|
| 1 | \(252\) | \(105\) | \(252 - 2 \times 105 = 42\) | Replace with \((105, 42)\) |
| 2 | \(105\) | \(42\) | \(105 - 2 \times 42 = 21\) | Replace with \((42, 21)\) |
| 3 | \(42\) | \(21\) | \(42 - 2 \times 21 = 0\) | Replace with \((21, 0)\) |
| 4 | \(21\) | \(0\) | — | \(b = 0\), stop |
Answer: \(\gcd(252, 105) = 21\). Three steps, three modulo operations. Verify: \(252 = 12 \times 21\), \(105 = 5 \times 21\), and \(\gcd(12, 5) = 1\). Correct.
Why \(O(\log \min(a, b))\)?
The Euclidean algorithm converges fast because the remainder shrinks aggressively. The key observation: after two consecutive steps, the larger number drops to less than half its original value.
Here's the argument. After one step, \((a, b) \to (b, r)\) where \(r = a \bmod b < b\). After the next step, \((b, r) \to (r, b \bmod r)\). Now consider two cases:
- If \(r \leq b/2\), the second argument already halved in one step.
- If \(r > b/2\), then \(b \bmod r = b - r < b/2\), so it halves in the next step.
Either way, every two steps reduce the smaller argument by at least half. Since a number can be halved at most \(\log_2(n)\) times before reaching \(0\), the algorithm terminates in at most \(2 \log_2(\min(a, b))\) steps.
The worst case is consecutive Fibonacci numbers. \(\gcd(F_{n+1}, F_n)\) takes exactly \(n - 1\) steps, and these are the slowest-converging inputs for their size. Since \(F_n \approx \phi^n / \sqrt{5}\) (where \(\phi = (1 + \sqrt{5})/2\)), the number of steps is proportional to \(\log_\phi(\min(a, b)) \approx 1.44 \log_2(\min(a, b))\).
Base Case: \(\gcd(a, 0) = a\)
When \(b = 0\), the GCD is \(a\). Every integer divides \(0\) (since \(0 = 0 \times d\) for any \(d\)), so the set of common divisors of \(a\) and \(0\) is just the set of divisors of \(a\). The greatest one is \(a\) itself.
This isn't just a boundary condition for the algorithm — it's the identity element of the GCD operation. In Ch1 L2 you saw that \((\mathbb{Z}_{\geq 0}, \gcd)\) forms a monoid. The identity element of that monoid is \(0\): \(\gcd(a, 0) = a\) for all \(a\). Zero is to GCD what \(1\) is to multiplication or \(0\) is to addition.
LCM Without Factoring
The identity \(\gcd(a,b) \times \text{lcm}(a,b) = a \times b\) gives you LCM for free once you have GCD:
One critical detail: divide first, then multiply. Writing a * b / gcd(a, b) risks overflow even when the final answer fits in a long long. If \(a\) and \(b\) are both near \(10^9\), their product exceeds \(10^{18}\) and overflows. But a / gcd(a, b) is at most \(a\), so multiplying that by \(b\) stays within bounds (assuming the LCM itself fits).
This is the standard LCM computation in competitive programming. You never factorize — just call GCD and rearrange.
What GCD Physically Means: Cycles and Coverage
So far, GCD is "min exponent on multisets." That's correct but abstract. Here's the physical interpretation that makes GCD problems solvable on sight.
Stepping on a Circular Track
Imagine a circular track with \(m\) positions numbered \(0\) to \(m - 1\). You start at position \(0\) and take steps of size \(a\). Which positions do you land on?
Try \(a = 2\), \(m = 6\): you visit \(0, 2, 4, 0, 2, 4, \ldots\) — only 3 positions. You're stuck on multiples of \(\gcd(2, 6) = 2\).
Try \(a = 2\), \(m = 5\): you visit \(0, 2, 4, 1, 3, 0, \ldots\) — all 5 positions. Since \(\gcd(2, 5) = 1\), the wrap-around "shifts" your landing point each lap until every position is hit.
The rule: stepping by \(a\) on a circle of \(m\) visits exactly \(m / \gcd(a, m)\) distinct positions. You visit ALL positions if and only if \(\gcd(a, m) = 1\).
Why can't you escape multiples of gcd(a, m)?
Every position you land on has the form \(k \cdot a \bmod m\) for some integer \(k\). By Bézout's identity, the set \(\{k \cdot a \bmod m : k \in \mathbb{Z}\}\) equals exactly the set of multiples of \(\gcd(a, m)\) modulo \(m\). If \(\gcd(a, m) = g\), then \(k \cdot a\) is always a multiple of \(g\), so you can never land on a position that isn't a multiple of \(g\). There are exactly \(m / g\) such positions.
The Grid Coverage Problem
This interpretation extends to 2D and demonstrates exactly how the pieces fit together.
Imagine a grid of \(n\) rows and \(m\) columns that wraps around (a torus). You start at \((0, 0)\) and alternate: jump \(a\) rows down, then \(b\) columns right, then \(a\) rows down, then \(b\) columns right, ...
Can you visit every cell? The answer decomposes into three conditions.
Condition 1 — Validate each axis. By the circular track rule:
- You reach all \(n\) rows \(\iff\) \(\gcd(a, n) = 1\)
- You reach all \(m\) columns \(\iff\) \(\gcd(b, m) = 1\)
If either fails, output NO immediately — some rows or columns are permanently unreachable.
Condition 2 — Combine the generators. If both GCDs are \(1\), then \(a\) is a generator of the row cycle and \(b\) is a generator of the column cycle. Since generators visit every position, stepping by \(a\) on \(n\) rows is equivalent to stepping by \(1\). Same for columns. This means the problem is isomorphic to stepping by \(1\) down and \(1\) right on the same \(n \times m\) grid.
Now look at what happens. Every complete pair of moves (down then right) advances by \((+1, +1)\) on the remapped grid. Starting at \((0, 0)\), the cells visited after \(k\) complete pairs form:
How many distinct cells does this diagonal sequence hit?
The diagonal \((k \bmod n, \; k \bmod m)\) repeats when \(k\) is simultaneously \(\equiv 0 \pmod{n}\) and \(\equiv 0 \pmod{m}\) — that is, when \(k = \text{lcm}(n, m)\). So the main diagonal visits exactly \(\text{lcm}(n, m)\) distinct cells.
Condition 3 — Account for alternating moves. Because moves alternate (down, right, down, right, ...), we don't just visit the diagonal \((k, k)\). We also visit intermediate cells halfway through each pair:
- If starting with "down": intermediates are \((k+1, k)\)
- If starting with "right": intermediates are \((k, k+1)\)
These intermediates form a second parallel cycle, also of length \(\text{lcm}(n, m)\). Prakul can choose which starting move to use, so the maximum number of distinct cells visited is:
For full coverage, this must reach all \(n \times m\) cells:
Substituting \(\text{lcm}(n, m) = \frac{n \times m}{\gcd(n, m)}\):
The full algorithm: output YES if and only if all three hold:
- \(\gcd(a, n) = 1\)
- \(\gcd(b, m) = 1\)
- \(\gcd(n, m) \leq 2\)
Three GCD calls. \(O(\log n)\) total.
Why does gcd(n, m) > 2 break coverage even when each axis is fully reachable?
When \(\gcd(n, m) = g > 2\), the diagonal \((k \bmod n, k \bmod m)\) visits only \(\text{lcm}(n, m) = nm/g\) cells. The intermediate cycle adds at most another \(nm/g\). Total: \(2nm/g < nm\) when \(g > 2\). The two cycles "sync up" — they keep revisiting the same subset of cells because the row and column periods share a large common factor. With \(g = 2\) you get exactly \(2 \times nm/2 = nm\) — barely enough. With \(g = 1\) you get \(2nm > nm\) — more than enough (the extra visits are revisits).
LCM as Synchronization
If GCD tells you the cycle size, LCM tells you when two cycles align.
Two events repeat with periods \(a\) and \(b\). When do they first coincide? At time \(\text{lcm}(a, b)\).
- Traffic lights at an intersection: one is green every \(30\) seconds, the other every \(45\) seconds. They're both green at the same time every \(\text{lcm}(30, 45) = 90\) seconds.
- Two gears with \(12\) and \(18\) teeth. After how many rotations of the small gear do both gears return to their starting position? After \(\text{lcm}(12, 18) / 12 = 3\) rotations of the small gear (and \(2\) of the large one).
In competitive programming, "when does a combined state repeat?" is almost always an LCM question.
Common Trap: \(\gcd(0, 0)\)
What's \(\gcd(0, 0)\)? Mathematically, every positive integer divides \(0\), so the set of common divisors of \(0\) and \(0\) is all positive integers — there is no greatest element. The GCD is undefined.
Most implementations return \(0\). C++ __gcd(0, 0) returns \(0\). Python math.gcd(0, 0) returns \(0\). This is a convention, not a mathematical truth. It's chosen because \(0\) is the identity element of GCD, so returning \(0\) makes the fold/reduce pattern work cleanly (see next section).
Know what your library does. If your problem can produce \(\gcd(0, 0)\), guard against it explicitly.
GCD of Multiple Numbers
GCD is associative: \(\gcd(\gcd(a, b), c) = \gcd(a, \gcd(b, c))\). It's also commutative: \(\gcd(a, b) = \gcd(b, a)\). And \(0\) is the identity. This means \((\mathbb{Z}_{\geq 0}, \gcd)\) is a commutative monoid — you can fold/reduce GCD across any collection of numbers in any order.
Start with the identity \(0\), then fold in each element. The order doesn't matter. Same pattern works for LCM (identity is \(1\), since \(\text{lcm}(a, 1) = a\)).
Trace it on \(\{24, 36, 60\}\):
- Start: \(\text{totalGCD} = 0\)
- Fold in \(24\): \(\gcd(0, 24) = 24\)
- Fold in \(36\): \(\gcd(24, 36) = 12\)
- Fold in \(60\): \(\gcd(12, 60) = 12\)
Answer: \(12\). Three Euclidean GCD calls, total time \(O(n \log(\max))\) where \(n\) is the array length.
The Code
Euclidean GCD (Iterative, Handles Negatives)
long long euclideanGCD(long long first, long long second) {
first = abs(first);
second = abs(second);
while (second != 0) {
long long remainder = first % second;
first = second;
second = remainder;
}
return first;
}
\(O(\log \min(a, b))\) time, \(O(1)\) space.
The abs() calls handle negative inputs — \(\gcd(-12, 18) = \gcd(12, 18) = 6\). GCD is defined on magnitudes. Without the guard, % on negative numbers gives negative remainders in C++, and the loop may produce wrong results or not terminate.
LCM with Overflow Prevention
long long leastCommonMultiple(long long first, long long second) {
if (first == 0 || second == 0) return 0;
return abs(first) / euclideanGCD(first, second) * abs(second);
}
\(O(\log \min(a, b))\) time, \(O(1)\) space.
The zero check avoids division by zero (since \(\gcd(0, 0) = 0\)). The division happens before the multiplication to prevent overflow. This is the one line where order of operations matters for correctness at scale.
GCD of Array via Fold
long long gcdOfArray(vector<long long>& numbers) {
long long totalGCD = 0;
for (long long value : numbers) {
totalGCD = euclideanGCD(totalGCD, value);
}
return totalGCD;
}
\(O(n \log(\max))\) time, \(O(1)\) space.
Starting from \(0\) (the identity) means this correctly handles empty arrays (returns \(0\)), single-element arrays, and arrays containing zeros. The monoid structure does all the work — no special cases needed.
Common Mistakes
-
Computing LCM as
a * b / gcd(a, b)instead ofa / gcd(a, b) * b. The multiplicationa * boverflowslong longwhen both values exceed about \(3 \times 10^9\). Always divide first. This is the single most common GCD/LCM bug in contests. -
Forgetting to handle negative inputs. The C++
%operator preserves the sign of the dividend: \((-7) \% 3 = -1\) in C++. If you pass negative numbers into the Euclidean algorithm without taking absolute values, the remainder alternates sign, and the loop may produce a negative "GCD" or behave unpredictably. -
Assuming \(\gcd(0, 0)\) is \(1\) or undefined at runtime. If your fold starts with \(\text{totalGCD} = 1\) instead of \(0\), then \(\gcd(1, n) = 1\) for all \(n\), and your result is always \(1\). The identity element is \(0\), not \(1\).
Quick Recap
- GCD = intersection of prime multisets (min exponents). LCM = union (max exponents). Dual operations.
- Key identity: \(\gcd(a, b) \times \text{lcm}(a, b) = a \times b\), because \(\min + \max = \text{sum}\) for each exponent.
- Euclidean algorithm: \(\gcd(a, b) = \gcd(b, a \bmod b)\), terminates in \(O(\log \min(a, b))\) steps. The proof hinges on every common divisor of \((a, b)\) also dividing \(a \bmod b\).
- LCM from GCD: divide first, multiply second —
a / gcd(a, b) * b— to avoid overflow. - GCD is a commutative monoid with identity \(0\). Fold/reduce across arrays with no special cases.
Problems
| Problem | Link | Difficulty | Focus |
|---|---|---|---|
| LeetCode 1979 — Find GCD of Array | leetcode.com/problems/find-greatest-common-divisor-of-array | Easy | GCD of array via fold |
| LeetCode 2413 — Smallest Even Multiple | leetcode.com/problems/smallest-even-multiple | Easy | LCM with a fixed value |
| CSES — Common Divisors | cses.fi/problemset/task/1081 | Medium | GCD across all pairs — requires thinking beyond pairwise Euclidean |
Next lesson: GCD Variations — subarray GCD properties, the \(O(\log \max)\) distinct-values bound, and prefix GCD techniques.