Primality and Factorization
Pillar: Set — "A number IS its set of prime factors. Factorizing a number = reading its prime multiset."
The Setup
Take the number \(360\). You can break it down: \(360 = 2 \times 180 = 2 \times 2 \times 90 = \ldots\) and eventually arrive at \(360 = 2^3 \times 3^2 \times 5^1\). Three copies of \(2\), two copies of \(3\), one copy of \(5\).
That decomposition isn't just a party trick from grade school. It's the identity of \(360\). The multiset \(\{2^3, 3^2, 5^1\}\) and the integer \(360\) are two names for the same thing — and the multiset form is the one that actually tells you something useful. From it you can read off how many divisors \(360\) has, what its GCD with any other number is, and whether it's a perfect square. The integer form hides all of that.
This lesson builds the machinery to go from an integer to its prime multiset: first testing whether a number is prime, then extracting its full factorization.
Every algorithm in this chapter and the next three chapters starts from the same foundation: given \(n\), produce \(\{p_1^{e_1}, p_2^{e_2}, \ldots, p_k^{e_k}\}\). Counting divisors, computing Euler's totient, checking square-freeness — all of these are one-liners once you have the factorization in hand. The factorization is the universal starting point.
What Is a Prime?
Forget the textbook phrasing ("divisible only by 1 and itself"). Think in terms of sets.
A prime \(p\) is an atom. Its prime multiset is \(\{p^1\}\) — a single element, no further decomposition possible. You can't break it into smaller pieces.
A composite number is a molecule. It's built by combining atoms. \(12 = 2^2 \times 3^1\) is a molecule made from atoms \(2\) and \(3\). The factorization tells you the recipe.
The number \(1\) is the empty multiset \(\{\}\). No atoms at all. This is why \(1\) is NOT prime — it's the identity element of multiplication, the "empty product." Calling \(1\) prime would break the uniqueness of factorizations (you could tack on as many factors of \(1\) as you like).
| Number | Atom or Molecule? | Prime Multiset |
|---|---|---|
| \(2\) | Atom | \(\{2^1\}\) |
| \(7\) | Atom | \(\{7^1\}\) |
| \(12\) | Molecule | \(\{2^2, 3^1\}\) |
| \(360\) | Molecule | \(\{2^3, 3^2, 5^1\}\) |
| \(1\) | Neither | \(\{\}\) |
The atom/molecule analogy isn't just a metaphor — it's how number theory actually works. Multiplication of integers corresponds to combining (unioning) their prime multisets. \(12 \times 18 = 216\) becomes \(\{2^2, 3^1\} \cup \{2^1, 3^2\} = \{2^3, 3^3\}\) where "union" means adding exponents. Division is subtraction of exponents. Divisibility (\(a \mid b\)) means \(a\)'s multiset is a sub-multiset of \(b\)'s — every exponent in \(a\) is \(\leq\) the corresponding exponent in \(b\).
Once you see this, questions about divisibility become questions about multiset containment. Much easier to reason about.
Primality Testing: Trial Division in \(O(\sqrt{n})\)
To test whether \(n\) is prime, you need to check if any integer between \(2\) and \(\sqrt{n}\) divides it. If none do, \(n\) is prime. If any one does, \(n\) is composite.
Why does \(\sqrt{n}\) suffice? Suppose \(n = a \times b\) where both \(a > 1\) and \(b > 1\). If \(a > \sqrt{n}\) AND \(b > \sqrt{n}\), then \(a \times b > n\) — contradiction. So at least one factor must be \(\leq \sqrt{n}\). You only need to find that small factor.
Trace it on \(n = 37\). You check \(2, 3, 4, 5, 6\) — and \(6^2 = 36 \leq 37\) so you check \(6\) too. None divide \(37\). Done: \(37\) is prime. You didn't need to check \(7\) through \(36\).
Trace it on \(n = 51\). You check \(2\) — no. Check \(3\) — yes, \(51 = 3 \times 17\). Done: \(51\) is composite. Found on the second try because the smallest factor is always \(\leq \sqrt{n}\).
One subtlety: the loop condition is divisor * divisor <= number, not divisor <= sqrt(number). These are mathematically equivalent, but sqrt() returns a floating-point value and rounding errors can cause it to miss the exact boundary. Integer arithmetic doesn't have that problem.
Another subtlety: you don't need to skip even numbers after \(2\) or skip composite trial divisors. The cost of checking "does \(4\) divide \(n\)?" when all factors of \(2\) are already gone is a single failed modulo — negligible. Optimizing the trial sequence (only checking primes) saves a constant factor but adds code complexity. In contests, the simple loop is the right call.
Factorizing \(n\) Into Primes
The algorithm is greedy: try each potential prime factor starting from \(2\), divide it out completely, then move on.
Step 1. Start with the smallest possible factor, \(2\). While \(n\) is divisible by \(2\), divide and count.
Step 2. Move to \(3\), then \(4\), then \(5\), and so on up to \(\sqrt{n}\). You don't need to skip composites — if \(n\) has already had all factors of \(2\) removed, it won't be divisible by \(4\) or \(6\) anyway. The inner while loop handles multiplicities: it keeps dividing by the current divisor until it no longer goes in evenly. This extracts the full exponent in one pass.
Step 3. After the loop, if the remaining value is greater than \(1\), it's the last prime factor — every smaller prime has already been divided out, and the loop's exit condition \(d^2 > n\) rules out a composite leftover. (The reason trial division up to \(\sqrt{n}\) suffices: any number has at most one prime factor strictly greater than \(\sqrt{n}\), since two such primes would multiply to more than \(n\).)
Worked Example: Factorize \(360\)
Trace every step. This is the exact sequence the algorithm executes.
Divisor = 2. Is \(360 \div 2 = 180\)? Yes. Divide: \(n = 180\), count \(= 1\). Is \(180 \div 2 = 90\)? Yes. Divide: \(n = 90\), count \(= 2\). Is \(90 \div 2 = 45\)? Yes. Divide: \(n = 45\), count \(= 3\). Is \(45 \div 2\) exact? No. Stop. Record \(2^3\).
Divisor = 3. Is \(45 \div 3 = 15\)? Yes. Divide: \(n = 15\), count \(= 1\). Is \(15 \div 3 = 5\)? Yes. Divide: \(n = 5\), count \(= 2\). Is \(5 \div 3\) exact? No. Stop. Record \(3^2\).
Divisor = 4. Check loop condition: \(4^2 = 16 > 5\). Loop ends.
Remaining value: \(n = 5 > 1\), so \(5\) is the leftover prime factor. Record \(5^1\).
Summary in table form:
| Divisor | \(n\) before | Divides? | Action | \(n\) after | Factors found |
|---|---|---|---|---|---|
| \(2\) | \(360\) | Yes | \(360 \to 180 \to 90 \to 45\) | \(45\) | \(\{2^3\}\) |
| \(3\) | \(45\) | Yes | \(45 \to 15 \to 5\) | \(5\) | \(\{2^3, 3^2\}\) |
| \(4\) | \(5\) | No (\(4^2 = 16 > 5\)) | Loop ends | \(5\) | \(\{2^3, 3^2\}\) |
| — | \(5\) | Remainder \(> 1\) | Add to factors | \(1\) | \(\{2^3, 3^2, 5^1\}\) |
Final result:
The multiset \(\{2^3, 3^2, 5^1\}\) — three atoms of \(2\), two atoms of \(3\), one atom of \(5\).
Notice that divisor \(4\) was never used. By the time the loop reached \(4\), all factors of \(2\) (and therefore \(4\)) were already gone. The outer for loop doesn't waste real work on composite trial divisors — it just fails the inner while condition immediately and moves on.
Also notice the loop exited at divisor \(4\) because \(4^2 = 16 > 5\). At that point, the algorithm knows the remaining \(n = 5\) must be prime — if it had any factor besides \(1\) and itself, that factor would be \(\leq \sqrt{5} \approx 2.2\), but we already divided out all factors of \(2\) and \(3\). So \(5\) goes directly into the result.
Why the Algorithm Is Correct
Two properties guarantee correctness:
1. Every factor found is prime. When the loop reaches divisor \(d\) and finds that \(d \mid n\), all primes smaller than \(d\) have already been fully divided out. So \(d\) can't be composite — if \(d = p \times q\) with \(p < d\), then \(p\) was already removed from \(n\), meaning \(n\) is no longer divisible by \(p\), and therefore not by \(d\). Contradiction.
2. No factor is missed. The inner while loop extracts the full exponent of each prime. The outer loop runs every potential divisor up to the point where \(d^2\) exceeds the current value of \(n\). Anything left after the loop must itself be prime — every smaller prime has been divided out, and any composite leftover would have a factor \(\leq \sqrt{\text{leftover}} < d\), which the loop already tried. The final if (number > 1) check catches it.
The Fundamental Theorem of Arithmetic
Every integer greater than \(1\) has a unique prime factorization (up to the order of the factors).
This isn't just a nice fact — it's the reason the multiset representation works at all. If factorizations weren't unique, the same number could correspond to multiple different multisets, and the whole framework collapses. But factorizations ARE unique, so there's a perfect bijection:
\(360\) maps to \(\{2^3, 3^2, 5^1\}\) and nothing else does. \(\{2^3, 3^2, 5^1\}\) maps to \(360\) and nothing else. This is what makes it safe to reason about numbers through their multisets — you never lose information going back and forth.
The uniqueness also explains why \(1\) must not be prime. If \(1\) were prime, then \(6 = 2 \times 3 = 1 \times 2 \times 3 = 1 \times 1 \times 2 \times 3 = \ldots\) would have infinitely many "different" factorizations. Excluding \(1\) from the primes keeps factorizations unique.
You might think uniqueness is obvious — "just keep dividing, you'll get the same primes every time." And for INTEGERS, that's true. But the theorem is deeper than it looks. In some number systems (like \(\mathbb{Z}[\sqrt{-5}]\)), unique factorization fails: \(6 = 2 \times 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})\) are two genuinely different factorizations into irreducibles. The integers are special. The Fundamental Theorem is a property of \(\mathbb{Z}\), not a tautology. For competitive programming, you don't need the abstract algebra — but you should appreciate that the factorize function returning a unique answer is a theorem, not an accident.
Practical Bounds
For contest problems, knowing the limits of trial division saves you from reaching for heavier tools when you don't need them.
Time. For \(n \leq 10^{12}\), the square root is \(\sqrt{n} \leq 10^6\). A loop running \(10^6\) iterations finishes in a few milliseconds. Trial division handles single-number queries comfortably at this scale.
Space. The number of distinct prime factors of \(n\) is at most \(\log_2(n)\). The smallest prime is \(2\), so \(k\) distinct prime factors means \(n \geq 2^k\), giving \(k \leq \log_2(n)\). For \(n \leq 10^{12}\), that's at most about \(40\) distinct primes. In practice, most numbers have far fewer — the number with the most distinct prime factors below \(10^{12}\) is \(2 \times 3 \times 5 \times 7 \times 11 \times 13 \times 17 \times 19 \times 23 \times 29 \times 31 \times 37 = 7420738134810\) which already exceeds \(10^{12}\), so you get at most \(11\) distinct primes. The map you store the factorization in will be tiny.
Reading constraints. When a problem gives you \(n \leq 10^6\) and asks you to factorize a single number, trial division is overkill-fast. When it gives you \(n \leq 10^{12}\), trial division still works — \(10^6\) iterations is nothing. When it gives you \(Q\) queries each up to \(10^6\), and \(Q\) can be \(10^5\), you want a sieve so each factorization takes \(O(\log n)\) instead of \(O(\sqrt{n})\). That's the next lesson.
When trial division ISN'T enough. If you need to factorize many numbers (say, every integer from \(1\) to \(10^6\)), trial division per number costs \(O(N\sqrt{N})\) total, which for \(N = 10^6\) is around \(10^9\) — borderline TLE. A Smallest Prime Factor sieve precomputes a lookup table in \(O(N \log \log N)\), then factorizes any single \(n \leq N\) in \(O(\log n)\) by repeatedly dividing by the stored smallest factor. Total cost for all \(N\) numbers: \(O(N \log N)\). That's the next lesson.
Rule of thumb for contests: if the problem gives you one number to factorize, use trial division. If it gives you many numbers (or a range), reach for a sieve.
Callback: From Factorization to GCD and LCM
In Ch1 L2 you saw that \(\gcd\) takes the minimum exponent per prime and \(\text{lcm}\) takes the maximum. Now you have the tool to produce those exponents from scratch.
Given \(a = 360 = 2^3 \times 3^2 \times 5^1\) and \(b = 150 = 2^1 \times 3^1 \times 5^2\):
The factorize function below is the engine that makes this possible. Feed it any number, get the multiset, then apply min/max per prime.
You could also compute the number of divisors directly from the factorization. If \(n = 2^3 \times 3^2 \times 5^1\), the divisor count is \((3+1)(2+1)(1+1) = 24\). Each exponent can range from \(0\) to its maximum, and the choices are independent. The prime multiset gives you everything — GCD, LCM, divisor count, divisor sum — all in one pass.
Check whether a number is a perfect square? Every exponent in its factorization must be even. Check whether \(a\) divides \(b\)? Every exponent in \(a\)'s factorization must be \(\leq\) the corresponding exponent in \(b\)'s. These become one-line checks once you have the multisets.
The Code
Primality Test
bool isPrime(long long number) {
if (number < 2) return false;
for (long long divisor = 2; divisor * divisor <= number; divisor++) {
if (number % divisor == 0) return false;
}
return true;
}
\(O(\sqrt{n})\) time, \(O(1)\) space.
The guard number < 2 rejects \(0\), \(1\), and negative inputs. The loop variable is long long so that divisor * divisor doesn't overflow when number is large. The function returns at the first divisor found — no need to keep looking once you know the number is composite.
Prime Factorization
map<long long, int> factorize(long long number) {
map<long long, int> primeExponents;
for (long long divisor = 2; divisor * divisor <= number; divisor++) {
while (number % divisor == 0) {
primeExponents[divisor]++;
number /= divisor;
}
}
if (number > 1) primeExponents[number]++;
return primeExponents;
}
\(O(\sqrt{n})\) time, \(O(\log n)\) space (at most \(\log_2(n)\) distinct primes in the map).
Both functions use long long for the loop variable. This isn't paranoia — if number can be up to \(10^{12}\) and you use int divisor, then divisor * divisor overflows int when divisor reaches around \(46341\). The program silently computes garbage and either loops forever or returns wrong answers. Making the loop variable long long from the start eliminates the issue entirely.
Common Mistakes
-
Forgetting the remaining factor after the loop. If you skip the
if (number > 1)check,factorize(14)returns \(\{2^1\}\) instead of \(\{2^1, 7^1\}\). The \(7\) gets silently dropped because \(7 > \sqrt{14}\) so the loop never reaches it. Always check whether anything remains. -
Integer overflow in the loop condition. Writing
(int)divisor * divisor <= numberwhennumberis along longnear \(10^{12}\) meansdivisor * divisoroverflowsintaround \(46341^2\). Use(long long)divisor * divisoror make the loop variablelong longfrom the start. -
Treating \(1\) as prime. The number \(1\) has no prime factors — it's the empty multiset. The
if (number < 2) return falseguard inisPrimehandles this, andfactorize(1)correctly returns an empty map. But if you build your own logic without these guards, \(1\) will sneak through and corrupt your results. -
Not handling \(n = 0\) or negative inputs.
factorize(0)should not be called — \(0\) has no prime factorization. If your function doesn't guard against it, thewhile (number % divisor == 0)loop divides \(0\) by \(2\) forever (since \(0 \div 2 = 0\) and \(0 \bmod 2 = 0\)). Add a guard or document the precondition. -
Using
sqrt(n)instead ofdivisor * divisor.sqrt(49)might return \(6.9999999\) due to floating-point precision, causing your loop to stop before checking \(7\). The integer conditiondivisor * divisor <= numberis exact and avoids this trap entirely. -
Assuming \(n\) has at most 2 prime factors. Some beginners write factorization code that finds one factor and assumes the remainder is prime. That works for semiprimes like \(15 = 3 \times 5\), but fails for \(12 = 2^2 \times 3\). The correct approach: loop over ALL potential factors with the inner
while, not just find the first one.
Quick Recap
- A prime is an atom — its multiset is \(\{p^1\}\). Composites are molecules built from atoms. The number \(1\) is the empty multiset, not a prime.
- Trial division checks divisors from \(2\) up to \(\sqrt{n}\). If none divide \(n\), it's prime. \(O(\sqrt{n})\) time, \(O(1)\) space.
- Factorization divides out each prime greedily. After the loop, if the remaining value exceeds \(1\), it's a prime factor \(> \sqrt{n}\) — and there can be at most one.
- The Fundamental Theorem of Arithmetic guarantees unique factorization. This is what makes the bijection between integers and prime multisets valid.
- For \(n \leq 10^{12}\), trial division runs in under \(10^6\) iterations — fast enough for single queries. For batch factorization, use a sieve (next lesson).
Problems
| Problem | Link | Difficulty | Focus |
|---|---|---|---|
| LC 204 — Count Primes | leetcode.com/problems/count-primes | Medium | Sets the stage for sieve (next lesson) |
| LC 2507 — Smallest Value After Replacing With Sum of Prime Factors | leetcode.com/problems/smallest-value-after-replacing-with-sum-of-prime-factors | Medium | Repeated factorization until fixed point |
| CSES — Counting Divisors | cses.fi/problemset/task/1713 | Easy | Use factorization to count divisors via \((e_1+1)(e_2+1)\cdots\) |
Next lesson: The Sieve of Eratosthenes — marking all primes up to \(N\) in near-linear time, and the Smallest Prime Factor sieve for \(O(\log n)\) factorization.