DSU Core
Two Lines of Code That Answer Connectivity in Constant Time
Graph connectivity seems like it should require traversal. BFS or DFS, right? Wrong. Disjoint Set Union answers "are nodes A and B connected?" with two array lookups and a pointer chase that, in practice, never exceeds 4 steps. No adjacency list needed. No visited array. Just two flat arrays.
The Data Structure

DSU maintains two arrays:
par[i]-- the parent of nodei. Initiallypar[i] = i(every node is its own root).rnk[i]-- an upper bound on the height of nodei's subtree. Initially 0.
Every set is a rooted tree. The root is the set's representative. Two elements are in the same set if and only if they have the same root.
Find: Chase the Root
find(x) follows the parent chain from x upward until it hits a node that is its own parent. That's the root.
This is O(tree height). In the worst case, the tree is a long chain and find is O(n). We need to fix that.
Path Compression: Flatten on the Way Up

After finding the root, set every node on the path to point directly to the root. Next time any of these nodes calls find, it's one hop.
That's it. One line of recursion. The tree collapses into a star shape over time.
💡 Key insight. Path compression doesn't change which set a node belongs to. It only changes the internal tree structure to make future queries faster.
Union by Rank: Keep Trees Short

When merging two sets, attach the shorter tree under the taller one. This prevents the tree from growing unnecessarily tall.
void unite(int a, int b) {
a = find(a); b = find(b);
if (a == b) return;
if (rnk[a] < rnk[b]) swap(a, b);
par[b] = a;
if (rnk[a] == rnk[b]) rnk[a]++;
}
Why rank and not size? Either works. Rank is simpler: it only increments when two trees of the same rank merge. Size-based union is equally valid and sometimes more convenient when you need component sizes.
⚠ Gotcha. Always call
find()on both arguments before comparing or linking. Operating on non-root nodes corrupts the structure.
Complexity: Inverse Ackermann
With both path compression and union by rank, any sequence of m operations on n elements takes O(m * alpha(n)) time, where alpha is the inverse Ackermann function.
alpha(n) <= 4 for any n up to 2^(2^(2^(2^16))). That's a number with about 20,000 digits. For all practical purposes, alpha(n) = constant.
| n | alpha(n) |
|---|---|
| 1 | 0 |
| 2-3 | 1 |
| 4-7 | 2 |
| 8-2047 | 3 |
| 2048 to 2^(2^16) | 4 |
You will never see alpha(n) = 5 in any real problem.
Full Trace
Start with 6 elements {1, 2, 3, 4, 5, 6}. Process these unions in order: unite(1,2), unite(3,4), unite(5,6), unite(1,3), unite(1,5).
| Operation | Action | par[] (after) | rnk[] (after) | Sets |
|---|---|---|---|---|
| init | -- | [1,2,3,4,5,6] | [0,0,0,0,0,0] | {1}{2}{3}{4}{5}{6} |
| unite(1,2) | 2's root under 1, rank tie -> rnk[1]++ | [1,1,3,4,5,6] | [1,0,0,0,0,0] | {1,2}{3}{4}{5}{6} |
| unite(3,4) | 4's root under 3, rank tie -> rnk[3]++ | [1,1,3,3,5,6] | [1,0,1,0,0,0] | {1,2}{3,4}{5}{6} |
| unite(5,6) | 6's root under 5, rank tie -> rnk[5]++ | [1,1,3,3,5,5] | [1,0,1,0,1,0] | {1,2}{3,4}{5,6} |
| unite(1,3) | rnk[1]==rnk[3], attach 3 under 1, rnk[1]++ | [1,1,1,3,5,5] | [2,0,1,0,1,0] | {1,2,3,4}{5,6} |
| unite(1,5) | rnk[1]>rnk[5], attach 5 under 1 | [1,1,1,3,1,5] | [2,0,1,0,1,0] | {1,2,3,4,5,6} |
After unite(1,5), call find(6): 6->5->1. Path compression sets par[5]=1 and par[6]=1. Next find(6) is one hop.
CSES Building Roads
Problem: A country has N cities and M roads. Find the minimum number of new roads needed to connect all cities.
Translation: Count connected components after processing all edges. Answer = components - 1.
#include <bits/stdc++.h>
using namespace std;
int par[200005], rnk[200005];
int find(int x) {
if (par[x] != x)
par[x] = find(par[x]);
return par[x];
}
void unite(int a, int b) {
a = find(a); b = find(b);
if (a == b) return;
if (rnk[a] < rnk[b]) swap(a, b);
par[b] = a;
if (rnk[a] == rnk[b]) rnk[a]++;
}
int main() {
int n, m;
cin >> n >> m;
for (int i = 1; i <= n; i++) {
par[i] = i;
rnk[i] = 0;
}
for (int i = 0; i < m; i++) {
int u, v;
cin >> u >> v;
unite(u, v);
}
// Collect component roots
vector<int> roots;
for (int i = 1; i <= n; i++)
if (find(i) == i) roots.push_back(i);
cout << (int)roots.size() - 1 << "\n";
for (int i = 1; i < (int)roots.size(); i++)
cout << roots[i - 1] << " " << roots[i] << "\n";
}
💡 Key insight. The problem also asks you to print which roads to build. Just chain the component roots together: connect root 1 to root 2, root 2 to root 3, etc.
Try It
The starter code has the DSU skeleton. Fill in find and unite, then count components.
Test case 1:
Test case 2:
Challenge: Rewrite the DSU using union by size instead of rank. Maintain a sz[] array and always attach the smaller component under the larger one. Verify that the results are identical.
Problems
| Problem | Link | Difficulty |
|---|---|---|
| CSES Building Roads | cses.fi/problemset/task/1666 | Easy |
| LC 547 — Number of Provinces | leetcode.com/problems/number-of-provinces | Medium |
| LC 684 — Redundant Connection | leetcode.com/problems/redundant-connection | Medium |