Prüfer Sequences and Cayley’s Formula

How Many Labeled Trees Exist?
You have \(n\) vertices labeled \(1, 2, \ldots, n\). How many distinct trees can you build on them?
For \(n = 2\): one tree (edge between 1 and 2). For \(n = 3\): three trees (star from 1, star from 2, star from 3). For \(n = 4\): you can count them — there are 16.
The pattern: \(1, 1, 3, 16, 125, 1296, \ldots\) which is \(n^{n-2}\).
Cayley's Formula: The number of labeled trees on \(n\) vertices is \(n^{n-2}\).
The cleanest proof is a bijection. You need to show that every labeled tree corresponds to a unique sequence of length \(n - 2\) where each element is from \(\{1, 2, \ldots, n\}\). There are \(n^{n-2}\) such sequences, so there are \(n^{n-2}\) trees.
That sequence is the Prufer sequence.
Encoding: Tree → Prufer Sequence
Algorithm: Given a labeled tree on \(n\) vertices, produce a sequence of length \(n - 2\).
- Find the leaf with the smallest label.
- Record the neighbor of that leaf in the sequence.
- Remove that leaf from the tree.
- Repeat until only 2 vertices remain.
The sequence of recorded neighbors is the Prufer sequence.
Trace: Encoding a 6-Node Tree
Consider this tree:
Edges: (1,2), (1,3), (2,4), (2,5), (3,6).
| Step | Leaves (sorted) | Smallest leaf | Neighbor recorded | Remaining vertices |
|---|---|---|---|---|
| 1 | 4, 5, 6 | 4 | 2 | {1,2,3,5,6} |
| 2 | 5, 6 | 5 | 2 | {1,2,3,6} |
| 3 | 2, 6 | 2 | 1 | {1,3,6} |
| 4 | 6 | 6 | 3 | {1,3} |
We stop with 2 vertices remaining: \(\{1, 3\}\).
Prufer sequence: [2, 2, 1, 3]
Decoding: Prufer Sequence → Tree
Algorithm: Given a sequence of length \(n - 2\) with values from \(\{1, \ldots, n\}\), reconstruct the tree.
- Compute the degree of each vertex: degree(\(v\)) = (number of times \(v\) appears in the sequence) + 1.
- For each element \(s\) in the sequence (left to right):
- Find the smallest-labeled vertex \(v\) with degree 1.
- Add edge \((s, v)\).
- Decrement the degrees of both \(s\) and \(v\).
- After processing the sequence, exactly two vertices have degree 1. Add the edge between them.
Trace: Decoding [2, 2, 1, 3]
\(n = 6\). Sequence = [2, 2, 1, 3].
Initial degrees (count in sequence + 1):
| Vertex | Appearances | Degree |
|---|---|---|
| 1 | 1 | 2 |
| 2 | 2 | 3 |
| 3 | 1 | 2 |
| 4 | 0 | 1 |
| 5 | 0 | 1 |
| 6 | 0 | 1 |
| Step | Sequence element | Smallest vertex with deg 1 | Edge added | Updated degrees |
|---|---|---|---|---|
| 1 | 2 | 4 | (2, 4) | deg[2]=2, deg[4]=0 |
| 2 | 2 | 5 | (2, 5) | deg[2]=1, deg[5]=0 |
| 3 | 1 | 2 | (1, 2) | deg[1]=1, deg[2]=0 |
| 4 | 3 | 6 | (3, 6) | deg[3]=1, deg[6]=0 |
Remaining vertices with degree 1: {1, 3}. Add edge (1, 3).
Reconstructed edges: (2,4), (2,5), (1,2), (3,6), (1,3) — the original tree.
The Degree Property
Here's the structural insight that makes Prufer sequences useful beyond just proving Cayley's formula:
A vertex \(v\) appears in the Prufer sequence exactly \(\deg(v) - 1\) times.
Leaves (degree 1) never appear in the sequence — they get removed. The vertex with the highest degree appears the most. You can verify in our trace: vertex 2 has degree 3 and appears \(3 - 1 = 2\) times. Vertex 1 has degree 2 and appears once. Vertices 4, 5, 6 are leaves and never appear.
This means if you want to count labeled trees where vertex \(v\) has degree \(d_v\), you're counting sequences where \(v\) appears \(d_v - 1\) times. That's a multinomial coefficient:
subject to \(\sum (d_v - 1) = n - 2\), i.e., \(\sum d_v = 2(n-1)\) (which is always true for a tree).
The Code
vector<int> encodePrufer(int numVertices, vector<pair<int,int>>& edges) {
vector<vector<int>> adjacency(numVertices + 1);
vector<int> degree(numVertices + 1, 0);
for (auto [u, v] : edges) {
adjacency[u].push_back(v);
adjacency[v].push_back(u);
degree[u]++;
degree[v]++;
}
vector<bool> removed(numVertices + 1, false);
vector<int> sequence;
for (int step = 0; step < numVertices - 2; step++) {
int leaf = -1;
for (int v = 1; v <= numVertices; v++) {
if (!removed[v] && degree[v] == 1) {
leaf = v;
break;
}
}
removed[leaf] = true;
for (int neighbor : adjacency[leaf]) {
if (!removed[neighbor]) {
sequence.push_back(neighbor);
degree[neighbor]--;
break;
}
}
}
return sequence;
}
vector<pair<int,int>> decodePrufer(int numVertices, vector<int>& sequence) {
vector<int> degree(numVertices + 1, 1);
for (int label : sequence) {
degree[label]++;
}
vector<pair<int,int>> edges;
for (int label : sequence) {
for (int v = 1; v <= numVertices; v++) {
if (degree[v] == 1) {
edges.push_back({label, v});
degree[label]--;
degree[v]--;
break;
}
}
}
int firstRemaining = -1, secondRemaining = -1;
for (int v = 1; v <= numVertices; v++) {
if (degree[v] == 1) {
if (firstRemaining == -1) firstRemaining = v;
else secondRemaining = v;
}
}
edges.push_back({firstRemaining, secondRemaining});
return edges;
}
Complexity: \(O(n^2)\) for both encode and decode as written (scanning for the smallest leaf each step). Can be optimized to \(O(n)\) with a pointer trick, but \(O(n^2)\) is fine for most contest constraints.
Application: Counting Trees with Degree Constraints
Cayley's formula gives the total count \(n^{n-2}\). But what if you need the number of spanning trees of \(K_n\) where a specific vertex has degree exactly \(d\)?
Using the degree property, vertex \(v\) must appear exactly \(d - 1\) times in the Prufer sequence. Choose which \(d - 1\) of the \(n - 2\) positions hold \(v\): that's \(\binom{n-2}{d-1}\). The remaining \(n - d - 1\) positions each hold one of the \(n - 1\) other vertices: \((n-1)^{n-d-1}\).
Total: \(\binom{n-2}{d-1}(n-1)^{n-d-1}\).
You can verify: summing over \(d = 1, \ldots, n-1\) gives \(n^{n-2}\) by the binomial theorem. \((1 + (n-1))^{n-2} = n^{n-2}\).
Try It
Fill in the encodePrufer function in the starter code.
Predict before running: For a path graph \(1 - 2 - 3 - 4 - 5\), what is the Prufer sequence? The leaves are 1 and 5. Step 1 removes 1, records 2. Then leaves are 2 and 5. Step 2 removes 2, records 3. Step 3 removes 5, records 4. Sequence: [2, 3, 4].
Challenge: How many labeled trees on 5 vertices have vertex 1 as a leaf? Use the degree property: vertex 1 appears \(1 - 1 = 0\) times in the Prufer sequence. The 3 positions hold values from \(\{2,3,4,5\}\): that's \(4^3 = 64\) trees.
Edge Cases to Watch For
- \(n = 2\) (empty Prufer sequence): The Prufer sequence has length \(n - 2 = 0\). There's exactly one labeled tree: the single edge \((1, 2)\). The decoding algorithm must handle an empty sequence — if it expects at least one element, it fails or returns no edges.
- \(n = 1\) (Prufer undefined): Cayley's formula gives \(1^{-1} = 1\), but a single vertex has no edges and the Prufer sequence is undefined. Guard this as a special case in both encoding and decoding.
- Decoding with repeated values: A vertex appearing \(k\) times has degree \(k + 1\). The degree array must be initialized to 1 (not 0) — initializing to 0 makes every leaf invisible to the "find smallest leaf" scan, producing a wrong tree.
Problems
| Problem | Link | Difficulty |
|---|---|---|
| CF 156D — Spanning Tree Counting | codeforces.com/contest/156/problem/D | Hard |
| SPOJ PRUFER — Prufer Sequence | spoj.com/problems/PRUFER | Medium |