Tree Algorithms Deep Dive: Rerooting, Edge Contribution & Groups
In tree algorithmic problems, efficient traversal and state management are key. This post covers powerful techniques: the Running Pointer (Rerooting) for node-specific answers, the Edge Contribution method for global aggregation, and an advanced application for Interaction Costs in Groups.
1. Running Pointer & Rerooting Technique
The "Running Pointer" on trees is distinct from the classic array-based two-pointer approach. Instead of sliding a window, it focuses on fixing a node (rooting the tree) to calculate an answer, and then efficiently "moving" that root to adjacent nodes to update the answer dynamically.
💡 Pro Tip: The "Linear Tree" Visualization
Before tackling the full tree structure, it is often helpful to map your logic onto a linear path (a line graph).
If your recursion logic (e.g.,
total - subtree) works on a simple line \(1-2-3-4\), it will almost certainly work on a general tree. This simplifies the visualization of data flowing from "child" to "parent".
Phase 1: Initial Aggregation (DFS1)
Traverse the tree (usually Post-Order) to collect data from children to parents.
- Foundation - Counting Pairs: The simplest form of this aggregation is calculating the size of a subtree to determine how many pairs of nodes exist (\(NC2\)). The logic
total - subtree_sizeis the core building block for more complex problems. - Sum of Distances: We calculate:
sz[u]: Size of the subtree at \(u\).dis[u]: Sum of distances from \(u\) to all nodes in its subtree.- Transition:
dis[u] += dis[c] + sz[c](The distance increases by 1 for every node in the child's subtree).
Phase 2: Rerooting & Rollback (DFS2)
This step calculates the answer for every node by moving the root. When moving the root from parent \(u\) to child \(v\):
- Inverse Operation: Remove \(v\)'s contribution from \(u\).
- Forward Operation: Add \(u\)'s modified contribution to \(v\).
- Recursion: Process \(v\).
- Rollback: Strictly reverse the operations to restore \(u\).
⚠️ Critical Concept: Topological Update
When updating states, order matters!
* Dependency Rule: If state A depends on state B, update B first.
* Example: Distance depends on Subtree Size. Therefore, you must update sz before you update dis during the state change.
* Rollback: You must reverse operations in the exact opposite order (LIFO - Last In First Out). If you updated sz then dis, you must rollback dis then sz.
C++ Implementation: Sum of Distances (LeetCode 834)
Solves for the sum of distances from every node to all other nodes.
Click to expand solution code (LeetCode 834)
#include <vector>
#include <iostream>
using namespace std;
class Solution {
vector<int> dis, sz, ans;
vector<vector<int>> adj;
public:
vector<int> sumOfDistancesInTree(int n, vector<vector<int>>& edges) {
dis.resize(n);
sz.resize(n);
adj.resize(n);
ans.resize(n);
for(auto& v: edges) {
adj[v[0]].emplace_back(v[1]);
adj[v[1]].emplace_back(v[0]);
}
// DFS1: Calculate subtree size and initial distances for root 0
dfs();
// DFS2: Rerooting technique to calculate answer for all nodes
dfs2();
return ans;
}
void dfs(int x = 0, int p = 0) {
sz[x] = 1;
dis[x] = 0; // Initialize distance for leaf
for(int c: adj[x]) {
if(c == p) continue;
dfs(c, x);
// Aggregation: Distance to c + 1 step for every node in c's subtree
dis[x] += dis[c] + sz[c];
sz[x] += sz[c];
}
}
void dfs2(int x = 0, int p = 0) {
ans[x] = dis[x]; // Store answer for current root
for(int c: adj[x]) {
if(c == p) continue;
// --- STATE CHANGE (Reroot u -> v) ---
// TOPOLOGICAL UPDATE: Size must be updated before Distance
// 1. Remove c from x (x is no longer parent of c)
dis[x] -= dis[c] + sz[c];
sz[x] -= sz[c];
// 2. Add x to c (c becomes parent of x)
dis[c] += dis[x] + sz[x];
sz[c] += sz[x];
// Recursive call with new root
dfs2(c, x);
// --- ROLLBACK (Restore u <- v) ---
// STRICT REVERSE ORDER: If we did (A, B, C, D), we must rollback (D, C, B, A)
sz[c] -= sz[x];
dis[c] -= dis[x] + sz[x];
sz[x] += sz[c];
dis[x] += dis[c] + sz[c];
}
}
};
2. Edge Contribution Technique
This method is simpler and more efficient if you only need the Sum of All Pairs Distances (the total sum of distances between every pair of nodes in the tree, i.e., \(\sum_{i<j} \text{dist}(i,j)\)).
Core Concept
Instead of calculating paths relative to nodes, we look at the tree Edge-Centrically. We count how many times each edge contributes to the total sum.
Formula
If removing an edge \((u, v)\) splits the tree into two components of size \(S\) and \(N-S\), then this specific edge lies on the unique path between any node in component \(S\) and any node in component \(N-S\).
C++ Implementation: Contribution Technique (Recommended)
Calculates the single integer representing the total path sum in one DFS pass.
Click to expand solution code (Contribution Technique)
#include <vector>
#include <iostream>
using namespace std;
class SumAllDistancesContribution {
long long totalSum = 0;
int n;
vector<int> sz;
vector<vector<int>> adj;
public:
long long solve(int n, vector<vector<int>>& edges) {
this->n = n;
sz.resize(n);
adj.assign(n, vector<int>());
totalSum = 0;
for(auto& v: edges) {
adj[v[0]].push_back(v[1]);
adj[v[1]].push_back(v[0]);
}
dfs(0, -1);
return totalSum;
}
void dfs(int u, int p) {
sz[u] = 1;
for(int v : adj[u]) {
if(v == p) continue;
dfs(v, u);
sz[u] += sz[v];
// Contribution Logic:
// The edge (u-v) connects a subtree of size sz[v]
// to the rest of the tree (n - sz[v]).
// Number of paths crossing this edge = sz[v] * (n - sz[v])
long long contribution = (long long)sz[v] * (n - sz[v]);
totalSum += contribution;
}
}
};
3. Advanced Application: Interaction Cost in Groups
This technique shines in complex variations, such as LeetCode 3786.
Problem: You are given a tree where each node has a "group" (or color). You need to find the sum of distances between all pairs of nodes \((u, v)\) such that \(u\) and \(v\) belong to the same group.
- Constraints: \(N \le 10^5\), Number of Groups \(\le 20\) (or larger in generalized versions).
Adapted Edge Contribution Formula
Instead of a simple size calculation, we calculate the contribution of an edge for each group independently in a single pass. For a specific group \(g\): 1. Let \(C_g\) be the count of nodes of group \(g\) in the current subtree (below the edge). 2. Let \(T_g\) be the total count of nodes of group \(g\) in the entire tree. 3. The edge splits the group's nodes into \(C_g\) (below) and \(T_g - C_g\) (above). 4. Contribution for group \(g\): \(C_g \times (T_g - C_g)\).
We sum this contribution for all groups at every edge.
C++ Implementation (Single DFS)
This solution assumes the number of groups is small (e.g., \(\le 20\)). If groups are large (\(~N\)), use a map or DSU on Trees.
Click to expand solution code (LeetCode 3786 Variation)
#include <vector>
#include <numeric>
using namespace std;
class Solution {
long long totalInteractionCost = 0;
// Stores total count of each group in the entire tree
// Assumes group IDs are small (e.g., 1 to 20)
vector<int> global_group_counts;
vector<vector<int>> adj;
vector<int> node_groups;
int MAX_GROUP = 0;
public:
long long findSum(int n, vector<vector<int>>& edges, vector<int>& group) {
node_groups = group;
totalInteractionCost = 0;
adj.assign(n, vector<int>());
// 1. Determine max group ID and Precompute global counts
MAX_GROUP = 0;
for(int g : group) MAX_GROUP = max(MAX_GROUP, g);
global_group_counts.assign(MAX_GROUP + 1, 0);
for(int g : group) global_group_counts[g]++;
// Build Adjacency List
for(auto& e : edges) {
adj[e[0]].push_back(e[1]);
adj[e[1]].push_back(e[0]);
}
// 2. Run Single DFS
dfs(0, -1);
return totalInteractionCost;
}
// Returns a vector where ret[g] is the count of group 'g' in the current subtree
vector<int> dfs(int u, int p) {
vector<int> subtree_counts(MAX_GROUP + 1, 0);
subtree_counts[node_groups[u]] = 1; // Count the node itself
for(int v : adj[u]) {
if(v == p) continue;
// Get counts from child
vector<int> child_counts = dfs(v, u);
// Process Edge (u, v) Contribution for ALL groups
for(int g = 1; g <= MAX_GROUP; g++) {
if(child_counts[g] > 0) {
// Formula: Nodes_Below * Nodes_Above
long long below = child_counts[g];
long long above = global_group_counts[g] - below;
totalInteractionCost += below * above;
// Merge child counts into current subtree counts
subtree_counts[g] += child_counts[g];
}
}
}
return subtree_counts;
}
};
4. Practice: The "Fix the Edge" Mindset
Here are 4 tasks to practice the edge contribution mindset. The core skill is identifying what flows across the edge.
I. The "Traffic Jam" Variant (Pure Volume)
Task: 2477. Minimum Fuel Cost to Report to the Capital
- The Logic: Instead of counting pairs, we count cars moving to the root.
- Fix: The edge between
childandparent. - Calculate:
Traffic = Subtree_Size. Everyone must pass here. - Math:
Cost += ceil(Subtree_Size / Seats).
II. The "Splitting" Variant (Sum Product)
Task: 1339. Maximum Product of Splitted Binary Tree
- The Logic: Cutting an edge creates two separate trees.
- Fix: The edge being cut.
- Calculate: Sum of values in Subtree vs. Rest of Tree.
- Math: Maximize
Subtree_Sum * (Total_Sum - Subtree_Sum).
III. The "Flow" Variant (Net Balance)
Task: 979. Distribute Coins in Binary Tree
- The Logic: Coins must flow from rich nodes to poor nodes.
- Fix: The edge connecting a subtree to the rest of the world.
- Calculate: The imbalance between coins and nodes.
- Math:
Moves += abs(Subtree_Coins - Subtree_Size).
IV. The "Exact Match" Variant (Precursor to Groups)
Task: 1519. Number of Nodes in the Sub-Tree With the Same Label
- The Logic: Implement the frequency-map DFS (as seen in Section 3).
- Fix: The Subtree rooted at
u. - Calculate: How many times
Label[u]appears in the returned map from children. - Why it helps: This builds the machinery for the "Group Interaction" boss level.
5. Summary
The Contribution Technique is the master key for "Sum of X over all paths" problems.
- Identify the Element: Are you summing edges? Nodes?
- Isolate the Element: Pick one single edge. Pretend the rest of the tree doesn't exist.
- Count the Traffic: Calculate how many valid paths must touch this element using simple combinatorics (usually
Left_Size * Right_Size). - Run the Pointer: Use a DFS to compute these sizes recursively.