Remove Edges from Tree
Level: L3-L5 Topics: Trees, Subtree Sizes, Connected Components, DFS
Problem Statement
You are given a tree with N vertices (numbered 1 to N) and M queries. Each query provides a pair (E_i, K_i), where E_i is an edge in the tree and K_i is a threshold value. For each query, determine the number of vertices in each connected component that would result from removing edge E_i, and report how many components have size greater than or equal to K_i.
More specifically: removing a single edge from a tree splits it into exactly two connected components. You need to compute their sizes efficiently across many queries.
Background & Constraints
- The tree has N vertices and N-1 edges.
- There are M independent queries — each query removes a single edge (the tree is restored between queries).
- 1 <= N <= 10^5
- 1 <= M <= 10^5
- An O(N * M) brute-force approach (re-running BFS/DFS per query) is suboptimal. The interviewer expects a better solution.
Examples
Example 1:
Tree (5 vertices):
1
/ \
2 3
/ \
4 5
Edges: (1,2), (1,3), (2,4), (2,5)
Query: Remove edge (1,2), K=2
Result: Two components — {1,3} with size 2 and {2,4,5} with size 3.
Both components have size >= 2, so the answer is 2.
Query: Remove edge (2,4), K=3
Result: Two components — {4} with size 1 and {1,2,3,5} with size 4.
Only one component has size >= 3, so the answer is 1.
Hints & Common Pitfalls
- Key insight: Root the tree at any vertex and pre-calculate the subtree size for every node using a single DFS. When you remove the edge between a node and its parent, the two resulting components have sizes
subtree_size[child]andN - subtree_size[child]. - Pre-computation makes queries O(1): After one O(N) DFS to compute all subtree sizes, each query can be answered in constant time by looking up which node is the child endpoint of the removed edge.
- Common mistake: Re-running a full traversal for each query, leading to O(N * M) time.
- Edge representation: You need to map each edge to the child node in the rooted tree. If the edge is given as (u, v), determine which of u or v is the child (the one farther from the root).
- Be careful with 1-indexed vs 0-indexed vertices.
Follow-Up Questions
- Can you answer each query in O(1) time after O(N) preprocessing?
- What if instead of removing one edge per query, you remove multiple edges simultaneously? How does the problem change?
- How would you handle the case where the graph is not a tree (i.e., it has cycles)?
- What if queries are cumulative — edges stay removed across queries? How would you track component sizes efficiently?
- Can you extend this to weighted trees where the "size" of a component is the sum of vertex weights?