Skip to content

Auxiliary Virtual Nodes

Virtual node reduces O(k^2) edges to O(k) by connecting group through one node

The problem says "connect every node in range [L, R] to every other node in range [L, R]." That's \(O(N^2)\) edges. You can't afford it. Virtual nodes compress those edges to \(O(N \log N)\).

This is the heavyweight technique in implicit graph construction.


The Problem with Dense Edges

Consider a grid problem. From cell \((r, c)\), you can jump straight up to any cell \((r-i, c)\) for \(i = 1, 2, \ldots, r\) at cost \(1 + i\). Each cell has up to \(R\) outgoing edges. With an \(R \times C\) grid, that's \(O(R^2 \times C)\) total edges. For \(R = 10^5\), this blows up.

Or consider a shortest path problem: \(M\) operations, each adding cost-\(C\) edges between ALL pairs in some range \([L, R]\). Each operation could add \(O(N^2)\) edges. Naive construction is doomed.

Virtual nodes solve this by introducing intermediate nodes that act as "hubs."


Technique 1: Vertical Chain Compression

Problem setup: Grid with \(R\) rows and \(C\) columns. From cell \((r, c)\), you can jump up to cell \((r-i, c)\) at cost \(i\). Normal movement (left/right/down) costs 1 per step.

Naive edges: Each cell has \(O(R)\) upward edges. Total: \(O(R^2 \times C)\).

Virtual chain: For each column \(c\), add a chain of virtual nodes \(v_{1,c}, v_{2,c}, \ldots, v_{R,c}\). Connect them vertically:

  • \(v_{r,c} \to v_{r-1,c}\), cost 1 (climbing one step up the chain).
  • Cell \((r,c) \to v_{r,c}\), cost 1 (entering the chain).
  • \(v_{r,c} \to\) cell \((r,c)\), cost 0 (exiting the chain).

Jumping from row \(r\) up to row \(r-i\): enter chain at \(v_{r,c}\) (cost 1), climb \(i\) steps (cost \(i\)), exit at cell \((r-i, c)\) (cost 0). Total: \(1 + i\). Matches the original edge weights.

Edges after compression: \(O(R \times C)\) — linear in the grid size.

// Virtual node IDs: cell (r,c) = r*C + c
// Chain node for (r,c) = R*C + r*C + c
int cell(int r, int c) { return r * C + c; }
int chain(int r, int c) { return R * C + r * C + c; }

// Build chain edges for each column
for (int c = 0; c < C; c++) {
    for (int r = 0; r < R; r++) {
        // Enter chain from cell
        addEdge(cell(r, c), chain(r, c), 1);
        // Exit chain to cell
        addEdge(chain(r, c), cell(r, c), 0);
        // Climb up in chain
        if (r > 0)
            addEdge(chain(r, c), chain(r-1, c), 1);
    }
}

💡 Key insight. The virtual chain replaces \(O(R)\) edges per cell with 3 edges per cell. The chain acts as a highway — you enter it, travel along it, and exit. The cost decomposes into entry + travel + exit.


Technique 2: Segment Tree Graph

Problem: NIKKEI2019-2-Qual-D Shortest Path on a Line.

\(N\) nodes on a line. \(M\) operations. Each operation says: "add a directed edge of cost \(C\) from every node in \([L_1, R_1]\) to every node in \([L_2, R_2]\)." Find the shortest path from node 1 to node \(N\).

Naive: each operation adds \(O(N^2)\) edges. With \(M\) operations, total edges could be \(O(M \times N^2)\).

Segment tree construction: Build two segment trees over the \(N\) real nodes:

  • Up-tree (source tree): Real nodes are leaves. Each internal node connects to its children (downward edges, cost 0). A real node can "reach up" to any ancestor.
  • Down-tree (sink tree): Real nodes are leaves. Each internal node connects from its children (upward edges, cost 0). An ancestor can "push down" to any descendant.

For an operation "edge from range \([L_1, R_1]\) to range \([L_2, R_2]\) at cost \(C\)":

  1. Create one virtual hub node \(h\).
  2. Connect the \(O(\log N)\) up-tree nodes covering \([L_1, R_1]\) to \(h\), cost 0.
  3. Connect \(h\) to the \(O(\log N)\) down-tree nodes covering \([L_2, R_2]\), cost \(C\).

Any real node in \([L_1, R_1]\) can reach \(h\) via the up-tree (cost 0), then \(h\) pushes to any real node in \([L_2, R_2]\) via the down-tree (cost \(C\)). Total cost: \(C\). Correct.

Edges per operation: \(O(\log N)\) instead of \(O(N^2)\).

Total graph size: \(O(N \log N)\) nodes, \(O((N + M) \log N)\) edges.

#include <bits/stdc++.h>
using namespace std;
using ll = long long;
const ll INF = 1e18;

struct SegTreeGraph {
    int n, sz;
    // up_base: start of up-tree nodes, down_base: start of down-tree
    // real nodes: 0..n-1
    // up-tree: up_base..up_base+2*sz-1
    // down-tree: down_base..down_base+2*sz-1
    int up_base, down_base, total;
    vector<vector<pair<int, ll>>> adj;

    void build(int _n) {
        n = _n;
        sz = 1;
        while (sz < n) sz *= 2;
        up_base = n;
        down_base = n + 2 * sz;
        total = n + 4 * sz;
        adj.resize(total);

        // Up-tree: parent -> children (so node can reach ancestors)
        for (int i = 1; i < sz; i++) {
            adj[up_base + 2*i].push_back({up_base + i, 0});
            adj[up_base + 2*i+1].push_back({up_base + i, 0});
        }
        // Down-tree: parent -> children (so ancestor reaches nodes)
        for (int i = 1; i < sz; i++) {
            adj[down_base + i].push_back({down_base + 2*i, 0});
            adj[down_base + i].push_back({down_base + 2*i+1, 0});
        }
        // Connect real nodes to leaves
        for (int i = 0; i < n; i++) {
            adj[i].push_back({up_base + sz + i, 0});   // real -> up leaf
            adj[down_base + sz + i].push_back({i, 0});  // down leaf -> real
        }
    }

    void addRangeEdge(int l1, int r1, int l2, int r2, ll cost) {
        int hub = adj.size();
        adj.push_back({});
        // Connect up-tree [l1,r1] -> hub
        collectUp(l1 + sz, r1 + sz + 1, hub);
        // Connect hub -> down-tree [l2,r2]
        collectDown(l2 + sz, r2 + sz + 1, hub, cost);
    }

    void collectUp(int lo, int hi, int hub) {
        for (lo += 0, hi += 0; lo < hi; lo >>= 1, hi >>= 1) {
            if (lo & 1) { adj[up_base + lo].push_back({hub, 0}); lo++; }
            if (hi & 1) { hi--; adj[up_base + hi].push_back({hub, 0}); }
        }
    }

    void collectDown(int lo, int hi, int hub, ll cost) {
        for (lo += 0, hi += 0; lo < hi; lo >>= 1, hi >>= 1) {
            if (lo & 1) { adj[hub].push_back({down_base + lo, cost}); lo++; }
            if (hi & 1) { hi--; adj[hub].push_back({down_base + hi, cost}); }
        }
    }
};

⚠ Gotcha. The up-tree and down-tree have opposite edge directions. Up-tree edges point from children to parents (so sources can "climb up"). Down-tree edges point from parents to children (so the hub can "push down" to destinations). Reversing either one breaks everything.


Trace: Virtual Node Compression

5 real nodes. Operation: add cost-3 edges from range [1,3] to range [4,5].

Without virtual nodes: 6 edges (1->4, 1->5, 2->4, 2->5, 3->4, 3->5), each cost 3.

With segment tree: One hub node \(h\).

Edge Cost Purpose
up-node covering {1} -> h 0 source range
up-node covering {2,3} -> h 0 source range
h -> down-node covering 3 sink range
real node 1 -> up-leaf 1 0 enter up-tree
real node 2 -> up-leaf 2 0 enter up-tree
real node 3 -> up-leaf 3 0 enter up-tree
down-leaf 4 -> real node 4 0 exit down-tree
down-leaf 5 -> real node 5 0 exit down-tree

Path from real node 2 to real node 5: \(2 \to \text{up-leaf}_2 \to \text{up-parent}_{2,3} \to h \to \text{down-parent}_{4,5} \to \text{down-leaf}_5 \to 5\). Total cost: \(0+0+0+3+0+0 = 3\). Correct.

6 direct edges compressed to \(O(\log N)\) edges through the hub. For ranges of size \(N\), this saves \(O(N^2)\) edges per operation.


When to Use Virtual Nodes

Look for these triggers:

Trigger Technique
"Jump to any of K targets" Chain or star virtual node
"Connect all pairs in a range" Segment tree graph
"Connect all pairs sharing a property" Bipartite virtual nodes (Lesson 3)
"Jump up/down by variable amount" Vertical chain

The common thread: the problem describes \(O(N^2)\) or \(O(N \times K)\) edges implicitly. Virtual nodes compress them to \(O(N \log N)\) or \(O(N + K)\).


Complexity Analysis

Approach Nodes Edges Dijkstra
Naive (all pairs in range) \(N\) \(O(M \times N^2)\) too slow
Segment tree graph \(O(N \log N)\) \(O((N+M) \log N)\) \(O((N+M) \log^2 N)\)
Vertical chain \(O(R \times C)\) \(O(R \times C)\) \(O(RC \log(RC))\)

The segment tree approach trades node count for edge count. More nodes, far fewer edges. Dijkstra runs on the expanded graph as if it were any other graph.


Common Mistakes

Confusing up-tree and down-tree directions. The up-tree lets sources reach the hub. The down-tree lets the hub reach destinations. If both trees point the same direction, half the paths break.

Forgetting real-to-leaf connections. Real nodes must connect to their corresponding leaves in both trees. Without these bridge edges, real nodes are isolated from the segment tree structure.

Off-by-one in segment tree indexing. Whether your segment tree is 0-indexed or 1-indexed, and whether ranges are inclusive or exclusive, must be consistent. Test on small cases.


Problems

Problem Link Difficulty
ZONE2021-E Sneaking atcoder.jp/contests/zone2021/tasks/zone2021_e Hard
NIKKEI2019-2-Qual-D Shortest Path on a Line atcoder.jp/contests/nikkei2019-2-qual/tasks/nikkei2019_2_qual_d Hard
KEYENCE2019-E Connecting Cities atcoder.jp/contests/keyence2019/tasks/keyence2019_e Hard