Graphs vs Trees

You already know trees. A tree is a node connected to smaller trees, no loops, no surprises. Graphs break that contract. One added edge creates a cycle, and suddenly your clean DFS spirals into an infinite loop. Understanding exactly what cycles change — and the one-line fix that tames them — is the foundation of every graph algorithm in this course.

Trees Were the Special Case

Tree vs graph comparison: trees are acyclic, graphs can have cycles and multiple paths

In the tree course, you never worried about revisiting a node. Parent-to-child was the only direction. There was exactly one path between any two nodes. That wasn't a design choice — it was a mathematical guarantee.

A tree on \(N\) nodes has exactly \(N - 1\) edges. Remove one edge and the tree splits into two components. Add one edge and you create exactly one cycle. Trees sit on the knife's edge between "connected" and "has a cycle."

A graph has no such constraint. \(N\) nodes can have anywhere from \(0\) edges (isolated nodes) to \(\frac{N(N-1)}{2}\) edges (every pair connected). Multiple paths between the same two nodes are normal. Some nodes might have no path between them at all.

Key insight. A tree is a connected, acyclic graph. Every tree is a graph. Not every graph is a tree. When you move from trees to graphs, you gain cycles and multiple paths — and you lose the guarantee that a simple DFS terminates.

What Cycles Actually Break

Consider this graph:

0 --- 1
|     |
2 --- 3

Start DFS at node 0. Visit 1. From 1, visit 3. From 3, visit 2. From 2 — node 0 is a neighbor. If you follow it, you're back at 0, and you'll visit 1 again, then 3, then 2, then 0 again. Forever.

On a tree, this never happens. A child never links back to an ancestor (ignoring the parent edge, which tree DFS naturally skips). On a graph, any node can link to any other node. Cycles mean edges that point "backward" to already-visited nodes.

The fix is one data structure: a visited array.

void dfs(int u, vector<vector<int>>& adj, vector<bool>& visited) {
    visited[u] = true;
    for (int v : adj[u]) {
        if (!visited[v]) dfs(v, adj, visited);
    }
}

That if (!visited[v]) guard is the entire difference between tree DFS and graph DFS. On trees you didn't need it. On graphs you can't survive without it.

DFS Trace

Graph: 0-1, 1-3, 3-2, 2-0 (the square above).

Step	Current	Neighbors	Action
1	0	1, 2	Mark 0 visited. Recurse into 1.
2	1	0, 3	Mark 1 visited. Skip 0 (visited). Recurse into 3.
3	3	1, 2	Mark 3 visited. Skip 1 (visited). Recurse into 2.
4	2	0, 3	Mark 2 visited. Skip 0 (visited). Skip 3 (visited). Backtrack.

Every node visited exactly once. Total work: \(O(V + E)\).

Gotcha. On trees, you sometimes skip the parent by passing it as a parameter: dfs(child, node). On graphs, that's not enough — a node can be reached from many paths, not just its parent. You need a full visited array.

One Path vs Many Paths

In a tree, the path from \(A\) to \(B\) is unique. You can find it by walking up to LCA and down. No choices, no alternatives.

In a graph, there might be zero paths (disconnected), one path, or many paths between two nodes. This multiplicity is what makes graph problems harder — and more interesting.

Shortest path — which of the many paths has minimum weight?
All paths — enumerate every route (exponential in the worst case).
Any path — just find one, DFS is enough.

Trees gave you the answer for free. Graphs make you work for it.

Directed vs Undirected

Undirected edges go both ways, directed edges are one-way

An undirected edge between \(u\) and \(v\) means traffic flows both ways. When building the adjacency list, you add \(v\) to \(u\)'s list AND \(u\) to \(v\)'s list.

A directed edge from \(u\) to \(v\) is one-way. You only add \(v\) to \(u\)'s list.

// Undirected
adj[u].push_back(v);
adj[v].push_back(u);

// Directed
adj[u].push_back(v);

Trees are usually rooted (directed parent-to-child) or unrooted (undirected). Graphs can be either. Directed graphs introduce new concepts — reachability is no longer symmetric. Node \(u\) can reach \(v\) without \(v\) being able to reach \(u\). This asymmetry drives topological sort, strongly connected components, and much of Chapters 3 and 14.

Weighted vs Unweighted

Weighted graph with numeric costs on each edge

An unweighted graph treats every edge as cost 1. BFS finds shortest paths.

A weighted graph assigns a cost to each edge. BFS no longer works for shortest paths — you need Dijkstra, Bellman-Ford, or Floyd-Warshall (Chapter 5).

For weighted graphs, the adjacency list stores pairs:

vector<vector<pair<int,int>>> adj(n); // adj[u] = {(v, weight), ...}
adj[u].push_back({v, w});
adj[v].push_back({u, w}); // if undirected

The Adjacency List

The adjacency list is the default representation. For each node, store a list of its neighbors.

int n, m;
cin >> n >> m;
vector<vector<int>> adj(n);
for (int i = 0; i < m; i++) {
    int u, v;
    cin >> u >> v;
    adj[u].push_back(v);
    adj[v].push_back(u); // undirected
}

Space: \(O(V + E)\). Iterating all neighbors of \(u\): \(O(\text{deg}(u))\). This is what you'll use 90% of the time.

Two other representations — adjacency matrix and edge list — are occasionally better. We'll cover when and why in the next lesson.

Connected Components

A graph might not be connected. It could have isolated clusters with no edges between them. Each cluster is a connected component.

Finding all components is the first real graph algorithm: run DFS from every unvisited node. Each DFS call explores one full component.

int countComponents(int n, vector<vector<int>>& adj) {
    vector<bool> visited(n, false);
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (!visited[i]) {
            dfs(i, adj, visited);
            count++;
        }
    }
    return count;
}

Trace

Graph with 6 nodes: edges 0-1, 0-2, 3-4. Node 5 is isolated.

Outer loop \(i\)	visited[\(i\)]?	Action	Components so far
0	No	DFS from 0. Visits 0, 1, 2.	1
1	Yes	Skip.	1
2	Yes	Skip.	1
3	No	DFS from 3. Visits 3, 4.	2
4	Yes	Skip.	2
5	No	DFS from 5. Visits 5 only.	3

Result: 3 components — {0,1,2}, {3,4}, {5}.

Key insight. A connected graph has exactly 1 component. A tree on \(N\) nodes is a connected graph with exactly \(N-1\) edges. If a connected graph on \(N\) nodes has more than \(N-1\) edges, it must contain at least one cycle.

The Key Differences at a Glance

Property	Tree	Graph
Cycles	None	Possible
Paths between two nodes	Exactly 1	0, 1, or many
Edges for \(N\) nodes	Exactly \(N-1\)	\(0\) to \(\frac{N(N-1)}{2}\)
Connected	Always (by definition)	Not necessarily
DFS needs visited array	No (parent tracking suffices)	Yes (mandatory)
Directions	Rooted or unrooted	Directed or undirected

Exercises

Draw it out. Take a tree on 5 nodes. Add one edge to create a cycle. Run DFS mentally — at what point would you revisit a node without a visited array?
Edge math. A connected graph on 7 nodes has 9 edges. How many "extra" edges does it have beyond a spanning tree? What's the minimum number of edges you'd need to remove to eliminate all cycles?
Component count. A graph has 10 nodes and 3 connected components. What's the minimum number of edges? What's the maximum? (Hint: minimum is when each component is a tree. Maximum is when you make the components as dense as possible — put 8 nodes in one component.)
Directed reachability. In a directed graph with edges 0→1, 1→2, 2→0, 3→1: Can node 3 reach node 0? Can node 0 reach node 3? What does this asymmetry mean for "connected components" in directed graphs?

Problems

Problem	Link	Difficulty
LC 200 — Number of Islands	leetcode.com/problems/number-of-islands	Medium
LC 547 — Number of Provinces	leetcode.com/problems/number-of-provinces	Medium
CSES — Building Roads	cses.fi/problemset/task/1666	Easy
LC 684 — Redundant Connection	leetcode.com/problems/redundant-connection	Medium