Skip to content

Serialize and Deserialize

What's the Minimum Info to Uniquely Identify a Tree?

In the previous lesson, you needed two traversals — preorder and inorder — to reconstruct a tree. That's because a single traversal without any extra information is ambiguous.

But what if you could add extra information? Specifically: what if you marked where the nullptr children are?

One traversal with null markers is enough to uniquely identify any binary tree. The null markers encode the tree's shape. No second traversal needed.

This is serialization: converting a tree into a flat string that can be stored, transmitted, and later converted back into the exact same tree.


Why Inorder Alone Is Ambiguous

Before we serialize, understand why some approaches fail. Consider inorder: [2, 1, 3].

This could be:

    1           1           2
   / \           \         / \
  2   3           2       ?   ?
                   \
                    3

Wait — actually inorder [2, 1, 3] constrains things somewhat. But consider inorder [1, 2, 3]. This could be a right-skewed chain, a left-skewed chain (no — that would give [3, 2, 1]), or a balanced tree with root 2. The point is: inorder tells you the relative left-right ordering of nodes but not the parent-child structure.

Preorder alone is equally ambiguous. Preorder [1, 2, 3] could be:

    1            1
   / \            \
  2   3            2
                    \
                     3

Both produce preorder [1, 2, 3]. Without knowing where the nullptr children are, you can't distinguish these two trees.


Method 1: Preorder with Null Markers

The fix: when you encounter a nullptr child, write a special marker (we'll use #) instead of silently skipping it.

      1
     / \
    2   3
       / \
      4   5

Preorder (standard): [1, 2, 3, 4, 5] Preorder (with null markers): "1,2,#,#,3,4,#,#,5,#,#"

Every node has exactly two entries after it (left child, right child), even if those children are null. This makes the structure unambiguous.

Preorder serialization with null markers encodes tree structure unambiguously

Serialization Trace

Visit Node Output So Far
1 1 "1"
2 2 "1,2"
3 nullptr (left of 2) "1,2,#"
4 nullptr (right of 2) "1,2,#,#"
5 3 "1,2,#,#,3"
6 4 "1,2,#,#,3,4"
7 nullptr (left of 4) "1,2,#,#,3,4,#"
8 nullptr (right of 4) "1,2,#,#,3,4,#,#"
9 5 "1,2,#,#,3,4,#,#,5"
10 nullptr (left of 5) "1,2,#,#,3,4,#,#,5,#"
11 nullptr (right of 5) "1,2,#,#,3,4,#,#,5,#,#"

Final string: "1,2,#,#,3,4,#,#,5,#,#"

Deserialization Trace

Read the tokens left to right. Each token is either a value (create a node) or # (return nullptr).

Token Action Stack State (conceptual)
"1" Create node 1, recurse left Building left subtree of 1
"2" Create node 2, recurse left Building left subtree of 2
"#" Return nullptr (left of 2 = null) Back to node 2, recurse right
"#" Return nullptr (right of 2 = null) Node 2 complete, back to node 1
"3" Create node 3, recurse left Building left subtree of 3
"4" Create node 4, recurse left Building left subtree of 4
"#" Return nullptr (left of 4 = null) Back to node 4, recurse right
"#" Return nullptr (right of 4 = null) Node 4 complete, back to node 3
"5" Create node 5, recurse left Building left subtree of 5
"#" Return nullptr (left of 5 = null) Back to node 5, recurse right
"#" Return nullptr (right of 5 = null) Node 5 complete, tree done

Result: the original tree, perfectly reconstructed.

The Code

string serialize(TreeNode* root) {
    if (!root) return "#";
    return to_string(root->val) + "," +
           serialize(root->left) + "," +
           serialize(root->right);
}

TreeNode* deserializeHelper(queue<string>& tokens) {
    string token = tokens.front();
    tokens.pop();
    if (token == "#") return nullptr;
    TreeNode* node = new TreeNode(stoi(token));
    node->left = deserializeHelper(tokens);
    node->right = deserializeHelper(tokens);
    return node;
}

TreeNode* deserialize(string data) {
    queue<string> tokens;
    stringstream stream(data);
    string token;
    while (getline(stream, token, ',')) {
        tokens.push(token);
    }
    return deserializeHelper(tokens);
}

Complexity: \(O(n)\) time and space for both serialize and deserialize.

The key to deserialization: the token queue acts as a stream. Each recursive call consumes exactly the tokens it needs (one for the node, then all tokens for left subtree, then all tokens for right subtree). The preorder structure guarantees the tokens are in the right order.


Method 2: Level-Order with Null Markers

Instead of preorder DFS, use BFS. Write nodes level by level, with # for null children.

      1
     / \
    2   3
       / \
      4   5

Level-order with null markers: "1,2,3,#,#,4,5,#,#,#,#"

This is the format LeetCode uses to represent trees in test cases. Deserialization uses a queue: create the root, then for each node in the queue, read two tokens (left child, right child) and enqueue non-null children.

The Code

string serializeLevelOrder(TreeNode* root) {
    if (!root) return "#";
    string result;
    queue<TreeNode*> bfsQueue;
    bfsQueue.push(root);
    while (!bfsQueue.empty()) {
        TreeNode* node = bfsQueue.front();
        bfsQueue.pop();
        if (!node) {
            result += "#,";
            continue;
        }
        result += to_string(node->val) + ",";
        bfsQueue.push(node->left);
        bfsQueue.push(node->right);
    }
    result.pop_back();
    return result;
}

TreeNode* deserializeLevelOrder(string data) {
    if (data == "#") return nullptr;
    queue<string> tokens;
    stringstream stream(data);
    string token;
    while (getline(stream, token, ',')) {
        tokens.push(token);
    }
    TreeNode* root = new TreeNode(stoi(tokens.front()));
    tokens.pop();
    queue<TreeNode*> bfsQueue;
    bfsQueue.push(root);
    while (!bfsQueue.empty() && !tokens.empty()) {
        TreeNode* parent = bfsQueue.front();
        bfsQueue.pop();
        string leftToken = tokens.front();
        tokens.pop();
        if (leftToken != "#") {
            parent->left = new TreeNode(stoi(leftToken));
            bfsQueue.push(parent->left);
        }
        if (!tokens.empty()) {
            string rightToken = tokens.front();
            tokens.pop();
            if (rightToken != "#") {
                parent->right = new TreeNode(stoi(rightToken));
                bfsQueue.push(parent->right);
            }
        }
    }
    return root;
}

Complexity: \(O(n)\) time and space.


Level-order serialization writes nodes level by level with null markers

Preorder vs Level-Order: Trade-offs

Preorder Level-Order
Implementation Recursive, simpler Iterative, uses queue
String length Same — both encode \(n\) values + \((n+1)\) nulls Same
Streaming Can deserialize without knowing total size Need to read all tokens upfront
Matches LeetCode format No Yes

Both are valid. Interviewers typically expect the preorder approach because it's shorter to code.


Try It

The starter code has serialize and deserialize stubs. Implement them using preorder with null markers.

Input tree: [1, 2, 3, null, null, 4, 5]
Serialized: "1,2,#,#,3,4,#,#,5,#,#"
Deserialized and re-serialized: "1,2,#,#,3,4,#,#,5,#,#"

Predict before running: What does serialize(nullptr) return? Just "#". And deserialize("#") returns nullptr. The empty tree is a valid edge case.

Challenge: Negative values work fine — to_string(-5) gives "-5" and stoi("-5") gives -5. But what if a node's value is very large? stoi throws on overflow. For production code, you'd want bounds checking.

Edge Cases to Watch For

  • Very large or negative node values: to_string(-2147483648) produces "-2147483648" and stoi can throw on overflow if the value exceeds int range. Use stol or add bounds checking to avoid runtime exceptions.

Problems

Problem Link Difficulty
LC 297 — Serialize and Deserialize Binary Tree leetcode.com/problems/serialize-and-deserialize-binary-tree Hard
LC 449 — Serialize and Deserialize BST leetcode.com/problems/serialize-and-deserialize-bst Medium