Custom Comparators
Beyond Simple Sorting
The previous lesson sorted by a single field: \(a_i\) ascending, or \(B_i\) ascending. Clean and simple.
Some problems don't have a single field to sort by. The comparison depends on both elements at once. You need to derive the comparator from scratch using the exchange argument.
Here's the key question you always ask: when should element A come before element B in the optimal ordering?
The Problem That Requires Derivation
CF632C — The Smallest String Concatenation: Given n strings, concatenate them in some order to get the lexicographically smallest possible result.
The Greedy Choice: Put string
abeforebifa + b < b + alexicographically. The comparator falls directly from the exchange argument.
There's no obvious "sort by length" or "sort by first character" that works in general. You need to derive the sort order from the exchange argument.
Setup: Consider two strings a and b. In what order should they appear?
- Order 1 (a before b): result starts with
a + b - Order 2 (b before a): result starts with
b + a
We want the lexicographically smallest result. So put a before b if a + b < b + a (lexicographic comparison of the concatenated strings).
This is the comparator. It fell directly from the exchange argument.
Watching the Comparator Work

Let's trace on ["ba", "a", "bc"].
Compare all pairs using a + b < b + a:
| Pair | a+b | b+a | Who's first? |
|---|---|---|---|
| "ba" vs "a" | "baa" | "aba" | "aba" < "baa" → a first |
| "ba" vs "bc" | "babc" | "bcba" | "babc" < "bcba" → ba first |
| "a" vs "bc" | "abc" | "bca" | "abc" < "bca" → a first |
Sorted order: "a" < "ba" < "bc" → concatenation: "ababc".
Let's verify manually: - "a" + "ba" + "bc" = "ababc" - "a" + "bc" + "ba" = "abcba" — larger - "ba" + "a" + "bc" = "baabc" — larger - All other orders are also larger
The greedy sort gives the lexicographically smallest result.
The Transitivity Requirement
Here's something easy to miss. For a comparator to work with std::sort, it must be a strict weak order — in particular, it must be transitive: if A < B and B < C, then A < C.
Is our comparator transitive? I.e., if a+b < b+a and b+c < c+b, does a+c < c+a?
Proof: Treat each string as a number in base \(10^{|s|}\) (a "superdigit"). Then the concatenation a+b represents the number \(a \times 10^{|b|} + b\), and the condition a+b < b+a means:
Rearranging:
Which gives us a ratio comparison:
Ratios compose transitively. If \(\frac{a}{\ldots} < \frac{b}{\ldots}\) and \(\frac{b}{\ldots} < \frac{c}{\ldots}\), then \(\frac{a}{\ldots} < \frac{c}{\ldots}\). Transitivity holds.
What breaks if the comparator isn't transitive? std::sort enters undefined behavior — it can produce any ordering, crash, or infinite loop. Always verify transitivity before using a custom comparator.
The Cross-Product Pattern
CF922D — Robot Vacuum Cleaner: Each "robot" is a string of 's' and 'h' characters. Concatenate all robots in some order. Maximize the total number of "sh" subsequences in the result.
An "sh" subsequence is any pair of positions where s appears before h (not necessarily adjacent).
Setup the exchange argument: For two robots A and B, compute the contribution from the cross-term (pairs where one character comes from A and one from B).
- A before B: cross contribution = \(A_s \times B_h\) (every
sin A pairs with everyhin B) - B before A: cross contribution = \(B_s \times A_h\)
Put A before B if \(A_s \times B_h > B_s \times A_h\).
This is a cross-product comparator — it compares the ratio \(s/h\) for each string, without actually dividing (to avoid floating point issues). Equivalently, A goes first if \(A_s / A_h > B_s / B_h\) (higher s-to-h ratio goes first).
Why? If A has many ss and few hs, and B has many hs, putting A first means all of A's ss can pair with all of B's hs — maximum cross-pairs. If you put B first instead, B's hs come before A's ss and those pairs are lost.
Let's trace on strings ["sh", "hs", "s", "h"]:
Compute \((s\_count, h\_count)\) for each: - "sh": s=1, h=1 - "hs": s=1, h=1 - "s": s=1, h=0 - "h": s=0, h=1
Cross-product comparison: - "s" (1,0) vs anything with h: \(1 \times h > 0 \times 1 = 0\) — "s" always goes first (h=0 means ratio is \(\infty\)) - "h" (0,1) vs anything with s: \(0 \times s < s_{other} \times 1\) — "h" always goes last - "sh" (1,1) vs "hs" (1,1): \(1 \times 1 = 1 \times 1\) — tie, either order
Optimal order: "s" first, then "sh" and "hs" in any order, then "h" last.
Counting "sh" subsequences in "s" + "sh" + "h" = "sshh": - (0,2): s at 0, h at 2 ✓ - (0,3): s at 0, h at 3 ✓ - (1,2): s at 1, h at 2 ✓ - (1,3): s at 1, h at 3 ✓ - Total: 4
Counting "sh" subsequences in the worst order "h" + "sh" + "s" = "hshs": - Characters: h(0), s(1), h(2), s(3) - (1,2): s at 1, h at 2 ✓ - Total: 1 — much worse.
Deriving Comparators: The General Template
- Consider two adjacent elements A and B in your sequence.
- Define \(\text{cost}(A, B)\) = some function of putting A before B.
- Define \(\text{cost}(B, A)\) = same function with B before A.
- Put A before B if \(\text{cost}(A, B)\) is better (smaller for minimization, larger for maximization).
- Simplify the comparison to avoid floating point.
- Verify transitivity.
| Problem | cost(A before B) | Comparator |
|---|---|---|
| String concat | a + b (lex) |
a + b < b + a |
| Max "sh" subseq | \(A_s \times B_h\) (cross-pairs gained) | \(A_s \cdot B_h > B_s \cdot A_h\) |
| Tasks & Deadlines | \(2t + 2a_i + a_j\) (sum finish times) | \(a_i < a_j\) |
The Bug Factory: Strict vs Non-Strict
A common mistake: writing a + b <= b + a instead of a + b < b + a. This creates a comparator that returns true for equal elements (a < a = true), violating the strict weak order requirement. std::sort will produce garbage.
Always use strict <, not <=.
If two elements are equal under your comparator (both a+b == b+a and b+a == a+b), the comparator returns false for both, which means they're "equivalent" — and std::sort handles ties arbitrarily but correctly.
The Full Code
#include <bits/stdc++.h>
using namespace std;
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int numStrings;
cin >> numStrings;
vector<string> stringElements(numStrings);
for (int i = 0; i < numStrings; i++) {
cin >> stringElements[i];
}
auto concatCompare = [](const string& a, const string& b) {
return a + b < b + a;
};
sort(stringElements.begin(), stringElements.end(), concatCompare);
for (const auto& str : stringElements) {
cout << str;
}
cout << "\n";
return 0;
}
Performance note: a + b creates a new string for every comparison. With n=50,000 strings of total length 50,000, this is O(n log n) comparisons × O(avg_length) per comparison = fine in practice, but be aware.
Try It
Fill in the sorting lambda.
Trace: compare "c" vs "ba": "cba" vs "bac" → "bac" < "cba" → ba first. Compare "ba" vs "b": "bab" vs "bba" → "bab" < "bba" → b first. So order: b, ba, c → "bbac". ✓
These are numbers written as strings. Compare all pairs using a+b < b+a:
- "3" vs "30": "330" vs "303" → "303" < "330" → 30 before 3
- "30" vs "34": "3034" vs "3430" → "3034" < "3430" → 30 before 34
- "34" vs "5": "345" vs "534" → "345" < "534" → 34 before 5
- "3" vs "34": "334" vs "343" → "334" < "343" → 3 before 34
- "3" vs "5": "35" vs "53" → "35" < "53" → 3 before 5
Sorted order: 30, 3, 34, 5 → "303345". ✓
Challenge: Write the cross-product sort for the Robot Vacuum Cleaner and verify on a small example by hand.
Problems
| Problem | Link | Difficulty |
|---|---|---|
| CF632C — Smallest String Concatenation | codeforces.com/problemset/problem/632/C | 1700 |
| CF922D — Robot Vacuum Cleaner | codeforces.com/problemset/problem/922/D | 1800 |