Concurrency Control & Isolation Levels
TL;DR
When multiple transactions run simultaneously, race conditions can corrupt data. Isolation levels define how much transactions see each other's work. MVCC lets readers and writers coexist without blocking. For interview-critical scenarios like "prevent double-booking," know SELECT FOR UPDATE and optimistic vs pessimistic locking.
The Concert Ticket Problem
Two users try to buy the last concert ticket at the exact same moment. Both check the database: "Is there a ticket available?" Both see "yes." Both proceed to buy. Now you've sold one ticket to two people.
This isn't a theoretical concern. It's the most common interview question about concurrency: "How do you prevent double-booking?" And the answer depends on understanding isolation levels and locking.
Why Concurrency Is Hard
When transactions run one at a time (serially), everything is simple. Transaction A finishes, then Transaction B starts. No conflicts possible.
But serial execution is slow. Modern databases run transactions concurrently — overlapping in time — to maximize throughput. The database's job is to make this concurrent execution appear as if transactions ran serially. How well it achieves this is defined by the isolation level.
The Four Isolation Levels
SQL defines four isolation levels, from weakest to strongest. Each prevents more types of concurrency bugs, but costs more performance.
Read Uncommitted (Weakest — Rarely Used)
A transaction can read another transaction's uncommitted writes (dirty reads). Almost never used because it's unsafe — if the other transaction rolls back, you read data that never existed.
Read Committed (The Practical Default)
A transaction only sees data that has been committed. No dirty reads.
Most databases default to this level (PostgreSQL, Oracle, SQL Server). It prevents dirty reads and dirty writes, but not all race conditions.
What it doesn't prevent:
Time Transaction A Transaction B
─────────────────────────────────────────────────
T1 SELECT balance → $1000
T2 SELECT balance → $1000
T3 UPDATE balance = $900
T4 COMMIT
T5 UPDATE balance = $900 ← lost A's update!
T6 COMMIT
Both transactions read $1000, both subtract $100, both write $900. The result should be $800 but it's $900. Transaction A's update was lost. This is the lost update problem.
Repeatable Read / Snapshot Isolation
A transaction sees a consistent snapshot of the database from the moment it started. Even if other transactions commit changes, this transaction keeps seeing the old data.
PostgreSQL's "Repeatable Read" is actually snapshot isolation (implemented with MVCC — see below).
This prevents lost updates in most cases, and prevents read skew (seeing inconsistent data across multiple reads within one transaction).
Serializable (Strongest)
Transactions behave as if they executed one at a time. No concurrency anomalies possible. The database achieves this through actual serial execution, two-phase locking, or serializable snapshot isolation (SSI).
The cost: Significantly slower. Transactions may be aborted and retried if the database detects conflicts.
Comparison
| Level | Dirty Reads | Lost Updates | Read Skew | Write Skew | Performance |
|---|---|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible | Possible | Fastest |
| Read Committed | Prevented | Possible | Possible | Possible | Fast |
| Repeatable Read | Prevented | Mostly prevented | Prevented | Possible | Medium |
| Serializable | Prevented | Prevented | Prevented | Prevented | Slowest |
MVCC — Multi-Version Concurrency Control
MVCC is the mechanism that makes snapshot isolation possible. Instead of locking rows when reading, the database keeps multiple versions of each row.
Row versions for user_id = 42:
┌──────────┬──────────┬────────────┬────────────┐
│ version │ balance │ created_by │ visible_to │
├──────────┼──────────┼────────────┼────────────┤
│ 1 │ $1000 │ txn_100 │ txn < 200 │
│ 2 │ $900 │ txn_150 │ txn < 300 │
│ 3 │ $800 │ txn_250 │ txn ≥ 250 │ ← current
└──────────┴──────────┴────────────┴────────────┘
When Transaction 200 reads this row, it sees version 2 ($900) — the latest version visible to it. Transaction 300 sees version 3 ($800). Neither blocks the other. Readers never block writers, writers never block readers.
This is why PostgreSQL can handle thousands of concurrent reads without locking. Old versions are cleaned up by the VACUUM process after all transactions that might need them have finished.
Key insight from DDIA: "Snapshot isolation is a boon for long-running, read-only queries such as backups and analytics. It is very hard to reason about data if it keeps changing while a query is running."
Solving the Concert Ticket Problem
Back to our double-booking scenario. There are two approaches:
Pessimistic vs Optimistic Locking
Lock the row before reading it. No other transaction can read or modify it until you're done.
BEGIN;
-- Lock the ticket row — other transactions wait here
SELECT * FROM tickets
WHERE event_id = 42 AND status = 'available'
LIMIT 1
FOR UPDATE;
-- If we got a row, book it
UPDATE tickets SET status = 'sold', buyer_id = 123
WHERE id = <selected_ticket_id>;
COMMIT;
FOR UPDATE acquires an exclusive lock on the selected rows. If Transaction B tries to SELECT FOR UPDATE the same row, it blocks until Transaction A commits or rolls back.
Pros: Simple, correct, impossible to double-book. Cons: Transactions queue up behind each other (reduced throughput). Risk of deadlocks if multiple rows are locked in different orders.
Don't lock anything. Let both transactions proceed. At commit time, check if the data changed since you read it.
-- Transaction A reads
SELECT * FROM tickets WHERE id = 1; -- status = 'available', version = 5
-- Transaction A updates (only if version hasn't changed)
UPDATE tickets
SET status = 'sold', buyer_id = 123, version = 6
WHERE id = 1 AND version = 5;
-- Returns "1 row updated" → success
-- Transaction B tries the same
UPDATE tickets
SET status = 'sold', buyer_id = 456, version = 6
WHERE id = 1 AND version = 5;
-- Returns "0 rows updated" → version changed, retry!
The WHERE version = 5 is the optimistic lock. If someone else changed the row (incrementing the version), your update matches zero rows — you know to retry.
Pros: No blocking. Higher throughput under low contention. Cons: Under high contention, many transactions fail and retry (wasted work). Application must handle retries.
When to Use Which
| Scenario | Approach | Why |
|---|---|---|
| Last ticket, high contention | Pessimistic (SELECT FOR UPDATE) | Can't afford retries — every failure is a bad user experience |
| Editing a document, rare conflicts | Optimistic (version check) | Conflicts are rare, so retries are cheap |
| Financial transfer | Pessimistic or Serializable isolation | Correctness is non-negotiable |
| Shopping cart updates | Optimistic | Low contention, easy to retry |
Write Skew — The Subtle Race Condition
Write skew occurs when two transactions read the same data, make decisions based on it, and write to different rows — but the combined result violates a constraint.
Example: A hospital requires at least one doctor on call. Two doctors are on call. Both check "is there more than one doctor on call?" — both see "yes" — both remove themselves. Now zero doctors are on call.
Time Doctor A Doctor B
────────────────────────────────────────────────────
T1 SELECT COUNT(*) → 2
T2 SELECT COUNT(*) → 2
T3 UPDATE SET on_call=false
T4 UPDATE SET on_call=false
Result: 0 doctors on call!
Neither transaction wrote to the same row, so row-level locks don't help. The constraint spans multiple rows.
Solutions:
- Serializable isolation: The database detects the conflict and aborts one transaction
- Materialized conflict: Add a lock row that both transactions must acquire (SELECT ... FOR UPDATE on a shared "on_call_lock" row)
- Application-level check: Verify the constraint after the write and roll back if violated
Interview Tip
The concert ticket problem is the most common concurrency question in system design interviews. Explain both approaches: "For high-contention scenarios like last-ticket booking, I'd use pessimistic locking with SELECT FOR UPDATE. For low-contention scenarios like profile updates, I'd use optimistic locking with a version column to avoid blocking." Then mention that distributed systems need even stronger mechanisms like distributed locks or Saga patterns.
Quick Recap
| Concept | What It Does | When to Use |
|---|---|---|
| Read Committed | No dirty reads/writes | Default isolation for most apps |
| Snapshot Isolation (MVCC) | Consistent point-in-time reads | Long-running queries, analytics, backups |
| Serializable | No concurrency anomalies at all | Financial transactions, critical constraints |
| SELECT FOR UPDATE | Lock rows before reading | Prevent double-booking, high-contention writes |
| Optimistic locking | Version check at write time | Low-contention updates, document editing |
| Write skew | Two transactions violate a cross-row constraint | Requires serializable isolation or materialized conflicts |