Concurrency Control & Isolation Levels

TL;DR

When multiple transactions run simultaneously, race conditions can corrupt data. Isolation levels define how much transactions see each other's work. MVCC lets readers and writers coexist without blocking. For interview-critical scenarios like "prevent double-booking," know SELECT FOR UPDATE and optimistic vs pessimistic locking.

The Concert Ticket Problem

Two users try to buy the last concert ticket at the exact same moment. Both check the database: "Is there a ticket available?" Both see "yes." Both proceed to buy. Now you've sold one ticket to two people.

This isn't a theoretical concern. It's the most common interview question about concurrency: "How do you prevent double-booking?" And the answer depends on understanding isolation levels and locking.

Why Concurrency Is Hard

When transactions run one at a time (serially), everything is simple. Transaction A finishes, then Transaction B starts. No conflicts possible.

But serial execution is slow. Modern databases run transactions concurrently — overlapping in time — to maximize throughput. The database's job is to make this concurrent execution appear as if transactions ran serially. How well it achieves this is defined by the isolation level.

The Four Isolation Levels

SQL defines four isolation levels, from weakest to strongest. Each prevents more types of concurrency bugs, but costs more performance.

Read Uncommitted (Weakest — Rarely Used)

A transaction can read another transaction's uncommitted writes (dirty reads). Almost never used because it's unsafe — if the other transaction rolls back, you read data that never existed.

Read Committed (The Practical Default)

A transaction only sees data that has been committed. No dirty reads.

Most databases default to this level (PostgreSQL, Oracle, SQL Server). It prevents dirty reads and dirty writes, but not all race conditions.

What it doesn't prevent:

Time     Transaction A              Transaction B
─────────────────────────────────────────────────
T1       SELECT balance → $1000
T2                                  SELECT balance → $1000
T3       UPDATE balance = $900
T4       COMMIT
T5                                  UPDATE balance = $900  ← lost A's update!
T6                                  COMMIT

Both transactions read $1000, both subtract $100, both write $900. The result should be $800 but it's $900. Transaction A's update was lost. This is the lost update problem.

Repeatable Read / Snapshot Isolation

A transaction sees a consistent snapshot of the database from the moment it started. Even if other transactions commit changes, this transaction keeps seeing the old data.

PostgreSQL's "Repeatable Read" is actually snapshot isolation (implemented with MVCC — see below).

This prevents lost updates in most cases, and prevents read skew (seeing inconsistent data across multiple reads within one transaction).

Serializable (Strongest)

Transactions behave as if they executed one at a time. No concurrency anomalies possible. The database achieves this through actual serial execution, two-phase locking, or serializable snapshot isolation (SSI).

The cost: Significantly slower. Transactions may be aborted and retried if the database detects conflicts.

Comparison

Level	Dirty Reads	Lost Updates	Read Skew	Write Skew	Performance
Read Uncommitted	Possible	Possible	Possible	Possible	Fastest
Read Committed	Prevented	Possible	Possible	Possible	Fast
Repeatable Read	Prevented	Mostly prevented	Prevented	Possible	Medium
Serializable	Prevented	Prevented	Prevented	Prevented	Slowest

MVCC — Multi-Version Concurrency Control

MVCC is the mechanism that makes snapshot isolation possible. Instead of locking rows when reading, the database keeps multiple versions of each row.

Row versions for user_id = 42:
┌──────────┬──────────┬────────────┬────────────┐
│ version  │ balance  │ created_by │ visible_to │
├──────────┼──────────┼────────────┼────────────┤
│    1     │  $1000   │  txn_100   │ txn < 200  │
│    2     │   $900   │  txn_150   │ txn < 300  │
│    3     │   $800   │  txn_250   │ txn ≥ 250  │  ← current
└──────────┴──────────┴────────────┴────────────┘

When Transaction 200 reads this row, it sees version 2 ($900) — the latest version visible to it. Transaction 300 sees version 3 ($800). Neither blocks the other. Readers never block writers, writers never block readers.

This is why PostgreSQL can handle thousands of concurrent reads without locking. Old versions are cleaned up by the VACUUM process after all transactions that might need them have finished.

Key insight from DDIA: "Snapshot isolation is a boon for long-running, read-only queries such as backups and analytics. It is very hard to reason about data if it keeps changing while a query is running."

Solving the Concert Ticket Problem

Back to our double-booking scenario. There are two approaches:

Pessimistic vs Optimistic Locking

Pessimistic (SELECT FOR UPDATE)Optimistic (Version Check)

Lock the row before reading it. No other transaction can read or modify it until you're done.

BEGIN;

-- Lock the ticket row — other transactions wait here
SELECT * FROM tickets
WHERE event_id = 42 AND status = 'available'
LIMIT 1
FOR UPDATE;

-- If we got a row, book it
UPDATE tickets SET status = 'sold', buyer_id = 123
WHERE id = <selected_ticket_id>;

COMMIT;

FOR UPDATE acquires an exclusive lock on the selected rows. If Transaction B tries to SELECT FOR UPDATE the same row, it blocks until Transaction A commits or rolls back.

Pros: Simple, correct, impossible to double-book. Cons: Transactions queue up behind each other (reduced throughput). Risk of deadlocks if multiple rows are locked in different orders.

Don't lock anything. Let both transactions proceed. At commit time, check if the data changed since you read it.

-- Transaction A reads
SELECT * FROM tickets WHERE id = 1;  -- status = 'available', version = 5

-- Transaction A updates (only if version hasn't changed)
UPDATE tickets
SET status = 'sold', buyer_id = 123, version = 6
WHERE id = 1 AND version = 5;
-- Returns "1 row updated" → success

-- Transaction B tries the same
UPDATE tickets
SET status = 'sold', buyer_id = 456, version = 6
WHERE id = 1 AND version = 5;
-- Returns "0 rows updated" → version changed, retry!

The WHERE version = 5 is the optimistic lock. If someone else changed the row (incrementing the version), your update matches zero rows — you know to retry.

Pros: No blocking. Higher throughput under low contention. Cons: Under high contention, many transactions fail and retry (wasted work). Application must handle retries.

When to Use Which

Scenario	Approach	Why
Last ticket, high contention	Pessimistic (SELECT FOR UPDATE)	Can't afford retries — every failure is a bad user experience
Editing a document, rare conflicts	Optimistic (version check)	Conflicts are rare, so retries are cheap
Financial transfer	Pessimistic or Serializable isolation	Correctness is non-negotiable
Shopping cart updates	Optimistic	Low contention, easy to retry

Write Skew — The Subtle Race Condition

Write skew occurs when two transactions read the same data, make decisions based on it, and write to different rows — but the combined result violates a constraint.

Example: A hospital requires at least one doctor on call. Two doctors are on call. Both check "is there more than one doctor on call?" — both see "yes" — both remove themselves. Now zero doctors are on call.

Time     Doctor A                    Doctor B
────────────────────────────────────────────────────
T1       SELECT COUNT(*) → 2
T2                                   SELECT COUNT(*) → 2
T3       UPDATE SET on_call=false
T4                                   UPDATE SET on_call=false
         Result: 0 doctors on call!

Neither transaction wrote to the same row, so row-level locks don't help. The constraint spans multiple rows.

Solutions: - Serializable isolation: The database detects the conflict and aborts one transaction - Materialized conflict: Add a lock row that both transactions must acquire (SELECT ... FOR UPDATE on a shared "on_call_lock" row) - Application-level check: Verify the constraint after the write and roll back if violated

Interview Tip

The concert ticket problem is the most common concurrency question in system design interviews. Explain both approaches: "For high-contention scenarios like last-ticket booking, I'd use pessimistic locking with SELECT FOR UPDATE. For low-contention scenarios like profile updates, I'd use optimistic locking with a version column to avoid blocking." Then mention that distributed systems need even stronger mechanisms like distributed locks or Saga patterns.

Quick Recap

Concept	What It Does	When to Use
Read Committed	No dirty reads/writes	Default isolation for most apps
Snapshot Isolation (MVCC)	Consistent point-in-time reads	Long-running queries, analytics, backups
Serializable	No concurrency anomalies at all	Financial transactions, critical constraints
SELECT FOR UPDATE	Lock rows before reading	Prevent double-booking, high-contention writes
Optimistic locking	Version check at write time	Low-contention updates, document editing
Write skew	Two transactions violate a cross-row constraint	Requires serializable isolation or materialized conflicts