Consistency Models & CAP

TL;DR

The CAP theorem says that during a network partition, a distributed system must choose between consistency and availability. In practice, you rarely choose one for the whole system -- you pick different consistency levels per feature based on the business requirement.

Three Librarians, One Book

Imagine three librarians working at different desks in a large library. When someone checks out a book at desk A, how quickly do desks B and C learn about it?

Strong consistency: They sync after every single checkout. If you walk to desk B immediately after someone checked out a book at desk A, desk B already knows. Slow but always correct.
Eventual consistency: They sync at the end of the day. For most patrons this is fine -- they browse the catalog, pick a book, and it's on the shelf. But if two people want the same rare book, one of them might walk to the shelf and find it gone despite the catalog saying it was available.

That gap between "always correct" and "correct eventually" is the core tension in distributed systems.

Consistency Spectrum

The CAP Theorem

In 2000, Eric Brewer conjectured (later proven by Gilbert and Lynch) that a distributed system can provide at most two out of three guarantees during a network partition:

Consistency (C): Every read receives the most recent write or an error.
Availability (A): Every request receives a non-error response (no guarantee it's the most recent).
Partition tolerance (P): The system continues operating despite network partitions between nodes.

Here's the catch: partition tolerance isn't optional. Networks fail. Packets get dropped. Switches die. You can't build a distributed system that simply ignores partitions. So the real choice is:

CP -- During a partition, reject requests rather than serve stale data. The system is consistent but not available.
AP -- During a partition, serve whatever data you have. The system is available but might return stale reads.

                 C (Consistency)
                /\
               /  \
              /    \
          CP /      \ CA (only works if
            /        \   no partitions)
           /          \
          /____________\
     P (Partition       A (Availability)
      Tolerance)
            AP

  During a partition, choose CP or AP.
  CA only exists in a single-node system.

The Consistency Spectrum

CAP frames consistency as binary, but real systems live on a spectrum:

Strong Consistency

All reads reflect the most recent write. If you write balance = \$500, every subsequent read from any node returns \$500.

How it works: Typically requires consensus protocols (Raft, Paxos) or synchronous replication. The write isn't acknowledged until a majority of nodes confirm it.

Used by: Banking systems, ticket booking, inventory management. Anywhere double-selling or double-spending is unacceptable.

Cost: Higher latency (must wait for quorum). Lower throughput. Cross-region replication adds 50-200ms per write.

Causal Consistency

Related events appear in the correct order. Unrelated events can be seen in any order.

Example: On a social platform, if Alice posts "I got the job!" and then Bob replies "Congrats!", causal consistency guarantees that no one ever sees Bob's reply without Alice's original post. But two unrelated posts by different users might appear in different orders on different replicas.

How it works: The system tracks causal dependencies between operations (often using vector clocks or logical timestamps).

Used by: Collaboration tools, social media comment threads, chat applications.

Read-Your-Writes Consistency

You always see your own updates immediately. Other users might see stale data for a short window.

Example: You update your profile bio on Instagram. When you refresh, you see the new bio instantly. Your friend across the country might see the old bio for a few seconds until replication catches up.

How it works: Route the user's reads to the same node that handled their writes (session affinity), or track the user's last write timestamp and ensure reads wait for replication to catch up to that point.

Used by: Social media profiles, user settings, shopping carts.

Eventual Consistency

All replicas converge to the same value over time, but there's no guarantee about how long that takes.

Example: You update a DNS record. It might take minutes to hours for all DNS servers worldwide to reflect the change. But eventually they all will.

How it works: Writes propagate asynchronously to all replicas. No synchronization required at write time.

Used by: DNS, CDN caches, social media feeds, product catalogs.

Cost: Cheapest in terms of latency and throughput. You pay with temporary inconsistency.

Strong ←————————————————————————————→ Eventual
  |          |              |              |
  |     Causal      Read-your-writes       |
  |                                        |
Slowest,                            Fastest,
most correct                     temporarily stale

PACELC: The Full Picture

CAP only tells half the story -- it's about what happens during partitions. But what about normal operations when everything is fine?

The PACELC theorem extends CAP:

If there is a Partition, choose Availability or Consistency. Else (no partition), choose Latency or Consistency.

Even when the network is healthy, there's still a trade-off. Synchronous replication gives you consistency but adds latency. Asynchronous replication gives you low latency but risks stale reads.

System	During Partition (PAC)	Normal Operation (ELC)
Google Spanner	PC (consistency)	EC (consistency, accepts latency)
DynamoDB (default)	PA (availability)	EL (low latency)
Cassandra (QUORUM)	PC (consistency)	EC (consistency)
Cassandra (ONE)	PA (availability)	EL (low latency)

ACID vs BASE

These two acronyms represent the two ends of the consistency spectrum applied to database transactions:

	ACID	BASE
Stands for	Atomicity, Consistency, Isolation, Durability	Basically Available, Soft state, Eventually consistent
Consistency	Strong -- transactions are all-or-nothing	Eventual -- replicas converge over time
Availability	May reject writes to preserve consistency	Prioritizes availability, tolerates stale reads
Scale model	Vertical (bigger machine) or careful sharding	Horizontal (add more nodes)
Best for	Financial transactions, inventory, bookings	Social feeds, product catalogs, analytics
Examples	PostgreSQL, MySQL, Spanner	DynamoDB, Cassandra, CouchDB

Mixed Consistency: Design Per Feature, Not Per System

Here's what experienced engineers know: you don't pick one consistency model for your entire system. You pick different models for different features based on their requirements.

Take Ticketmaster:

Seat availability for browsing: Eventual consistency is fine. If the page shows 50 seats available and the true number is 48, nobody cares. The user hasn't committed to anything yet.
Seat booking: Strong consistency is mandatory. Two users cannot book the same seat. This path uses distributed locks or serializable transactions.
Order confirmation emails: Eventual consistency. The email can arrive a few seconds after the booking. No one notices.

Ticketmaster feature map:

  Browse seats ──────→ Eventual (AP, fast, cheap)
  Book a seat ───────→ Strong (CP, slow, correct)
  Process payment ───→ Strong (CP, cannot double-charge)
  Send confirmation ─→ Eventual (AP, async is fine)
  Show order history ─→ Read-your-writes (you see your booking)

Uber does the same thing. Driver location updates are eventually consistent (GPS coordinates are approximate anyway). But the ride assignment -- matching a rider to a driver -- requires strong consistency to prevent double-assignment.

Real-World Database Examples

DynamoDB

Eventual consistency by default. Every read goes to a random replica, which might be slightly behind. You can opt into strongly consistent reads per request (at 2x the cost and higher latency). This lets you mix consistency levels per query.

Google Spanner

Globally distributed with strong consistency using TrueTime (synchronized atomic clocks in every data center). Writes have higher latency (~10-15ms for cross-region) but reads are always consistent. Used by Google Ads, Google Play -- systems where correctness matters more than raw speed.

Cassandra

Tunable consistency per query. You set how many replicas must respond:

ONE -- fastest, least consistent (only one replica confirms)
QUORUM -- majority of replicas must confirm (balances speed and consistency)
ALL -- all replicas must confirm (strongest, slowest)

This tuning happens at the query level, not the cluster level:

-- Fast browse query (eventual)
SELECT * FROM products WHERE id = ? USING CONSISTENCY ONE;

-- Critical booking query (strong)
INSERT INTO reservations (...) USING CONSISTENCY QUORUM;

Interview Tip

When designing a distributed system in an interview, don't just say "I'll use eventual consistency." Map each feature to its consistency requirement and explain why. Saying "Browse uses eventual consistency because a stale catalog is acceptable, but checkout uses strong consistency because we can't double-sell inventory" shows that you understand the trade-offs at a granular level.

Quick Recap

Model	Guarantee	Latency	Use Case
Strong	Reads always reflect latest write	Highest	Banking, booking, inventory
Causal	Related events in correct order	Medium	Comment threads, chat
Read-your-writes	You see your own updates	Medium	User profiles, settings
Eventual	All replicas converge over time	Lowest	DNS, feeds, catalogs

The goal isn't to pick the "best" consistency model. It's to pick the cheapest one that still satisfies the correctness requirements of each feature.

Further reading: For a quick-reference summary, see the standalone CAP Theorem, ACID vs BASE, and Replication Strategies articles.