Design a Ride-Sharing Platform

TL;DR

A ride-sharing platform matches riders with nearby drivers in real-time, tracks ride progress, and handles payments. The hard parts are geospatial indexing for driver discovery (why Uber chose H3 hexagonal grids over geohashes), handling 1.25 million location updates per second from active drivers, batched bipartite matching (why greedy nearest-driver assignment is suboptimal), and managing the ride lifecycle through a state machine backed by durable execution (Temporal/Cadence). Surge pricing is not just economics -- it is a geospatial analytics problem computed per H3 cell. The financial clearing system has parallels to DTCC stock trade settlement, and getting it wrong means drivers do not get paid.

The System

A ride-sharing platform connects riders who need transportation with drivers who provide it. The rider opens the app, enters a destination, sees a price estimate, and requests a ride. The system finds a nearby available driver, sends them the ride request, and if the driver accepts, tracks the ride from pickup to dropoff, then processes payment.

Uber completes approximately 28 million rides per day across 70+ countries. At peak (New Year's Eve, major events), the system handles 5+ million simultaneous rides. Lyft processes 1+ million rides per day. Uber's engineering blog has disclosed that their dispatch system handles 1.25 million location updates per second from active drivers, their matching algorithm evaluates thousands of possible rider-driver pairs simultaneously, and their surge pricing model recalculates dynamically across hundreds of thousands of geographic cells. The financial stakes are enormous: Uber's gross bookings exceed $130 billion annually, and even a 0.1% error in fare calculation means $130 million in mischarges.

Requirements

Functional

Ride request: Rider enters pickup and destination, sees price estimate, and requests a ride
Driver matching: System finds the nearest available driver and offers the ride request
Real-time tracking: Both rider and driver see each other's location on a live map during pickup and ride
Ride lifecycle management: Track ride through states: requested -> matched -> driver_en_route -> arrived -> trip_started -> trip_completed -> payment_processed
Fare calculation: Compute fare based on distance, time, base fare, and surge multiplier
Surge pricing: Dynamically adjust prices based on local supply/demand ratio
Payment processing: Charge rider, pay driver (minus commission), handle tips

Non-Functional

Matching latency: Find a driver within 10 seconds of ride request (p95)
Location update throughput: Handle 1.25M GPS updates/sec from active drivers
Location freshness: Driver location in the matching system is no more than 4 seconds stale
Trip state durability: A ride in progress must not be lost even if servers crash mid-ride
ETA accuracy: Estimated time of arrival accurate within 2 minutes
Availability: 99.99% uptime for the ride request and dispatch path

Back-of-Envelope Math

Active drivers at peak:
  1.25M location updates/sec, each driver sends ~1 update/sec
  -> ~1.25M concurrently active drivers

Ride requests:
  28M rides/day = 324 rides/sec average
  Peak (NYE in NYC): 10x = 3,240 rides/sec

Location update processing:
  Each update: driver_id (8B) + lat (8B) + lng (8B) + timestamp (8B) + 
               status (1B) + heading (4B) + speed (4B) = ~41 bytes
  1.25M * 41 bytes = 51.25 MB/sec ingestion bandwidth

Geospatial index:
  1.25M driver locations to index and query
  Each matching request: find drivers within 5 km radius
  At 324 requests/sec: 324 geospatial queries/sec, each returning 10-50 candidates

Ride state storage:
  Active rides at peak: ~5M (generous sizing bound; trips truly in progress cannot exceed the ~1.25M active drivers)
  Each ride record: ~500 bytes
  Active ride data: 5M * 500B = 2.5 GB (fits in memory/Redis)

Financial transactions:
  28M rides/day * 2 transactions (charge rider + pay driver) = 56M/day
  = 648 transactions/sec
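
The arithmetic above is easy to re-derive. A minimal Python sketch that reproduces the estimates from the stated inputs (the variable names are illustrative; the inputs are the figures quoted in this section):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Ride requests
rides_per_day = 28_000_000
avg_rides_per_sec = rides_per_day / SECONDS_PER_DAY
peak_rides_per_sec = 10 * round(avg_rides_per_sec)  # 10x the rounded average

# Location update payload: driver_id, lat, lng, timestamp, status, heading, speed
update_bytes = 8 + 8 + 8 + 8 + 1 + 4 + 4
updates_per_sec = 1_250_000
ingest_mb_per_sec = updates_per_sec * update_bytes / 1_000_000

# Ride state held in memory
active_rides = 5_000_000
ride_state_gb = active_rides * 500 / 1_000_000_000

# Two financial transactions per ride (charge rider, pay driver)
txns_per_sec = rides_per_day * 2 / SECONDS_PER_DAY

print(f"avg rides/sec:     {avg_rides_per_sec:.0f}")
print(f"peak rides/sec:    {peak_rides_per_sec}")
print(f"bytes per update:  {update_bytes}")
print(f"ingest MB/sec:     {ingest_mb_per_sec}")
print(f"ride state GB:     {ride_state_gb}")
print(f"transactions/sec:  {txns_per_sec:.0f}")
```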

The Naive Design

┌──────────┐     ┌──────────────┐     ┌──────────────┐
│  Rider   │────>│  API Server  │────>│  PostgreSQL  │
│  App     │<────│  (monolith)  │<────│  + PostGIS   │
└──────────┘     └──────────────┘     └──────────────┘
                        │
                  ┌─────v─────┐
                  │  Driver   │
                  │  App      │
                  └───────────┘

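-- Assumes drivers.location is a geography(Point, 4326) column, so distances are in meters.
-- Driver sends location:
UPDATE drivers SET location = ST_SetSRID(ST_MakePoint(lng, lat), 4326)::geography WHERE id = ?;

-- Find nearest driver:
SELECT id, ST_Distance(location, ST_SetSRID(ST_MakePoint(rider_lng, rider_lat), 4326)::geography) AS distance
FROM drivers
WHERE status = 'available'
AND ST_DWithin(location, ST_SetSRID(ST_MakePoint(rider_lng, rider_lat), 4326)::geography, 5000)  -- 5 km, in meters
ORDER BY distance
LIMIT 1;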

PostGIS with a GiST spatial index (an R-tree built on GiST). For a ride-sharing app in one city with 100 drivers, this works perfectly.

Where Does This Break First?

The location updates. At 1.25M updates/sec, each UPDATE writes a new row version and must also update the spatial index, because the indexed column changed. PostgreSQL's GiST index is not designed for 1.25M mutations per second. Even with batching, the write-heavy geospatial workload overwhelms a relational database. You need an in-memory spatial index that handles millions of updates per second.

Where It Breaks

Problem 1: Geospatial index mutations at 1.25M/sec. Relational database spatial indexes (R-tree, GiST) are optimized for reads with occasional writes. At 1.25M writes/sec, index maintenance dominates. You need an in-memory index that supports O(1) location updates.

Problem 2: Nearest-driver matching is suboptimal. Greedy matching (assign the nearest driver to each rider) seems optimal but is not. If Rider A's nearest driver is Driver 1, and Rider B's nearest driver is also Driver 1, greedy assigns Driver 1 to whichever request came first. But the globally optimal assignment might be: A gets Driver 2 (2nd nearest) and B gets Driver 1, minimizing total wait time. This is a bipartite matching problem.

Problem 3: Ride state management across failures. A ride goes through 6-8 state transitions over 20-30 minutes. If the server handling a ride crashes mid-trip, the ride must resume from its last state on another server. This requires durable state -- either a database transaction log or a workflow engine.

Problem 4: Surge pricing requires real-time geospatial analytics. Supply (available drivers) and demand (ride requests) must be computed per geographic cell, updated every 30-60 seconds, and fed into a pricing model. This is a streaming analytics problem layered on top of the geospatial system.

The Real Design

┌──────────────────────────────────────────────────────────────────┐
│                         Client Layer                              │
│  ┌──────────┐                                    ┌──────────────┐│
│  │ Rider App│──── REST/GraphQL ────┐    ┌────────│ Driver App   ││
│  └──────────┘                      │    │        └──────────────┘│
│                                    │    │ WebSocket (location)   │
└────────────────────────────────────┼────┼────────────────────────┘
                                     │    │
                              ┌──────v────v──────┐
                              │   API Gateway    │
                              └──────┬───────────┘
                                     │
          ┌──────────────────────────┼──────────────────────────┐
          │                         │                          │
 ┌────────v──────┐    ┌─────────────v──────┐    ┌──────────────v──┐
 │  Ride Service │    │  Location Service  │    │  Pricing Service│
 │  (state       │    │  (in-memory        │    │  (surge calc)   │
 │   machine)    │    │   geospatial index)│    │                 │
 └────────┬──────┘    └─────────────┬──────┘    └─────────────────┘
          │                         │
 ┌────────v──────┐    ┌─────────────v──────┐
 │  Matching     │    │  H3 Grid Index     │
 │  Service      │    │  (driver locations)│
 │  (bipartite)  │    │                    │
 └───────────────┘    └────────────────────┘
          │
 ┌────────v──────────────────────────────────────────┐
 │              Payment / Settlement Service          │
 │  (Temporal workflow for ride lifecycle)            │
 └───────────────────────────────────────────────────┘

H3 Hexagonal Grid (Why Hexagons)

Uber developed H3, a hexagonal hierarchical geospatial indexing system, and open-sourced it. Here is why hexagons beat squares (geohash) and triangles.

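Problem with geohash squares: Geohashes divide the world into rectangular cells. But the distance from the center of a square to its boundary varies -- the corner is 41% farther than the midpoint of a side. This means "all drivers in this cell" returns drivers at very different distances. Worse, a square cell has two kinds of neighbors: four that share an edge and four that touch only at a corner, whose centers are 41% farther away, so a ring of neighboring cells is not a uniform distance band. Geohash cells do nest cleanly by prefix, but two adjacent cells can have very different prefixes, so prefix-based proximity queries can miss a driver just across a cell boundary.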

Why hexagons are better:

Square cell:
  Distance to center from edge midpoint: d
  Distance to center from corner: d * sqrt(2) = 1.41d
  Variance: 41%

Hexagonal cell:
  Distance to center from any edge midpoint: d  
  Distance to center from any vertex: d * 2/sqrt(3) = 1.15d
  Variance: 15%

Hexagons have the most uniform distance from center to boundary of any regular polygon that tiles a plane. All six neighbors of a hexagon share an edge and sit at the same center-to-center distance, so "all drivers in this H3 cell and its ring of neighbors" approximates a circular search area far better than a block of squares does.
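
The 41% and 15% figures both come from one formula: for a regular polygon with n sides, the vertex distance divided by the edge-midpoint distance is 1 / cos(pi / n). A small Python check of the three regular polygons that tile the plane (the function name is illustrative):

```python
import math

def vertex_to_edge_ratio(sides: int) -> float:
    """Circumradius divided by inradius for a regular polygon with `sides` sides."""
    return 1.0 / math.cos(math.pi / sides)

for name, sides in [("triangle", 3), ("square", 4), ("hexagon", 6)]:
    ratio = vertex_to_edge_ratio(sides)
    extra_pct = 100 * (ratio - 1)
    print(f"{name:8s} vertex is {extra_pct:.0f}% farther from the center than an edge midpoint")
```

The triangle is the worst case (its vertices are twice as far as its edge midpoints), which is why the comparison in practice is between squares and hexagons.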

H3 resolution levels:

Resolution	Average edge (km)	Average area (km^2)	Use case
7	1.22	5.16	Surge pricing zones
8	0.46	0.74	Driver supply aggregation
9	0.17	0.11	Precise driver matching

Driver location indexing with H3:

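from collections import defaultdict

import h3  # h3-py v4 API (latlng_to_cell, grid_disk)

RESOLUTION = 9
EDGE_KM = 0.17  # average res-9 edge length, from the table above

cell_index = defaultdict(set)  # H3 cell -> set of driver_ids in that cell
driver_cells = {}              # driver_id -> H3 cell the driver is currently in

def update_driver_location(driver_id, lat, lng):
    # Convert GPS to H3 cell at resolution 9
    cell = h3.latlng_to_cell(lat, lng, RESOLUTION)

    # Remove from previous cell, add to new cell
    old_cell = driver_cells.get(driver_id)
    if old_cell is not None and old_cell != cell:
        cell_index[old_cell].discard(driver_id)
        if not cell_index[old_cell]:
            del cell_index[old_cell]  # drop empty cells so the index does not grow unbounded
    cell_index[cell].add(driver_id)
    driver_cells[driver_id] = cell

def find_nearby_drivers(lat, lng, radius_km=5):
    center_cell = h3.latlng_to_cell(lat, lng, RESOLUTION)
    # Get all cells within k rings of the center (k-ring). Dividing by the
    # edge length is conservative: rings are spaced more than one edge apart,
    # so this returns a superset that callers should filter by true distance.
    k = int(radius_km / EDGE_KM)  # 29 rings for 5 km at res 9
    nearby_cells = h3.grid_disk(center_cell, k)

    drivers = []
    for cell in nearby_cells:
        # .get() avoids inserting empty sets into the defaultdict
        drivers.extend(cell_index.get(cell, ()))
    return drivers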

Location update: O(1) -- just move the driver from one cell's set to another. Nearby driver query: O(K) where K is the number of cells in the search disk. At resolution 9 with k = 29 rings, K is roughly 2,600 cells (a disk of k rings holds 3k(k+1) + 1 cells, which is 2,611 for k = 29). Each cell lookup is O(1) in a hash map. Total: about 2,600 hash lookups, returning 10-100 drivers. In memory this takes well under a millisecond.
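
A cell-based lookup returns every driver in the covered cells, which is a superset of the true radius, so a final distance filter is usually applied to the candidates. A minimal sketch using only the standard library; the function names and the dict-of-coordinates input are assumptions for illustration, not part of H3:

```python
import math

EARTH_RADIUS_KM = 6371.0

def cells_in_disk(k: int) -> int:
    """Number of hexagonal cells within k rings of a center cell."""
    return 3 * k * (k + 1) + 1

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two GPS points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def filter_by_radius(rider, candidates, radius_km):
    """Keep candidates within radius_km of the rider, nearest first.

    rider: (lat, lng); candidates: {driver_id: (lat, lng)} from the cell lookup.
    """
    within = []
    for driver_id, (lat, lng) in candidates.items():
        distance = haversine_km(rider[0], rider[1], lat, lng)
        if distance <= radius_km:
            within.append((distance, driver_id))
    return [driver_id for _, driver_id in sorted(within)]

rider = (37.7749, -122.4194)                 # downtown San Francisco
candidates = {
    "driver-near": (37.7750, -122.4195),     # a few meters away
    "driver-far": (37.8044, -122.2712),      # Oakland, well over 5 km away
}
print(cells_in_disk(29), filter_by_radius(rider, candidates, 5.0))
```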

Handling 1.25M Location Updates Per Second

Drivers send GPS updates every 1 second via WebSocket. The system must ingest these and update the in-memory index.

Architecture:
  Driver App -> WebSocket Gateway (sticky by driver_id)
  -> Kafka topic "driver-locations" (partitioned by driver_id)
  -> Location Service (consumes, updates in-memory H3 index)

Why Kafka in the middle? If the Location Service crashes, Kafka retains the updates. On restart, the service replays recent messages to rebuild the in-memory index. Without Kafka, a crash means losing all driver locations until drivers send their next update.

Partitioning: The "driver-locations" topic has 128 partitions. Location Service runs 128 instances, each consuming one partition. Each instance maintains its own in-memory H3 index for its partition's drivers.

Cross-partition queries: "Find nearby drivers" may need drivers from multiple partitions. Two options:

Option A: Hash drivers to partitions by H3 cell (not driver_id). All drivers in a geographic area are on the same partition. Nearby queries hit 1-3 partitions.

Option B: Hash by driver_id. Each location service holds a random subset of drivers. Nearby queries fan out to all partitions and merge results. Latency: 2-5ms (parallel fan-out).

Uber reportedly uses Option A (geographic partitioning) for their DISCO (dispatch optimization) system. The trade-off: geographic partitioning means a driver crossing a partition boundary requires re-registration, but this is rare (happens at city boundaries, not every block).
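
The difference between the two options comes down to which key is hashed to pick a partition. A minimal sketch, assuming a hypothetical coarse cell ID string per driver and using CRC32 only because it is stable across processes (Python's built-in hash() is salted for strings); this illustrates the idea and is not a description of DISCO:

```python
import zlib

NUM_PARTITIONS = 128

def stable_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a hash that is stable across restarts."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# Hypothetical drivers and the coarse geographic cell each one is currently in.
driver_cell = {
    "driver-1": "cell-A",
    "driver-2": "cell-A",
    "driver-3": "cell-B",
}

# Option A: partition by geographic cell, so co-located drivers share a partition.
by_cell = {d: stable_partition(cell) for d, cell in driver_cell.items()}

# Option B: partition by driver_id, so co-located drivers scatter across partitions.
by_driver = {d: stable_partition(d) for d in driver_cell}

print("Option A:", by_cell)
print("Option B:", by_driver)
```

Under Option A a nearby-driver query only needs the partitions that own the cells in its search disk; under Option B it has to ask every partition.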

Batched Bipartite Matching

Instead of greedily assigning each ride request to the nearest driver, batch requests over a short window (2-3 seconds) and solve a bipartite matching problem.

At time T, the matching system has:
  Riders waiting: [R1, R2, R3]
  Available drivers: [D1, D2, D3]

Cost matrix (estimated pickup time in minutes):
       D1    D2    D3
  R1   2     3    10
  R2  10    10     2
  R3   3    10    10

Greedy (assign each rider to their nearest available driver, in order):
  R1 -> D1 (2 min)  -- D1 is closest to R1
  R2 -> D3 (2 min)  -- D1 taken, D3 is closest remaining for R2
  R3 -> D2 (10 min) -- D1, D3 taken, only D2 left for R3
  Total: 14 min

  R3 is stranded with a 10-minute wait because R1 "stole" D1 -- the 
  only driver that was reasonably close to R3.

Hungarian algorithm (optimal):
  R1 -> D2 (3 min), R2 -> D3 (2 min), R3 -> D1 (3 min)
  Total: 8 min

  By giving R1 its second-best driver (D2, 3 min instead of 2 min),
  R3 gets D1 (3 min instead of 10 min). The global total drops from 
  14 min to 8 min -- a 43% reduction in total wait time, and no rider 
  waits more than 3 minutes.

The bipartite matching optimizes globally -- it minimizes total wait time across all riders, not each rider's individual wait time. It also avoids the worst individual outcomes, such as the stranded R3 above. That matters because a rider left with a very long wait or no driver at all is the most expensive failure mode: that rider churns.
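
The 3x3 example is small enough to check by brute force. The sketch below compares greedy assignment against the true optimum found by trying every permutation; it is a verification aid, not the Hungarian algorithm itself, and the function names are illustrative:

```python
from itertools import permutations

# Estimated pickup time in minutes; rows are riders R1..R3, columns are drivers D1..D3.
COST = [
    [2, 3, 10],    # R1
    [10, 10, 2],   # R2
    [3, 10, 10],   # R3
]

def total(cost, assignment):
    """Sum of pickup times when rider r is assigned driver assignment[r]."""
    return sum(cost[r][d] for r, d in enumerate(assignment))

def greedy(cost):
    """Each rider, in arrival order, takes the nearest driver still available."""
    taken, assignment = set(), []
    for row in cost:
        free = [d for d in range(len(row)) if d not in taken]
        driver = min(free, key=lambda d: row[d])
        taken.add(driver)
        assignment.append(driver)
    return assignment

def optimal(cost):
    """Exhaustive search over all assignments; only viable for tiny batches."""
    n = len(cost)
    return list(min(permutations(range(n)), key=lambda p: total(cost, p)))

g, o = greedy(COST), optimal(COST)
print("greedy :", g, total(COST, g))   # [0, 2, 1] 14
print("optimal:", o, total(COST, o))   # [1, 2, 0] 8
```

Brute force is O(N!), so it only serves to confirm the example; a production batch needs a polynomial-time assignment solver.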

In practice: Uber reportedly uses a variant of the Hungarian algorithm (O(N^3)) for small batches (< 100 riders). For larger batches, approximate algorithms (such as the auction algorithm, which is O(N^2 * log(N))) are fast enough to run in the 2-3 second batch window.

Ride State Machine with Durable Execution

A ride goes through a sequence of states. Each transition has side effects (notifications, payments). If the server crashes mid-ride, the ride must resume from its last state.

State machine:
  REQUESTED -> MATCHED -> DRIVER_EN_ROUTE -> ARRIVED -> 
  TRIP_STARTED -> TRIP_COMPLETED -> PAYMENT_PROCESSED

  REQUESTED -> CANCELLED (rider cancels before match)
  MATCHED -> DRIVER_CANCELLED (driver cancels after match)
  TRIP_STARTED -> TRIP_CANCELLED (emergency cancellation)

Why a state machine matters: Each state transition triggers specific actions:

REQUESTED -> MATCHED:
  - Notify rider (driver found, ETA X minutes)
  - Notify driver (ride details, pickup location)
  - Start tracking driver location

ARRIVED -> TRIP_STARTED:
  - Start fare meter (time + distance)
  - Record trip start location
  - Lock in the surge rate quoted at request time (the fare uses it even if surge changes mid-trip)

TRIP_COMPLETED -> PAYMENT_PROCESSED:
  - Calculate final fare
  - Charge rider
  - Credit driver (minus commission)
  - Send receipt to rider
  - Request rating
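
The transitions listed above can be enforced with a plain lookup table, so that an out-of-order event (for example, a "trip started" message arriving before a match) is rejected instead of corrupting the ride. A minimal sketch; the exception and function names are illustrative:

```python
# Allowed next states for each state, taken from the state machine above.
ALLOWED_TRANSITIONS = {
    "REQUESTED": {"MATCHED", "CANCELLED"},
    "MATCHED": {"DRIVER_EN_ROUTE", "DRIVER_CANCELLED"},
    "DRIVER_EN_ROUTE": {"ARRIVED"},
    "ARRIVED": {"TRIP_STARTED"},
    "TRIP_STARTED": {"TRIP_COMPLETED", "TRIP_CANCELLED"},
    "TRIP_COMPLETED": {"PAYMENT_PROCESSED"},
}

class InvalidTransition(Exception):
    """Raised when an event would move a ride to a state it cannot reach."""

def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not in the table."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {target} is not allowed")
    return target

state = "REQUESTED"
for nxt in ["MATCHED", "DRIVER_EN_ROUTE", "ARRIVED", "TRIP_STARTED"]:
    state = transition(state, nxt)
print(state)
```

Terminal states (CANCELLED, PAYMENT_PROCESSED, and so on) have no entry in the table, so any transition out of them is rejected.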

Durable execution with Temporal/Cadence:

Uber built and open-sourced Cadence for long-running workflows like rides. Temporal is a fork of Cadence started by Cadence's original creators after they left Uber. The example uses Temporal's Python SDK; each ride is one workflow:

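from datetime import timedelta
from temporalio import workflow

# The four activities used below are assumed to be defined elsewhere with @activity.defn.

@workflow.defn
class RideWorkflow:
    def __init__(self):
        # Flags set by signals sent on behalf of the driver app
        self.driver_arrived = False
        self.trip_started = False
        self.trip_completed = False
        self.trip_data = None

    @workflow.signal
    def mark_driver_arrived(self):
        self.driver_arrived = True

    @workflow.signal
    def mark_trip_started(self):
        self.trip_started = True

    @workflow.signal
    def mark_trip_completed(self, trip_data):
        self.trip_data = trip_data
        self.trip_completed = True

    @workflow.run
    async def run(self, ride_request):
        # Each step is durable -- if the worker crashes,
        # Temporal replays from the last completed step
        driver = await workflow.execute_activity(
            find_and_match_driver, ride_request,
            start_to_close_timeout=timedelta(seconds=30))

        # Multiple activity arguments are passed via args=[...]
        await workflow.execute_activity(
            notify_rider_and_driver, args=[ride_request, driver],
            start_to_close_timeout=timedelta(seconds=10))

        # Wait for driver to arrive (may take 10 minutes)
        await workflow.wait_condition(lambda: self.driver_arrived)
        # Wait for trip to start
        await workflow.wait_condition(lambda: self.trip_started)
        # Wait for trip to end (may take 30 minutes)
        await workflow.wait_condition(lambda: self.trip_completed)

        fare = await workflow.execute_activity(
            calculate_fare, args=[ride_request, self.trip_data],
            start_to_close_timeout=timedelta(seconds=10))

        await workflow.execute_activity(
            process_payment, args=[ride_request, fare],
            start_to_close_timeout=timedelta(seconds=60))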

If the Temporal worker crashes after find_and_match_driver completes but before notify_rider_and_driver, Temporal hands the workflow to another worker, which replays the recorded history (the driver match result is already stored, so matching is not re-run) and continues from notify_rider_and_driver. No ride state is lost.

Surge Pricing Per H3 Cell

Surge pricing adjusts the fare multiplier based on local supply/demand ratio.

import h3

MAX_SURGE = 3.0  # cap on the multiplier

def calculate_surge(h3_cell):
    # h3_cell is a resolution-7 cell. Supply counts this cell plus its
    # 6 neighbors, on the assumption that drivers one cell away can serve it.
    # count_available_drivers / count_recent_requests are assumed helpers.
    cells = h3.grid_disk(h3_cell, 1)  # cell + 6 neighbors
    supply = sum(count_available_drivers(c) for c in cells)

    # Count ride requests in last 5 minutes
    demand = count_recent_requests(h3_cell, window_minutes=5)

    if demand == 0:
        return 1.0        # no requests, so no surge even with zero drivers
    if supply == 0:
        return MAX_SURGE  # requests but no drivers, maximum surge

    ratio = demand / supply

    if ratio < 1.0:
        return 1.0        # more supply than demand, no surge
    elif ratio < 2.0:
        return 1.0 + (ratio - 1.0) * 0.5   # 1.0x to 1.5x
    elif ratio < 4.0:
        return 1.5 + (ratio - 2.0) * 0.25  # 1.5x to 2.0x
    else:
        return min(2.0 + (ratio - 4.0) * 0.1, MAX_SURGE)  # caps at 3.0x

Surge is calculated per H3 resolution-7 cell (about 5 km^2) every 30 seconds. With 200,000 active cells globally, that is 200K surge calculations every 30 seconds = 6,667/sec. Each calculation is a simple ratio -- the compute is trivial. The hard part is maintaining accurate supply/demand counts in real-time.
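
One way to keep the per-cell demand count current is a sliding-window counter: record a timestamp per request and evict anything older than the window when the count is read. A minimal single-process sketch, assuming events for a cell arrive in time order; the class and method names are hypothetical, and a production system would more likely do this in a stream processor:

```python
from collections import defaultdict, deque

class SlidingWindowCounter:
    """Counts events per cell over the trailing `window_seconds`."""

    def __init__(self, window_seconds: int = 300):
        self.window = window_seconds
        self.events = defaultdict(deque)  # cell -> timestamps, oldest first

    def record(self, cell: str, ts: float) -> None:
        """Record one ride request in `cell` at time `ts` (epoch seconds)."""
        self.events[cell].append(ts)

    def count(self, cell: str, now: float) -> int:
        """Evict timestamps at or before now - window, then return what is left."""
        timestamps = self.events[cell]
        cutoff = now - self.window
        while timestamps and timestamps[0] <= cutoff:
            timestamps.popleft()
        return len(timestamps)

demand = SlidingWindowCounter(window_seconds=300)
for ts in (0, 100, 290):
    demand.record("cell-A", ts)

count_at_295 = demand.count("cell-A", now=295)  # all three requests are inside the window
count_at_300 = demand.count("cell-A", now=300)  # the request at t=0 has aged out
print(count_at_295, count_at_300)
```

Each request is appended once and evicted once, so the amortized cost per event is O(1) regardless of the window length.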

Deep Dives


Deep Dive 1: ETA Prediction

Estimated time of arrival is one of the most visible numbers in the app. Getting it wrong by 5 minutes frustrates riders.

Naive ETA: Distance / average_speed. For 5 km at 30 km/h = 10 minutes. Wrong 80% of the time because it ignores traffic, traffic signals, and road topology.

Production ETA (what Uber does):

Road graph: Use OpenStreetMap data to build a graph of roads with edge weights = travel time
Real-time traffic: Adjust edge weights based on current GPS data from drivers. If drivers on I-280 are moving at 15 mph instead of 65 mph, update the edge weight
Routing: Run A* or Dijkstra on the weighted graph from driver location to rider pickup
ML correction: Apply a learned correction factor based on historical trips with similar characteristics (time of day, day of week, weather, events)

The ML model trains on millions of historical trips where the actual travel time is known. Features: distance, route complexity, time of day, day of week, weather, live traffic speed, number of turns, number of traffic signals. Output: predicted travel time.
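
The routing step can be illustrated with a tiny road graph whose edge weights are travel times. The sketch below runs Dijkstra twice, once with a free-flowing highway and once with it congested, to show how updating one edge weight changes both the route and the ETA. The graph, speeds, and function names are made up for illustration, and the ML correction step is not modeled:

```python
import heapq

def shortest_time(graph, source, target):
    """Dijkstra over a graph of {node: [(neighbor, travel_seconds), ...]}."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        elapsed, node = heapq.heappop(heap)
        if node == target:
            return elapsed
        if elapsed > best.get(node, float("inf")):
            continue  # stale heap entry; a faster path to this node was already found
        for neighbor, seconds in graph.get(node, []):
            candidate = elapsed + seconds
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return float("inf")

def build_graph(highway_speed_mps: float):
    """A -> B directly by highway (3 km), or A -> C -> B on side streets (2 km + 2 km at 10 m/s)."""
    return {
        "A": [("B", 3000 / highway_speed_mps), ("C", 2000 / 10.0)],
        "C": [("B", 2000 / 10.0)],
        "B": [],
    }

free_flow_eta = shortest_time(build_graph(30.0), "A", "B")  # highway wins: 100 s
congested_eta = shortest_time(build_graph(5.0), "A", "B")   # side streets win: 400 s
print(free_flow_eta, congested_eta)
```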

Industry ETA targets: Within 2 minutes for 80%+ of trips, within 5 minutes for 95%+.

Deep Dive 2: Payment and Financial Settlement

The payment flow has parallels to DTCC (Depository Trust & Clearing Corporation) clearing in stock markets.

Per-ride financial flow:

1. Rider requests ride at $25 (including 1.5x surge)
2. System authorizes (holds) $30 on rider's card (buffer for route changes)
3. Trip completes, actual fare = $27.50
4. System captures $27.50, releases the remaining $2.50 hold
5. Uber commission: $27.50 * 25% = $6.875
6. Driver earnings: $27.50 - $6.875 = $20.625
7. Driver payment batched and settled weekly (or instant cashout for a fee)

Why weekly settlement? Same reason DTCC nets and settles stock trades on a delay (T+1 in the US since May 2024, T+2 before that) rather than trade by trade: batching reduces transaction costs. Paying drivers per ride would mean 28M individual payouts per day at ~$0.30 each = $8.4M/day in payment processing fees. Batching to weekly payouts: 2M drivers * $0.30 = $600K/week = $86K/day. That is a 98x cost reduction.
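
Two details in this flow are worth checking in code. First, $6.875 is not a payable amount, so the split needs an explicit rounding rule. Second, the batching savings can be recomputed from the stated inputs. A minimal sketch that works in integer cents with Decimal; rounding the commission half-up and giving the driver the remainder is an assumed policy for illustration, not a documented Uber rule:

```python
from decimal import Decimal, ROUND_HALF_UP

def split_fare(fare_cents: int, commission_rate: Decimal):
    """Split a fare into (commission_cents, driver_cents) that always sum to the fare."""
    exact = Decimal(fare_cents) * commission_rate
    commission = int(exact.quantize(Decimal("1"), rounding=ROUND_HALF_UP))
    return commission, fare_cents - commission

commission_cents, driver_cents = split_fare(2750, Decimal("0.25"))
print(commission_cents, driver_cents)  # 688 2062

# Payout cost: per-ride payouts versus one weekly payout per driver.
FEE = Decimal("0.30")
per_ride_daily_cost = 28_000_000 * FEE   # one payout per ride, every day
weekly_batch_cost = 2_000_000 * FEE      # one payout per driver, once a week
batched_daily_cost = weekly_batch_cost / 7
savings_ratio = per_ride_daily_cost / batched_daily_cost
print(per_ride_daily_cost, round(batched_daily_cost), round(savings_ratio))
```

Computing the driver share as the remainder, rather than rounding both sides independently, guarantees that no cent is created or lost in the split.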

Dispute handling: If a rider disputes a charge, the system must reconstruct the entire trip: GPS trace, fare calculation inputs (distance, time, surge), and payment authorization. This data must be retained for at least 180 days to cover credit card chargeback windows, which vary by card network and dispute reason.

Deep Dive 3: Safety and Fraud Detection

Driver verification: Background checks, vehicle inspection, real-time identity verification (periodic selfie checks against driver photo).

Trip anomaly detection: If the GPS trace shows the car driving 150 mph, or the trip distance is 10x the straight-line distance, flag for review. If the fare is significantly higher than the estimate (route manipulation), flag for review.

Ride-splitting fraud: A driver starts a ride, drives a short distance, then marks the ride as complete and starts a new ride with the same rider. This generates two minimum fares instead of one longer fare. Detect by checking: same driver + same rider + < 5 min between rides + similar GPS coordinates.
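
The ride-splitting rule above translates directly into a pairwise check on consecutive trips. A minimal sketch; the 5-minute gap comes from the text, while the 0.5 km proximity threshold, the Trip fields, and the function names are assumptions for illustration:

```python
import math
from dataclasses import dataclass

@dataclass
class Trip:
    driver_id: str
    rider_id: str
    start_ts: float   # epoch seconds
    end_ts: float     # epoch seconds
    start: tuple      # (lat, lng) at pickup
    end: tuple        # (lat, lng) at dropoff

def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lng) points."""
    lat1, lng1 = map(math.radians, a)
    lat2, lng2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def is_suspected_split(prev: Trip, nxt: Trip, max_gap_s=300, max_dist_km=0.5) -> bool:
    """Same driver and rider, short gap, and the new pickup is near the last dropoff."""
    same_pair = prev.driver_id == nxt.driver_id and prev.rider_id == nxt.rider_id
    gap = nxt.start_ts - prev.end_ts
    nearby = haversine_km(prev.end, nxt.start) <= max_dist_km
    return same_pair and 0 <= gap < max_gap_s and nearby

first = Trip("d1", "r1", 0, 600, (37.7700, -122.4200), (37.7800, -122.4100))
second = Trip("d1", "r1", 660, 1500, (37.7801, -122.4101), (37.8000, -122.4000))
print(is_suspected_split(first, second))
```

A match is a signal for review rather than proof of fraud: a rider who adds a stop by booking a second ride produces the same pattern.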

Alternative Designs

Alternative 1: Geohash Instead of H3

Use geohash (base-32 encoded latitude/longitude) for spatial indexing. Simpler to implement, widely supported in Redis (GEOADD/GEOSEARCH) and Elasticsearch.

Alternative 2: Redis Geo for Location Index

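Use Redis GEOADD/GEOSEARCH for driver location storage and radius queries. It handles millions of members, and a radius query costs O(N+log(M)), where N is the number of members in the bounding box around the search area and M is the number of members inside the search shape.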

Redis Sorted Set TTL Gotcha

Redis sorted sets (used by GEOADD under the hood) do not support per-member TTL. If a driver goes offline, their location stays in the sorted set forever unless explicitly removed. This is a common interview trap -- candidates propose Redis Geo and then cannot explain how stale driver locations get cleaned up.

Solutions:

(a) Time-bucketed sorted sets. Create keys like drivers:bucket:{minute} (e.g., drivers:bucket:1423). Each location update goes into the current minute's bucket. Set a TTL of 2-3 minutes on the entire key. Queries search the current bucket plus the previous one. Stale data auto-expires when the key TTL fires.

(b) Separate K-V entries per driver with TTL. Store each driver's location as driver:loc:{driver_id} with a 60-second TTL. On each GPS update, refresh the TTL. Query the geo set, then cross-check against the K-V entries -- if the K-V entry is gone, the driver is stale. More memory, but precise per-driver expiry.

(c) External sweep process. Keep a secondary sorted set scored by last_update_timestamp. A background job every 30 seconds removes members whose timestamp is older than 60 seconds from both the geo set and the sweep set.

For an interview, option (a) is the cleanest answer -- it requires no background jobs and leverages Redis's native TTL mechanism.
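
The two pieces of logic that option (a) pushes to the client are choosing which bucket keys to read and de-duplicating drivers that appear in more than one bucket. A minimal sketch with no Redis calls; it assumes buckets are numbered by minutes since the Unix epoch (the text does not say how the minute number is derived), and the function names are hypothetical:

```python
def bucket_keys(now_epoch_s: float, lookback: int = 1):
    """Keys to query, newest first: the current minute's bucket plus `lookback` earlier ones."""
    minute = int(now_epoch_s // 60)
    return [f"drivers:bucket:{minute - i}" for i in range(lookback + 1)]

def merge_newest_first(bucket_results):
    """Merge per-bucket query results, keeping each driver's position from the newest bucket.

    bucket_results: list of {driver_id: (lat, lng)} dicts, ordered newest bucket first.
    """
    merged = {}
    for result in bucket_results:
        for driver_id, position in result.items():
            merged.setdefault(driver_id, position)  # the first (newest) occurrence wins
    return merged

keys = bucket_keys(85_380)  # 85,380 s is exactly minute 1423
current = {"d1": (37.7750, -122.4195)}
previous = {"d1": (37.7700, -122.4200), "d2": (37.7800, -122.4100)}
drivers = merge_newest_first([current, previous])
print(keys, drivers)
```

Reading the previous bucket covers drivers who have not yet reported in the current minute; without it, every driver would briefly vanish at each minute boundary.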

Alternative 3: Event-Sourced Ride State

Instead of a state machine, use event sourcing. Every ride event (requested, matched, arrived, started, completed) is appended to an event log. Current state is derived by replaying events.
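
Deriving state by replay is a fold over the event log: start from an empty state and apply each event in order. A minimal sketch; the event names, payload fields, and RideState shape are hypothetical:

```python
from dataclasses import dataclass, replace
from functools import reduce

# Hypothetical event types and the ride status each one implies.
STATUS_FOR_EVENT = {
    "ride_requested": "REQUESTED",
    "driver_matched": "MATCHED",
    "driver_arrived": "ARRIVED",
    "trip_started": "TRIP_STARTED",
    "trip_completed": "TRIP_COMPLETED",
}

@dataclass(frozen=True)
class RideState:
    status: str = "NONE"
    driver_id: str = ""
    fare_cents: int = 0

def apply(state: RideState, event: dict) -> RideState:
    """Return a new state with the event applied; the log itself is never mutated."""
    changes = {"status": STATUS_FOR_EVENT[event["type"]]}
    for field in ("driver_id", "fare_cents"):
        if field in event:
            changes[field] = event[field]
    return replace(state, **changes)

def replay(events) -> RideState:
    """Current state is the left fold of apply over the full event log."""
    return reduce(apply, events, RideState())

log = [
    {"type": "ride_requested"},
    {"type": "driver_matched", "driver_id": "d42"},
    {"type": "driver_arrived"},
    {"type": "trip_started"},
    {"type": "trip_completed", "fare_cents": 2750},
]
print(replay(log))
print(replay(log[:2]))  # state as of any earlier point comes from replaying a prefix
```

Replaying a prefix of the log is what makes dispute reconstruction cheap: the state at any moment in the ride can be rebuilt without separate audit tables.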

Aspect	H3 + Custom Index	Geohash + Redis	PostGIS	Elasticsearch
Update throughput	1M+/sec (in-memory)	100K/sec (Redis)	10K/sec (disk)	50K/sec (Lucene)
Query latency (5km)	<1ms	1-5ms	10-50ms	5-20ms
Distance uniformity	15% variance (hex)	41% variance (square)	Exact (PostGIS)	41% variance (geohash)
Operational complexity	High (custom)	Low (managed Redis)	Low (PostgreSQL)	Medium (ES cluster)
Multi-resolution	16 built-in levels	Variable prefix length	Manual	Multi-field
Surge pricing support	Native (H3 cells)	Custom aggregation	GROUP BY (slow)	Aggregation queries

H3 for Uber-scale systems where uniform distance and multi-resolution are critical. Redis Geo for smaller deployments (< 100K drivers) where simplicity matters. PostGIS for prototyping. Elasticsearch for systems that also need full-text search on ride data.

Scaling Math Verification

Location updates (1.25M/sec):

Kafka: 128 partitions, 9,766 msgs/partition/sec. Well within Kafka's capacity.
Location service: 128 instances, each handling ~9,766 updates/sec. Each update: hash to H3 cell + update hash map = ~1 microsecond. 1% CPU utilization per instance.
Memory per instance: ~10K drivers * 100 bytes = 1 MB per instance. Negligible.

Driver matching (324 rides/sec):

Each match: search 2,700 H3 cells, find 10-50 candidates, compute pickup ETA for top 10.
ETA computation: A* on road graph, ~10ms per candidate. 10 candidates * 10ms = 100ms.
With batching (2-second windows): 648 riders/batch, 50 candidate drivers each.
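Bipartite matching (Hungarian): 648 riders * 50 candidates = 32,400 candidate rider-driver pairs (edges, not nodes). If no two riders share a candidate, that is up to ~32K distinct drivers, and a dense O(N^3) solve is too slow for N=32K.
Use auction algorithm: O(N^2 * log(N)) = feasible in ~500ms for N=32K. A global batch also splits into independent per-city problems, since a rider never competes for drivers in another city, so each individual solve is far smaller.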

Ride state storage:

5M active rides * 500 bytes = 2.5 GB. Fits in Redis or Temporal's backing store.
Temporal workflows: 5M concurrent workflows. Temporal is designed for this scale (Uber runs 100M+ workflow executions per day on Cadence).

Failure Analysis

Component	Current capacity	At 10x (12.5M updates/sec)	Breaks?	Fix
Kafka (128 partitions)	9.7K msgs/partition/sec	97K/partition	No	Kafka handles 100K+ msgs/partition
Location service (128)	9.7K updates/instance/sec	97K/instance	No	Still microseconds per update
Matching service	324 matches/sec	3,240 matches/sec	Maybe	More matching instances, larger batch windows
Surge calculation	200K cells / 30 sec	2M cells / 30 sec	No	Parallelize across workers
Temporal (rides)	5M concurrent workflows	50M concurrent	Yes	Shard Temporal across multiple clusters
Payment service	648 txns/sec	6,480 txns/sec	No	Single payment processor handles this
Road graph (ETA)	10ms per query	Same	No	Pre-compute ETAs during batch matching

The first bottleneck at 10x is Temporal's concurrent workflow capacity. 50M concurrent rides requires a massive Temporal deployment, and a cluster's history shard count is fixed when the cluster is created, so a single cluster cannot simply be resized to absorb 10x. The fix is geographic sharding: each region (US-West, US-East, EU, etc.) runs its own Temporal cluster, and rides are routed to the cluster for their region.

What's Expected at Each Level

Aspect	Mid-Level	Senior	Staff+
Geospatial indexing	"Use PostGIS" or "use geohash"	In-memory spatial index, explains query pattern	H3 hexagonal grid, why hexagons > squares, resolution levels
Driver matching	Nearest available driver	Discusses why greedy is suboptimal	Batched bipartite matching, auction algorithm, batch window size
Location updates	"Update database"	Kafka + in-memory index	1.25M/sec throughput math, geographic partitioning
Ride lifecycle	REST API calls	State machine with database-backed state	Durable execution (Temporal/Cadence), crash recovery
Surge pricing	"Increase price when busy"	Supply/demand ratio per area	Per-H3-cell calculation, update frequency, pricing curve design
Payment	"Charge the rider"	Auth-capture flow, commission split	DTCC analogy, weekly batched settlement, chargeback handling
ETA	Distance / speed	Road graph routing (A*)	Real-time traffic from driver GPS, ML correction, accuracy SLOs
Real-world reference	"Like Uber"	Mentions H3 or geospatial indexing	Uber's DISCO, Cadence/Temporal origin, 1.25M updates/sec

The single most important signal at any level: do you understand that a ride-sharing platform is fundamentally a real-time geospatial system, not a CRUD app? The driver location index is the heart of the system, and every major design decision (matching algorithm, surge pricing, ETA) depends on efficient geospatial queries over millions of moving points.
