Building a Market Order Book
Level: L3-L5 Topics: Sorted Data Structures, Heaps, Balanced BSTs, System Design
Problem Statement
You are building a marketplace for a single product "ABC". Sellers enter the marketplace by posting an offer with a price (in the range $0.01 to $100.00) and a timestamp (when they entered).
Design an order book that supports two operations:
insert_order(seller_id, price, timestamp)-- Add a new sell order to the book.get_lowest_seller_id()-- Return theseller_idof the cheapest available order. If multiple sellers have the same price, return the one who entered earliest (lowest timestamp).
Background & Constraints
- Prices are in the range $0.01 to $100.00, with two decimal places (i.e., prices are in cents from 1 to 10000).
- Timestamps are unique across all sellers.
- The order book may contain thousands to millions of active orders.
- Both operations should be as fast as possible.
Examples
insert_order("seller_A", 5.00, 100)
insert_order("seller_B", 3.50, 200)
insert_order("seller_C", 3.50, 150)
get_lowest_seller_id() → "seller_C"
(seller_C and seller_B both have price 3.50, but seller_C has earlier timestamp 150)
insert_order("seller_D", 3.50, 250)
get_lowest_seller_id() → "seller_C"
(still seller_C -- earliest at the lowest price)
Hints & Common Pitfalls
Approach 1: Brute Force -- Unsorted List
insert_order: O(1) -- just append.get_lowest_seller_id: O(N) -- scan all orders.- Simple but slow for frequent queries.
Approach 2: Min-Heap
insert_order: O(log N) -- heap insertion with key(price, timestamp).get_lowest_seller_id: O(1) -- peek at the top of the heap.- Good general-purpose approach.
Approach 3: Sorted Map / Balanced BST
- Use a tree map with key
(price, timestamp). insert_order: O(log N).get_lowest_seller_id: O(1) -- look at the minimum key.- Also supports efficient deletion (needed for follow-ups).
Approach 4: Bucket Array (Exploiting Price Range)
- Since prices have only 10,000 possible values (cents from 1 to 10000), create an array of 10,000 buckets. Each bucket holds a queue of seller IDs sorted by timestamp.
insert_order: O(1) -- append to the bucket for this price.get_lowest_seller_id: O(1) amortized -- maintain a pointer to the lowest non-empty bucket.- This is the optimal approach given the bounded price range.
Follow-Up Questions
-
Remove impatient sellers: Add an operation
remove_order(seller_id)for sellers who get tired of waiting and leave. How does each approach handle deletion? Which data structure makes removal efficient? -
Extend to any price and quantity: Remove the \(0.01-\)100.00 constraint. Prices can be any positive decimal. Each order also has a quantity.
get_lowest_seller_id()now also returns the quantity. How does this change your design? -
Matching engine: Add a
buy_order(buyer_id, max_price, quantity)operation that matches the buyer with the cheapest available sell orders up tomax_price, consuming orders from the book. How do you implement this efficiently? -
Concurrency: If multiple threads are inserting and querying simultaneously, how do you ensure thread safety? What locking strategy minimizes contention?
-
Persistence: If the system crashes, how do you recover the order book? What logging or snapshotting strategy would you use?