Design a URL Shortener

A URL shortener is the "hello world" of system design interviews -- simple enough that you can finish in 35 minutes, deep enough that a staff-level candidate can still find interesting things to say. The trick is knowing which parts are actually hard (ID generation, redirect semantics) and which parts interviewers only care about as proof you've built real systems (database choice, caching).

The System

A URL shortener takes a long URL like https://example.com/very/long/path?with=params&and=more and produces something like https://short.ly/a8Kz3p. When anyone visits the short link, they get redirected to the original.

Bit.ly processes billions of link clicks per month across 4+ billion links created since launch. TinyURL handles similar scale. These systems power marketing campaigns, SMS messages (where character counts matter), and analytics dashboards that track how links perform across geographies and devices. The read-to-write ratio is extreme -- a link gets created once but clicked hundreds or thousands of times. That asymmetry shapes every architectural decision.

Requirements

Functional

Shorten: Given a long URL, return a short URL (e.g., short.ly/a8Kz3p)
Redirect: Given a short URL, redirect the user to the original long URL
Custom aliases: Users can optionally specify a custom short code (e.g., short.ly/my-promo)
Expiration: URLs can have an optional TTL after which they stop working

Non-Functional

Read latency: Redirect completes in under 50ms at the 99th percentile (users click short links in time-sensitive contexts like ads)
Write throughput: Handle 1M new URLs per day (peak: 120 writes/sec)
Read throughput: Handle 100M redirects per day at 100:1 read-write ratio (peak: ~12K reads/sec)
Durability: Once created, a URL mapping must not be lost until it expires
Availability: 99.9% uptime -- a broken short link in a marketing campaign costs real money

Back-of-Envelope Math

Write path:
  1M new URLs/day = 1,000,000 / 86,400 = ~12 writes/sec average
  Peak (10x): ~120 writes/sec
  A single Postgres instance handles 10,000+ writes/sec. This is trivial.

Read path:
  100:1 read-write = 100M redirects/day
  100,000,000 / 86,400 = ~1,160 reads/sec average
  Peak (10x): ~11,600 reads/sec

Storage:
  Each record: short_code (7B) + long_url (avg 200B) + created_at (8B) +
               expiration (8B) + user_id (16B) + overhead = ~500 bytes
  Per year: 365M URLs * 500B = ~183 GB/year
  5-year total: ~915 GB
  This fits on a single database server.

ID space:
  7 characters of base62 (a-z, A-Z, 0-9) = 62^7 = 3.5 trillion codes
  At 1M URLs/day: exhaustion in 9.6 million years
  Even at 100M URLs/day: exhaustion in 96,000 years
  7 characters is more than enough.

Cache:
  Top 20% of URLs get 80% of traffic (Zipf distribution)
  20M unique URLs accessed daily * 20% = 4M hot entries
  4M * 500 bytes = ~2 GB
  Fits in a single Redis instance with room to spare.

The Naive Design

┌──────────┐     ┌──────────────┐     ┌──────────────┐
│  Client   │────>│  Web Server  │────>│  PostgreSQL   │
│           │<────│  (monolith)  │<────│              │
└──────────┘     └──────────────┘     └──────────────┘

POST /shorten { url: "https://long-url.com/..." }
  -> Server generates short code
  -> INSERT INTO urls (short_code, long_url) VALUES (...)
  -> Return { short_url: "short.ly/a8Kz3p" }

GET /a8Kz3p
  -> SELECT long_url FROM urls WHERE short_code = 'a8Kz3p'
  -> Return 302 Redirect to long_url

Honestly? This works fine for a small team's internal link shortener. One server, one database, done.

Where Does This Break First?

It is not the write path (12 writes/sec is nothing). It is the read path -- every single redirect hits the database. At 11K reads/sec peak, your database becomes the bottleneck, and your p99 latency climbs because redirect queries compete with everything else.

Where It Breaks

The core tension in a URL shortener is the read-heavy asymmetry. The database can handle writes all day, but reads at 100x the write rate will saturate connection pools and push latency past your 50ms target.

There is also a subtler problem: ID generation. The naive approach -- hash the URL with MD5, take the first 7 characters -- sounds elegant until you run the birthday paradox math. Seven characters of base62 gives you 3.5 trillion possibilities, which sounds like a lot. But the 50% collision probability hits at only ~2.6 million URLs. At 1M URLs/day, you are dealing with collisions within the first few days. Not theoretical. Certain.

And the third thing most candidates miss: redirect semantics. The difference between HTTP 301 and 302 determines whether you even see the clicks that flow through your system.

The Real Design

                                    ┌─────────────────────────┐
                                    │    CDN / Edge Workers    │
                                    │  (Cloudflare Workers)   │
                                    └───────────┬─────────────┘
                                                │ cache miss
                                                v
┌──────────┐     ┌───────────────┐     ┌──────────────────┐
│  Client   │────>│ Load Balancer │────>│   Read Service   │
│           │<────│               │<────│                  │
└──────────┘     └───────────────┘     └────────┬─────────┘
                                                │ cache miss
                                                v
                                       ┌──────────────┐
                                       │  Redis Cache  │
                                       │  (~2 GB)      │
                                       └────────┬──────┘
                                                │ cache miss
                                                v
                                       ┌──────────────┐
                                       │  PostgreSQL   │
                                       │  (primary +   │
                                       │   read replica)│
                                       └──────────────┘

┌──────────┐     ┌───────────────┐     ┌──────────────────┐     ┌──────────┐
│  Client   │────>│ Load Balancer │────>│  Write Service   │────>│  Redis   │
│  (create) │<────│               │<────│                  │     │ Counter  │
└──────────┘     └───────────────┘     └────────┬─────────┘     └──────────┘
                                                │
                                                v
                                       ┌──────────────┐
                                       │  PostgreSQL   │
                                       │  (primary)    │
                                       └──────────────┘

Write Path

Client sends POST /api/shorten with the long URL.
Write service calls INCR short_url_counter on Redis to get the next unique integer.
Service base62-encodes the integer to produce the short code (e.g., 1000000 becomes "4c92").
Service inserts (short_code, long_url, created_at, expires_at) into Postgres with a UNIQUE constraint on short_code.
Service writes the mapping to Redis cache (write-through) so the first click is fast.
Returns the short URL to the client.

Why a counter instead of hashing? Counters never collide. Period. No birthday paradox, no retry logic, no Bloom filters. The counter approach is boring, and boring is exactly what you want for the core of a URL shortener.

Counter batching: Each write server requests 1,000 IDs at once from Redis (INCRBY short_url_counter 1000). It then uses IDs locally without further Redis calls. If a server crashes with 400 unused IDs in its batch, those IDs are simply lost. Acceptable -- you have 3.5 trillion of them.

Read Path (Redirect)

Client visits short.ly/a8Kz3p.
Read service checks Redis cache. Cache hit (~95% of requests): return the long URL.
Cache miss: query Postgres read replica. Populate cache on miss.
Check if the URL has expired. If yes, return 410 Gone.
Return an HTTP redirect to the long URL.

The redirect response includes a Location header with the original URL. The status code matters -- and it is the most debated decision in this entire design.

Custom Aliases

Custom aliases create a collision risk with generated codes. If a user creates the alias "abc123" and your counter later generates the same code, you have a conflict.

The cleanest fix: use separate namespaces. Generated codes live at short.ly/{code} and custom aliases live at short.ly/c/{alias}. No overlap possible, no special lookup logic needed.

Alternatively, reserve a range of the counter space (skip short codes that match custom aliases), but that adds complexity for little gain.

Deep Dives

Url Shortener — Url Shortener High-Level Design

Deep Dive 1: ID Generation -- Counters, Hashes, and the Birthday Paradox

This is the core technical decision in the design, and where most candidates either nail it or fumble.

The hash approach (and why it's worse than it looks)

The intuitive approach: hash the long URL with MD5, base62-encode it, take the first 7 characters.

import hashlib, base62
short_code = base62_encode(int(hashlib.md5(url.encode()).hexdigest(), 16))[:7]

The problem is the birthday paradox. The math is unforgiving:

Characters (base62)	Space size	50% collision at	Days at 1M/day
6 chars	56.8 billion	~267K	< 1 day
7 chars	3.5 trillion	~2.6M	2.6 days
8 chars	218 trillion	~20.9M	21 days
9 chars	13.5 quadrillion	~164M	164 days

With 7 characters, you hit 50% collision probability at roughly 2.6 million URLs. That is not a scale problem -- it is a day-three problem. You must handle collisions, which means retry logic, UNIQUE constraints, and increasing latency as collision probability rises.

The counter approach (what I'd actually use)

A simple Redis INCR gives you a monotonically increasing integer. Base62-encode it:

Counter value 1,000,000 -> base62 -> "4c92"
Counter value 3,521,614,606,208 -> base62 -> "zzzzzzz" (7 chars filled)

Zero collisions by construction. The only downside: sequential codes are predictable. An attacker can enumerate short.ly/4c90, short.ly/4c91, short.ly/4c92 to discover all URLs in sequence.

If predictability matters, apply a Feistel cipher before base62 encoding. A Feistel cipher is a reversible, fixed-width permutation -- it maps every integer in the range [0, N) to a unique other integer in the same range, but in a way that looks random. You get the unpredictability of hashing with the collision-free guarantee of a counter.

counter -> Feistel(counter, secret_key) -> base62 -> short_code

Other approaches worth mentioning

Strategy	Collisions?	Predictable?	Coordination needed?	Best for
Counter + base62	Never	Yes	Redis counter	Most systems
MD5 hash + truncate	Yes (~2.6M)	No	None (stateless)	When you need stateless gen
Pre-generated key pool	Never	No	KGS service	When unpredictability is hard requirement
Snowflake-style IDs	Never	Partially	Time sync only	When you also need ordering

I'd pick the counter approach for 90% of URL shortener interviews. It is the simplest, most reliable, and easiest to reason about. If the interviewer pushes on predictability, mention Feistel ciphers. If they push on single-point-of-failure, mention counter batching and Redis Sentinel.

Redis counter failure recovery: Query the database for SELECT MAX(short_code) FROM urls, decode base62 to get the last known counter value, add a safety margin of 10,000 (to account for in-flight batches), and reinitialize the counter. The UNIQUE constraint on short_code is the ultimate safety net.

Deep Dive 2: 301 vs 302 -- What Bit.ly Actually Does

Most candidates say "use 302 for analytics" and move on. That answer is fine but incomplete. Here is the full picture.

302 Found (Temporary Redirect)

The browser does not cache anything. Every click goes through your server. You see every single redirect, so your analytics are perfectly accurate.

The downside: your server handles 100% of the traffic. No browser-side caching helps you.

301 Moved Permanently

The browser caches the redirect. Subsequent clicks go directly to the destination without hitting your server. Great for reducing load. Terrible for analytics -- you can not count clicks you never see.

301 also transfers SEO "link juice" to the destination URL. Businesses running marketing campaigns care about this: if their Bit.ly link accumulates backlinks, a 301 passes that PageRank to the destination. A 302 does not.

What Bit.ly actually does: 301 + Cache-Control

If you curl -v bit.ly/4ox4Y67, you will see:

HTTP/1.1 301 Moved Permanently
Cache-Control: private, max-age=90
Location: https://destination-url.com/

This is a clever middle ground. The 301 gives SEO benefits. Cache-Control: private prevents CDN and shared proxy caches from caching the redirect (only the user's browser caches it). max-age=90 means the browser cache expires after 90 seconds. So the same user clicking the same link twice within 90 seconds only generates one server-side event, but after 90 seconds, the next click hits Bit.ly's servers again.

My recommendation: Start with 302 in the interview (safer, full analytics control). Then mention that Bit.ly uses the 301 + Cache-Control trick as a production optimization. The interviewer will appreciate that you know the real-world implementation.

Deep Dive 3: Expiration and Cleanup

URLs with a TTL need to stop working after expiration. But how do you actually clean them up?

Layer 1: Read-time check (lazy deletion)

On every redirect, check the expiration timestamp. If expired, return 410 Gone instead of redirecting. Also delete the entry from the cache.

if url_record.expires_at and url_record.expires_at < now():
    cache.delete(short_code)
    return Response(status=410)  # Gone

This catches all expired URLs that get clicked. Zero overhead for non-expired URLs (just a timestamp comparison). But it never cleans up URLs that nobody clicks -- they linger in the database forever.

Layer 2: Background sweep

A periodic job (hourly or daily) runs:

DELETE FROM urls
WHERE expires_at < NOW() - INTERVAL '1 hour'
LIMIT 10000;

Batch deletes in chunks of 10,000 to avoid long-running transactions. Run during off-peak hours. Must use an index on expires_at to avoid full table scans.

Layer 3: Database-native TTL (if using DynamoDB)

DynamoDB can automatically delete expired items if you enable TTL on a column. Items get removed within ~48 hours of expiration. No application code needed. If you choose DynamoDB, this is free cleanup.

For Redis cache entries, set EXPIRE on each key to match the URL's expiration time. The cache cleans itself.

Should you reuse expired short codes? Almost certainly not. If someone bookmarked short.ly/a8Kz3p and you reassign that code to a different URL, the bookmark now goes to the wrong place. The ID space is large enough (3.5 trillion codes for 7 chars) that exhaustion is not a real concern.

Alternative Designs

Alternative 1: Serverless at the Edge

Instead of a traditional server architecture, deploy the entire read path as Cloudflare Workers or AWS Lambda@Edge. The redirect logic runs at 200+ edge locations worldwide, reading from a globally distributed KV store (Cloudflare KV or DynamoDB Global Tables).

Write path stays centralized (API Gateway + Lambda + DynamoDB). Read path is fully distributed.

Alternative 2: Hash-Based with Bloom Filter

If you genuinely need stateless ID generation (multiple regions, no shared counter), use MD5 hashing with a Bloom filter for fast collision pre-checking. The Bloom filter (~1.8 bytes per item at 0.1% false positive rate) sits in front of the database. For 1 billion URLs, the Bloom filter is ~1.67 GB -- fits in memory.

On collision, retry with a salt/nonce. UNIQUE constraint on the database is the final safety net.

Aspect	Counter + Redis	Edge Workers + KV	Hash + Bloom Filter
Collision risk	Zero	Zero (uses counter)	Low (Bloom filter)
Read latency	5-10ms (cache hit)	1-5ms (edge)	5-10ms (cache hit)
Write complexity	Low	Medium (sync to KV)	Medium (retry logic)
Multi-region writes	Needs coordination	Needs coordination	Fully independent
Operational complexity	Low	Low (managed infra)	Medium (Bloom filter)
Monthly cost at scale	~$500-2K	~$200-1K	~$500-2K

I would default to the counter approach for most interviews and mention edge workers as an optimization if the interviewer asks about global latency.

Scaling Math Verification

Let's verify our design handles the target load.

Write path (120 writes/sec peak):

Redis INCR: handles 100K+ operations/sec on a single instance. Our 120/sec is 0.1% of capacity.
Postgres INSERT: a single instance handles 10K+ writes/sec. Our 120/sec is 1.2% of capacity.
Verdict: a single Redis counter and a single Postgres primary handle writes with enormous headroom.

Read path (11,600 reads/sec peak):

Redis cache hit rate: ~95% (Zipf distribution + top-20% caching)
Redis reads: 11,600 * 0.95 = 11,020 reads/sec. Redis handles 100K+ reads/sec. Fine.
Postgres reads (cache misses): 11,600 * 0.05 = 580 reads/sec. A read replica handles this easily.
Verdict: the read path handles 10x our peak with no issues.

Storage (5 years):

915 GB total. A single Postgres instance with a 1 TB SSD handles this. No sharding needed.
If you want headroom, add a second Postgres instance, but it is not required at this scale.

Failure Analysis

What breaks at 10x (10M new URLs/day, 1B redirects/day)?

Component	Current capacity	At 10x	Breaks?	Fix
Redis counter	100K+ ops/sec	1,200 writes/sec	No	--
Redis cache	2 GB, 100K reads/s	20 GB, 116K r/s	No	Larger instance or Redis Cluster
Postgres writes	10K+ writes/sec	1,200 writes/sec	No	--
Postgres reads	5K reads/sec	5,800 reads/sec	Maybe	Add read replica, improve cache hit rate
Storage	1 TB	9.15 TB (5 yr)	Yes	Shard by hash of short_code
Network/LB	Standard	~116K req/sec	Maybe	Multiple LB instances, edge caching

The first thing to break at 10x is storage (9 TB over 5 years). The fix is straightforward: shard Postgres by hash of the short code. Redirect lookups (the dominant operation) hit exactly one shard. Use consistent hashing so adding a shard does not require reshuffling all data.

The second thing to consider: at 1B redirects/day, you probably want edge caching (Cloudflare Workers, Lambda@Edge) to handle the read path closer to users. This drops latency from 10ms (cache hit at your data center) to 1-5ms (edge cache hit).

What breaks at 100x?

At 10B redirects/day (~116K reads/sec sustained), you need a distributed cache (Redis Cluster across regions), edge computing for the read path, and a multi-region database setup. At this point you are building Bit.ly, and you should probably just be running DynamoDB Global Tables with edge workers.

What's Expected at Each Level

Aspect	Mid-Level	Senior	Staff+
Requirements	Shorten + redirect, basic numbers	Read-write ratio, latency targets, storage est.	Full back-of-envelope with ID space math
ID generation	"Use a hash" or counter	Counter with base62, mention collision risk	Compare counter vs hash vs KGS, birthday paradox math, Feistel for obfuscation
Redirect semantics	"Use a redirect"	301 vs 302 trade-off	301 + Cache-Control (what Bit.ly does), SEO implications
Caching	"Add Redis"	Cache-aside pattern, LRU, sizing	Zipf distribution, cache warming, LFU vs LRU
Database	"Use Postgres" or "Use DynamoDB"	Justify choice, discuss UNIQUE constraint	Sharding strategy (hash of short_code), explain why sharding is probably unnecessary at stated scale
Expiration	"Add an expiry field"	Lazy deletion on read	Three-layer cleanup: lazy + sweep + native TTL
Rate limiting	Not mentioned	Mention token bucket per API key	Fixed-window or sliding-window, tie to write protection
Analytics	Not mentioned	Acknowledge as follow-up	Async via Kafka, HyperLogLog for unique visitors
Edge optimization	Not mentioned	CDN for redirect caching	Cloudflare Workers / Lambda@Edge for redirect logic at edge

The single most important thing at any level: show you understand that this is a read-heavy system and that caching is the first optimization, not database scaling.

References from Our Courses

Redis Data Structures and Use Cases — caching short-to-long URL mappings for read-heavy traffic
Redis Persistence and Replication — durability guarantees for cached redirect entries
HyperLogLog and Count-Min — approximate unique visitor counting for analytics

Red Team This Design

Ready to stress-test this architecture? The Attack companion tears apart every decision in this design — from hardware physics to security holes to what actually happens at 10x scale.

Attack: Design a URL Shortener →