Design a Music Streaming Service

TL;DR

Build Spotify -- a system that streams 100 million songs to 600 million MAU with gapless playback, offline sync, and personalized recommendations. The key insight that separates music from video streaming: there is no heavy transcoding pipeline. A song is encoded once (to Ogg Vorbis at 4 quality levels) and that is it. Files are 3-10 MB, not gigabytes. Cache hit rates are dramatically higher (top 1% of songs account for 30% of plays). These differences make the CDN architecture simpler but the client experience harder -- users expect gapless playback (no silence between tracks), crossfade, offline sync with DRM, and shuffle that is "smarter" than random. Spotify's Discover Weekly recommendation pipeline processes 600 million users' listening history through collaborative filtering, NLP on music blogs, and audio feature extraction. The system design challenge is the client-side experience (offline, gapless, smart shuffle) combined with a recommendation system that keeps users engaged.

The System

Spotify. A user opens the app, taps a playlist, and music starts playing within 500 milliseconds. Each song flows into the next without a gap or pop. They download an album for offline listening on their flight. The app suggests a "Discover Weekly" playlist every Monday with 30 songs they have never heard but will probably love. They share a song to their Instagram story.

How is this different from video streaming? Several ways. First, audio files are tiny compared to video. A 3-minute song at 320 kbps is 7 MB. A 3-minute video at 1080p is 300 MB. This means audio streaming has fundamentally different caching economics -- you can cache more songs on fewer edge servers, and the cache hit rate is higher because popular songs are replayed thousands of times per day. Second, there is no per-user adaptive bitrate switching. Users choose a quality level (low/normal/high/very high) in settings, and all songs play at that quality. No ABR algorithm. Third, the client experience is far more complex: gapless playback, crossfade, offline downloads with DRM, and a queue/shuffle system that is surprisingly hard to get right.

Requirements

Functional Requirements

Requirement	Details
Song playback	Stream audio with < 500 ms start time. Support play, pause, skip, seek.
Gapless playback	No silence or audio artifact between consecutive tracks.
Quality selection	User selects quality: Low (24 kbps), Normal (96 kbps), High (160 kbps), Very High (320 kbps).
Offline sync	Download songs/albums/playlists for offline playback with DRM.
Search	Search by song title, artist, album. Full-text with typo tolerance.
Playlists	Create, edit, share playlists. Collaborative playlists (multiple editors).
Recommendations	Discover Weekly (30 songs/week), Daily Mixes, Release Radar.
Social	Share songs to social media. See what friends are listening to.

Non-Functional Requirements

Requirement	Target
Playback start time	< 500 ms (time to first audio frame)
Gapless transition	< 10 ms gap between tracks (imperceptible)
Availability	99.99% for playback. 99.9% for recommendations.
Offline capacity	Up to 10,000 songs per device.
Scale	600M MAU, 220M subscribers, 100M songs, 5B streams/day

Back-of-Envelope Math

Catalog:
  Total songs:               100 million
  Song duration:             3.5 minutes average
  File size (320 kbps OGG):  ~8 MB per song
  File size (96 kbps OGG):   ~2.5 MB per song
  Total catalog storage:     100M * (8.4 + 4.2 + 2.5 + 0.6) MB (4 quality levels) = 1.57 PB

  Compare to YouTube: 4 EB. Spotify's catalog is roughly 2,500x smaller.

Streaming:
  Streams per day:           5 billion
  Streams per second:        ~58,000 avg, ~150,000 peak
  Average stream size:       5 MB (weighted avg across quality levels)
  Bandwidth:                 150K streams/sec * 5 MB = 750 GB/sec peak (~6 Tbps)

  Compare to Netflix: 1 Pbps. Spotify is roughly 170x smaller.

  CDN bandwidth cost (at $0.02/GB):
    5B * 5 MB = 25 PB/day = $500K/day = $15M/month
    (Spotify actually negotiates far below list price with CDN providers)

Cache hit rates:
  Top 1% of songs:           ~30% of plays
  Top 10% of songs:          ~70% of plays
  Long tail (bottom 50%):    ~3% of plays
  Effective CDN cache hit:   ~95% (similar to Netflix but with a much smaller catalog)

User data:
  Listening history per user: ~5,000 streams (last 90 days)
  Stream record:             ~100 bytes (user_id, song_id, timestamp, duration, skip_flag)
  Daily stream records:      5B * 100 bytes = 500 GB/day
  90-day history:            45 TB

The number to focus on: a 100M song catalog at 1.57 PB total is almost comically small by modern standards. A single Google data center has more storage than this. The engineering challenge is not storage -- it is the client experience and recommendations.
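
The estimates above can be re-derived in a few lines. This is a minimal sketch that only re-uses the figures quoted in this section and assumes decimal units (1 PB = 1e9 MB); the variable names are illustrative.

```python
# Back-of-envelope check using the figures quoted above (decimal units).
SONGS = 100_000_000
MB_PER_SONG_ALL_QUALITIES = 8.4 + 4.2 + 2.5 + 0.6  # 320, 160, 96, 24 kbps
STREAMS_PER_DAY = 5_000_000_000
AVG_STREAM_MB = 5
CDN_USD_PER_GB = 0.02
STREAM_RECORD_BYTES = 100

catalog_pb = SONGS * MB_PER_SONG_ALL_QUALITIES / 1e9
avg_streams_per_sec = STREAMS_PER_DAY / 86_400
daily_transfer_pb = STREAMS_PER_DAY * AVG_STREAM_MB / 1e9
daily_cdn_cost_usd = STREAMS_PER_DAY * AVG_STREAM_MB / 1e3 * CDN_USD_PER_GB
history_90d_tb = STREAMS_PER_DAY * STREAM_RECORD_BYTES * 90 / 1e12

print(f"catalog:        {catalog_pb:.2f} PB")
print(f"streams/sec:    {avg_streams_per_sec:,.0f} average")
print(f"daily transfer: {daily_transfer_pb:.0f} PB")
print(f"CDN list cost:  ${daily_cdn_cost_usd:,.0f}/day")
print(f"90-day history: {history_90d_tb:.0f} TB")
```

Keeping the inputs as named constants makes it easy to see which assumption (average stream size, price per GB) dominates each result.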

Naive Design

S3 for song files. PostgreSQL for metadata. HTTP progressive download for playback.

Schema:

CREATE TABLE songs (
    song_id UUID PRIMARY KEY, title VARCHAR, artist_id UUID,
    album_id UUID, duration_seconds INT, genre VARCHAR,
    -- one object path per quality level (kbps)
    s3_path_320 VARCHAR, s3_path_160 VARCHAR,
    s3_path_96 VARCHAR, s3_path_24 VARCHAR
);

CREATE TABLE playlists (
    playlist_id UUID PRIMARY KEY, user_id UUID, name VARCHAR
);

CREATE TABLE playlist_songs (
    playlist_id UUID REFERENCES playlists (playlist_id),
    song_id UUID REFERENCES songs (song_id),
    position INT,
    PRIMARY KEY (playlist_id, position)
);

Playback:

GET /api/stream/{song_id}?quality=320
Server generates presigned S3 URL.
Client downloads the file and plays it.

This works. It is basically what SoundCloud did in 2012. But it does not provide gapless playback, it has no DRM, seeking means downloading everything before the seek point, and every request goes to a single-region origin with no edge caching (S3 is not a CDN).

Where It Breaks

Problem 1: No Gapless Playback

Progressive download plays one file at a time. Between songs, the player stops playback, starts downloading the next file, buffers, and starts playing. Even with pre-fetching, there is a 100-500 ms gap. For albums designed to be listened to continuously (Pink Floyd's "The Dark Side of the Moon," Beyonce's "Lemonade"), this ruins the experience.

Problem 2: Seeking Requires Full Download

If the user wants to skip to 2:30 in a 5-minute song, they must download the first 2:30 of audio before hearing anything. A plain progressive download, with no Range requests and no seek index, does not support random access within a file. This makes seek operations slow and wastes bandwidth.

Problem 3: No DRM for Offline Downloads

If you download an MP3 file to the device, the user can copy it, share it, and the label loses revenue. Record labels require DRM (Digital Rights Management) for offline downloads. Without it, no label will license their catalog to you.

Problem 4: Shuffle Is Not Random

True random shuffle repeats songs too soon. In a 20-song playlist, true random has a 5% chance of playing the same song twice in a row and a ~93% chance of repeating a song within the first 10 plays (birthday problem). Users complain: "Spotify keeps playing the same songs!" Spotify's "smart shuffle" deliberately avoids repeats and spreads artists evenly. This requires a custom shuffle algorithm.
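
The two probabilities quoted above can be checked directly. This is a small sketch that assumes "true random" means each play is drawn uniformly and independently from the playlist; the function name is illustrative.

```python
from math import prod


def repeat_probability(playlist_size: int, plays: int) -> float:
    """P(at least one song repeats) when each play is a uniform, independent draw."""
    if plays > playlist_size:
        return 1.0  # pigeonhole: more plays than songs forces a repeat
    no_repeat = prod((playlist_size - i) / playlist_size for i in range(plays))
    return 1.0 - no_repeat


back_to_back = 1 / 20                   # chance the next play equals the current song
within_ten = repeat_probability(20, 10)  # birthday-problem style calculation

print(f"same song twice in a row: {back_to_back:.0%}")
print(f"repeat within 10 plays:   {within_ten:.1%}")
```

The second figure is the classic birthday problem with 20 "days" and 10 "people", which is why repeats feel far more common than intuition suggests.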

Problem 5: S3 Is Not a CDN

S3 serves from one region. A user in Tokyo fetching from us-east-1 pays roughly 200 ms of latency per request. At peak, 150K stream starts per second means on the order of 150K GET requests/sec against a single region (more if files are fetched in segments). S3 can absorb that request rate when keys are spread across prefixes, but the latency is terrible.

Real Design


Architecture Overview

┌──────────────┐
│  Client App  │ ── playback engine, offline store, DRM
└──────┬───────┘
       │
┌──────┴───────┐
│  CDN Edge    │ ── serves audio segments
│  (Fastly/    │    95% cache hit rate
│   Akamai)    │
└──────┬───────┘
       │ (cache miss)
┌──────┴───────┐
│  Origin      │ ── S3 / GCS
│  Storage     │    1.57 PB catalog
└──────────────┘

┌──────────────┐
│  API Gateway │ ── metadata, playlists, search, auth
└──────┬───────┘
       │
┌──────┴───────┐     ┌──────────────┐     ┌────────────────┐
│  Metadata    │     │  Playlist    │     │ Recommendation │
│  Service     │     │  Service     │     │ Pipeline       │
│  (song info, │     │  (CRUD +     │     │ (collaborative │
│   search)    │     │   collab)    │     │  filtering)    │
└──────────────┘     └──────────────┘     └────────────────┘

Component 1: Audio Encoding and Storage

Unlike video, audio encoding is a one-time batch operation.

Encoding pipeline:

When a label delivers a song (typically WAV or FLAC, lossless, ~50 MB per song):

Transcode to Ogg Vorbis at 4 quality levels:
  Low: 24 kbps (voice quality, ~0.6 MB per song)
  Normal: 96 kbps (acceptable quality, ~2.5 MB)
  High: 160 kbps (good quality, ~4.2 MB)
  Very High: 320 kbps (near-CD quality, ~8.4 MB)
Segment the file into 5-10 second chunks (for seeking support and CDN caching). Each chunk is independently playable.
Encrypt each chunk with AES-128 for DRM.
Store all chunks in object storage (S3), organized as: s3://songs/{song_id}/{quality}/chunk_{nn}.ogg.enc

Why Ogg Vorbis? Spotify uses Ogg Vorbis because it is royalty-free (unlike AAC, and unlike MP3 while its patents were still in force) and has better quality-per-bitrate than MP3 at low bitrates. At 96 kbps, Vorbis sounds noticeably better than MP3. For paid subscribers, 320 kbps Vorbis is practically indistinguishable from lossless.

Encoding compute: Encoding one song to 4 qualities takes ~5 seconds on modern hardware. At 60,000 new songs per day (Spotify's reported rate), that is 60,000 * 5 = 300,000 seconds = 83 CPU-hours. A single 32-core server handles the daily encoding load in ~3 hours.

Component 2: Gapless Playback

Gapless playback means zero silence between consecutive tracks. This is surprisingly difficult.

The problem: Audio codecs (MP3, AAC, Vorbis) add padding to the start and end of encoded files. MP3 adds 1,152 samples of silence at the start (encoder delay) and a variable amount at the end (padding to fill the last frame). If you play two MP3s back-to-back, you hear a 26 ms gap (1,152 samples / 44,100 Hz).

Solution: Client-side gap trimming.

The encoder records the exact number of padding samples in the file metadata (e.g., Vorbis comment: ENCODER_DELAY=1024, ENCODER_PADDING=896).
The client reads these values.
When starting playback of a new track, the client skips the first ENCODER_DELAY samples.
When the current track ends, the client stops ENCODER_PADDING samples before the file ends.
The audio output is seamless.
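
The trimming steps above can be sketched on already-decoded PCM samples. This is an illustrative model, not Spotify's player code: it assumes the decoder hands back a flat list of samples and that the delay and padding counts come from the file metadata as described; the function names are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def trim_padding(samples: Sequence[float], encoder_delay: int, encoder_padding: int) -> List[float]:
    """Drop the encoder's leading delay samples and trailing padding samples."""
    end = len(samples) - encoder_padding
    if encoder_delay < 0 or encoder_padding < 0 or encoder_delay > end:
        raise ValueError("padding metadata exceeds decoded length")
    return list(samples[encoder_delay:end])


def join_gapless(tracks: Iterable[Tuple[Sequence[float], int, int]]) -> List[float]:
    """Concatenate (samples, delay, padding) tracks with no silence in between."""
    output: List[float] = []
    for samples, delay, padding in tracks:
        output.extend(trim_padding(samples, delay, padding))
    return output


# Two tiny "tracks": zeros stand in for codec padding, non-zero values for real audio.
track_a = [0.0, 0.0, 0.1, 0.2, 0.3, 0.0, 0.0, 0.0]
track_b = [0.0, 0.4, 0.5, 0.0, 0.0]
print(join_gapless([(track_a, 2, 3), (track_b, 1, 2)]))
```

A real player does this on streaming buffers rather than whole-track lists, but the sample arithmetic is the same.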

Implementation: The client maintains two audio decoder instances. While track A is playing its last 3 seconds, the client starts decoding track B in the background (pre-decoding). When track A reaches its trimmed end point, the client immediately switches the audio output to track B's trimmed start point. The crossover happens at the sample level -- no gap.

Crossfade: For users who prefer crossfade (one song fades out while the next fades in), the client overlaps the last N seconds of track A with the first N seconds of track B, mixing the audio streams with a volume ramp. This requires both decoders running simultaneously.
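
As a rough illustration of the mixing step, the sketch below blends the tail of track A with the head of track B using a linear volume ramp. The linear ramp is an assumption made for simplicity (equal-power curves are a common alternative), and the function name is hypothetical.

```python
from typing import List, Sequence


def crossfade(tail_a: Sequence[float], head_b: Sequence[float]) -> List[float]:
    """Mix two equally long sample windows: A fades out while B fades in."""
    if len(tail_a) != len(head_b):
        raise ValueError("crossfade windows must have the same length")
    n = len(tail_a)
    mixed: List[float] = []
    for i, (a, b) in enumerate(zip(tail_a, head_b)):
        t = (i + 0.5) / n  # ramp position in (0, 1), sampled at the window midpoints
        mixed.append((1.0 - t) * a + t * b)
    return mixed


# Track A is a constant 1.0 signal fading into silence.
print(crossfade([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]))
```

Because the two gains always sum to 1, a signal that is identical in both tracks passes through at unchanged level.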

Component 3: Offline Sync with DRM

Offline downloads must be encrypted so users cannot pirate the music.

DRM flow:

User taps "Download" on an album.
Client requests a license key from the DRM server: POST /api/drm/license { song_ids: [...], device_id: ... }.
DRM server returns AES-128 keys, one per song, valid for 30 days.
Client downloads encrypted audio chunks from CDN.
Chunks are stored in the app's private storage (not accessible to other apps).
On playback, the client decrypts each chunk in memory using the stored key.
Every 30 days (or when the subscription expires), the keys expire and offline playback stops.

License renewal: When the user goes online, the client silently renews DRM licenses for all downloaded content. If the user does not go online for 30 days, offline playback stops and the app prompts them to connect.
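
The expiry and renewal rules can be modelled with a small data class. This is a sketch under stated assumptions: the `OfflineLicense` type, its fields, and the helper functions are hypothetical, and a real client would also have to guard against a tampered device clock, which is not modelled here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

LICENSE_LIFETIME = timedelta(days=30)  # validity window from the flow above


@dataclass(frozen=True)
class OfflineLicense:
    song_id: str
    device_id: str
    issued_at: datetime

    @property
    def expires_at(self) -> datetime:
        return self.issued_at + LICENSE_LIFETIME


def can_play_offline(lic: OfflineLicense, device_id: str, now: datetime) -> bool:
    """Playback is allowed only on the licensed device and before expiry."""
    return lic.device_id == device_id and now < lic.expires_at


def renew(lic: OfflineLicense, now: datetime) -> OfflineLicense:
    """Model a silent renewal when the device comes online: the clock restarts."""
    return OfflineLicense(lic.song_id, lic.device_id, issued_at=now)


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
lic = OfflineLicense("song-1", "device-a", issued)
print(can_play_offline(lic, "device-a", issued + timedelta(days=29)))
print(can_play_offline(lic, "device-a", issued + timedelta(days=30)))
```

Binding the license to a device identifier is what stops a copied key file from unlocking the same chunks on another phone.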

Anti-piracy measures:

Keys are stored in the platform's secure key storage (iOS Keychain, Android Keystore), not in app storage.
Decrypted audio is never written to disk. It is decrypted in memory and sent directly to the audio output.
The app detects rooted/jailbroken devices and disables offline downloads on them.
Watermarking: each user's downloads are watermarked with an inaudible user-specific signature. If a pirated copy surfaces, the watermark identifies who leaked it.

Component 4: Spotify Smart Shuffle

True random shuffle has a well-documented problem: it does not feel random to humans. Humans expect even distribution (no repeats, no clusters of the same artist), but true randomness produces clumps.

Spotify's shuffle algorithm:

Divide songs by artist. If the playlist has 30 songs from 6 artists (5 songs each), create 6 buckets.
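Space each artist's songs evenly. With 30 slots and 5 songs per artist, each artist's songs land about 30 / 5 = 6 positions apart (for example 0, 6, 12, 18, 24), with a different starting offset per artist.
Randomize within each artist's positions. Within each artist's 5 slots, randomize the order of their songs.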
Add jitter. Shift each song by a random offset of +/- 1 position to avoid perfectly even spacing (which sounds mechanical).

Result: The shuffle ensures that you never hear two songs from the same artist back-to-back (unless the playlist is dominated by one artist). Songs are spread evenly but with enough randomness to feel natural.
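
One way to sketch an artist-aware shuffle is to give every song a fractional position in [0, 1) and sort. This is an illustrative approximation, not Spotify's published code: the `jitter` parameter here is a fraction of the artist's own spacing rather than a fixed +/- 1 slot, and all names are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Song = Tuple[str, str]  # (artist, title)


def artist_spread_shuffle(
    songs: List[Song], jitter: float = 0.1, rng: Optional[random.Random] = None
) -> List[Song]:
    """Spread each artist's songs evenly across the playlist, then interleave."""
    rng = rng or random.Random()
    by_artist: Dict[str, List[Song]] = defaultdict(list)
    for song in songs:
        by_artist[song[0]].append(song)

    placed: List[Tuple[float, Song]] = []
    for artist_songs in by_artist.values():
        rng.shuffle(artist_songs)  # random order within the artist
        n = len(artist_songs)
        offset = rng.random()      # random starting point per artist
        for i, song in enumerate(artist_songs):
            wobble = rng.uniform(-jitter, jitter)  # avoid mechanical spacing
            placed.append(((i + offset + wobble) / n, song))

    placed.sort(key=lambda pair: pair[0])
    return [song for _, song in placed]


playlist = [(f"artist{a}", f"song{a}-{i}") for a in range(6) for i in range(5)]
for artist, title in artist_spread_shuffle(playlist, rng=random.Random(7))[:6]:
    print(artist, title)
```

With `jitter=0.0` and equally sized artists, the result is a fixed artist rotation, so no artist plays twice in a row; raising the jitter trades that guarantee for a less mechanical feel.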

Enhanced Shuffle (2023): Spotify added "Smart Shuffle" which inserts recommended songs between playlist songs. The system analyzes the playlist's audio features (tempo, energy, valence) and inserts songs that fit the vibe. This is a recommendation feature disguised as a shuffle feature.

Component 5: Discover Weekly Recommendation Pipeline

Every Monday, 600 million users receive a personalized playlist of 30 songs they have never heard. This is Spotify's most celebrated feature.

Pipeline stages:

Stage 1: Collaborative Filtering

User A listens to: Song 1, Song 2, Song 3, Song 4
User B listens to: Song 1, Song 2, Song 3, Song 5
User C listens to: Song 2, Song 3, Song 5, Song 6

Collaborative filtering says:
  User A is similar to User B (3 songs in common out of 4).
  User B listens to Song 5. User A does not.
  Recommend Song 5 to User A.

Spotify uses matrix factorization (similar to Netflix Prize recommendations). The user-song interaction matrix is factored into user vectors and song vectors. The dot product of a user vector and a song vector predicts how much the user will like the song. Songs with high predicted scores that the user has not heard are candidates.
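
The toy example above can be reproduced with neighbor-based collaborative filtering. This sketch uses Jaccard similarity between listening histories, which is a simplifying assumption; matrix factorization replaces these explicit overlaps with learned user and song vectors. Function names are illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two listening histories, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user: str, histories: Dict[str, Set[str]], top_n: int = 30) -> List[str]:
    """Rank unheard songs by the summed similarity of the users who played them."""
    heard = histories[user]
    scores: Dict[str, float] = defaultdict(float)
    for other, songs in histories.items():
        if other == user:
            continue
        similarity = jaccard(heard, songs)
        for song in songs - heard:
            scores[song] += similarity
    ranked = sorted(scores, key=lambda s: (-scores[s], s))
    return ranked[:top_n]


histories = {
    "A": {"Song 1", "Song 2", "Song 3", "Song 4"},
    "B": {"Song 1", "Song 2", "Song 3", "Song 5"},
    "C": {"Song 2", "Song 3", "Song 5", "Song 6"},
}
print(recommend("A", histories))
```

Song 5 ranks first for User A because both similar users played it, while Song 6 is supported only by the less similar User C.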

Stage 2: NLP on Music Blogs and Reviews

Spotify crawls thousands of music blogs and reviews. NLP extracts associations: "If you like Radiohead, you'll love Bon Iver." These associations create edges in a knowledge graph that supplements collaborative filtering.

Stage 3: Audio Feature Analysis

For songs with insufficient listening data (new releases, obscure artists), Spotify analyzes the raw audio using a convolutional neural network. The model extracts features: tempo, key, energy, danceability, valence (happiness). Songs with similar audio features are considered similar.

Pipeline execution:

Run frequency:               Weekly (every Sunday night)
Input data:                  90 days of listening history * 600M users = 45 TB
Matrix factorization:        ~10,000 CPU-hours (Spotify uses Google Cloud Dataflow)
Output:                      600M users * 30 songs * 16 bytes = ~290 GB of recommendations
Generation time:             ~8 hours (must complete by Monday 5 AM in each timezone)

Component 6: Audio Streaming Protocol

Spotify does not use HLS or DASH. It uses a custom protocol optimized for audio.

Key differences from video ABR:

No adaptive bitrate per segment. The user selects a quality level in settings. All songs play at that quality. No need for ABR manifests.
Pre-fetching. When the current song is 75% complete, the client starts downloading the first 3 segments of the next song. This enables gapless playback and < 100 ms skip-to-next.
Seek within a song. The client maps the seek time to a chunk and, if needed, a byte range within it: GET /song/123/160/chunk_05.ogg.enc, Range: bytes=0-10240. The CDN serves the range. The client resumes decoding at the nearest Ogg page boundary (audio has no keyframes in the video sense).
Bandwidth estimation. The client measures download speed and warns the user if their connection cannot sustain the selected quality level. It does NOT automatically downgrade quality (unlike video ABR). The user must manually switch to a lower quality if they want uninterrupted playback.
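
The seek and pre-fetch rules above reduce to simple arithmetic on the client. This is a minimal sketch that assumes fixed 10-second chunks and the URL layout shown in the seek example; the function names and the constant are hypothetical.

```python
from typing import Tuple

CHUNK_SECONDS = 10  # assumption: fixed-length chunks (the text allows 5-10 s)


def chunk_for_seek(seek_seconds: float, chunk_seconds: int = CHUNK_SECONDS) -> Tuple[int, float]:
    """Return (chunk index, offset in seconds inside that chunk) for a seek target."""
    if seek_seconds < 0:
        raise ValueError("seek position must be non-negative")
    index = int(seek_seconds // chunk_seconds)
    return index, seek_seconds - index * chunk_seconds


def chunk_url(song_id: str, quality_kbps: int, index: int) -> str:
    """Build the CDN path for one encrypted chunk."""
    return f"/song/{song_id}/{quality_kbps}/chunk_{index:02d}.ogg.enc"


def should_prefetch_next(position_seconds: float, duration_seconds: float, threshold: float = 0.75) -> bool:
    """Start fetching the next song once the current one is 75% complete."""
    return duration_seconds > 0 and position_seconds / duration_seconds >= threshold


index, offset = chunk_for_seek(155)  # seek to 2:35
print(chunk_url("123", 160, index), f"then skip {offset:.0f} s into the chunk")
```

Because the quality level is fixed per user, the client never needs a manifest listing alternative renditions; the chunk index alone identifies what to fetch.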

Why not auto-downgrade? Because audio quality changes are perceptible and jarring. In video, switching from 1080p to 720p is barely noticeable. In audio, switching from 320 kbps to 96 kbps sounds like someone threw a blanket over the speakers. Users prefer a brief buffer to a sudden quality drop.

Deep Dives


Deep Dive 1: CDN Architecture for Audio

Audio CDN has different economics than video CDN.

Cache-friendliness: A 3.5-minute song at 320 kbps = ~8 MB. A CDN edge server with 10 TB of cache can store 1.25 million songs. The top 1 million songs account for ~80% of plays. So a single edge server with 10 TB cache achieves ~80% hit rate. Add a second tier (mid-tier cache) with 100 TB, and you cache the top 10 million songs = 95% hit rate.

Compare to video: Netflix's catalog at 50 GB per title * 17,000 titles = 850 TB. No single edge server caches the entire catalog. Audio's small file sizes fundamentally change the caching architecture.

Spotify's CDN strategy: Spotify uses a combination of Google Cloud CDN, Fastly, and Akamai. They do not operate their own CDN (unlike Netflix's Open Connect). Why? Because the catalog is small enough for third-party CDNs to cache effectively, and Spotify does not generate enough bandwidth to justify building their own CDN hardware.

Pre-positioning: For new major releases (Taylor Swift album drop), Spotify pre-positions the files on all CDN edges before the release time. At midnight, when millions of users press play simultaneously, the files are already cached everywhere. No origin fetch storm.

Deep Dive 2: Collaborative Playlist Consistency

Collaborative playlists (multiple users can add/remove/reorder songs) introduce concurrency problems.

Scenario: The playlist is [1, 2, 3, 4, 5, 6] (positions are 1-indexed). User A adds Song X at position 5. Simultaneously, User B removes the song at position 3. After both operations:

If A's add is applied first: playlist is [1, 2, 3, 4, X, 5, 6]. Then B's remove of position 3 removes song 3 as intended, giving [1, 2, 4, X, 5, 6].
If B's remove is applied first: playlist is [1, 2, 4, 5, 6]. Then A's untransformed add at position 5 gives [1, 2, 4, 5, X, 6], so X lands after song 5 instead of between songs 4 and 5. Both operations are correct in isolation but the combined result depends on order.

Solution: Operational Transformation (OT).

Spotify uses a variant of OT for collaborative playlists (the same approach used in Google Docs). Each operation is transformed based on concurrent operations:

If concurrent: add(pos=5, song=X) and remove(pos=3)
  B's remove shifts A's target position: since remove(3) < add(5), A's effective position becomes 4.
  Result: remove(3) then add(4, X), or equivalently add(5, X) then remove(3).
  Both orderings produce the same result.

Implementation: The playlist service assigns a monotonically increasing version number to each operation. Clients send operations with the version they are based on. The server rebases operations onto the current version (OT transformation) before applying.
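
The add/remove transformation can be sketched for this one pair of operations. This is a minimal illustration assuming 1-indexed positions and only two concurrent operations; a production OT system needs transform rules for every operation pair (add/add, remove/remove, move), which are not shown. All function names are hypothetical.

```python
from typing import List


def apply_add(playlist: List[str], pos: int, song: str) -> List[str]:
    """Insert song so that it becomes the pos-th element (1-indexed)."""
    return playlist[:pos - 1] + [song] + playlist[pos - 1:]


def apply_remove(playlist: List[str], pos: int) -> List[str]:
    """Remove the pos-th element (1-indexed)."""
    return playlist[:pos - 1] + playlist[pos:]


def add_after_remove(add_pos: int, remove_pos: int) -> int:
    """Rebase an add onto a playlist where a concurrent remove already ran."""
    return add_pos - 1 if remove_pos < add_pos else add_pos


def remove_after_add(remove_pos: int, add_pos: int) -> int:
    """Rebase a remove onto a playlist where a concurrent add already ran."""
    return remove_pos + 1 if add_pos <= remove_pos else remove_pos


base = ["1", "2", "3", "4", "5", "6"]
add_pos, remove_pos = 5, 3

add_first = apply_remove(apply_add(base, add_pos, "X"), remove_after_add(remove_pos, add_pos))
remove_first = apply_add(apply_remove(base, remove_pos), add_after_remove(add_pos, remove_pos), "X")
print(add_first, remove_first, add_first == remove_first)
```

Whichever operation the server happens to apply first, rebasing the second one against it yields the same playlist, which is the convergence property OT is meant to provide.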

At Spotify's scale: This is not high-throughput. Even a popular collaborative playlist gets ~10 edits per hour. The OT complexity is justified by correctness, not performance.

Deep Dive 3: Royalty Payment Tracking

Every counted stream generates a royalty payment to the rights holder. Under a pro-rata model, the payment is proportional to the rights holder's share of total streams on the platform.

Spotify's pro-rata model:

Total subscriber revenue:     $1 billion/month (hypothetical)
Total streams:                5B/day * 30 = 150B/month
Revenue per stream:           $1B / 150B = $0.0067 per stream
Artist X's streams:           10M streams/month
Artist X's payment:           10M * $0.0067 = $67,000/month
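
The same calculation as a function, using the hypothetical revenue figure from the example above. The function name is illustrative. Computing the share directly gives about $66,667; the $67,000 above comes from rounding the per-stream rate to $0.0067 first.

```python
def pro_rata_payout(revenue_pool_usd: float, artist_streams: int, total_streams: int) -> float:
    """Pay each rights holder their share of total streams times the revenue pool."""
    if total_streams <= 0:
        raise ValueError("total_streams must be positive")
    return revenue_pool_usd * artist_streams / total_streams


payout = pro_rata_payout(
    revenue_pool_usd=1_000_000_000,   # hypothetical monthly subscriber revenue
    artist_streams=10_000_000,
    total_streams=150_000_000_000,    # 5B/day * 30 days
)
print(f"${payout:,.2f} per month")
```

Because payouts depend on the platform-wide total, an artist's income can fall even when their own stream count stays flat.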

System design impact:

Every stream must be recorded accurately. Undercounting = underpayment (legal liability). Overcounting = overpayment (financial loss).
A "stream" counts only if the user listens for at least 30 seconds. Skip before 30 seconds = no payment.
Streams must be attributed to the correct rights holder (song writer, performer, label, distributor -- often different entities).
Payment reports are generated monthly and must be auditable.

Stream counting pipeline:

Client sends play event: { user_id, song_id, play_duration, timestamp }
Server processes event:
  if play_duration >= 30 seconds:
    Kafka -> stream_counts topic
    Consumer increments: INCR song:{song_id}:monthly_streams
    Consumer logs: append to audit log (immutable)
  else:
    Log but do not count as a stream
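
The counting rule can be sketched without Kafka as an in-memory function. This is an illustration under stated assumptions: the duplicate check keyed on (user_id, song_id, timestamp) stands in for exactly-once processing and is not taken from the source, and the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

MIN_COUNTED_SECONDS = 30  # threshold from the pipeline above


def count_streams(events: Iterable[dict]) -> Tuple[Counter, List[dict]]:
    """Return per-song counted streams plus an append-only audit log of all events."""
    counts: Counter = Counter()
    audit_log: List[dict] = []
    seen: Set[tuple] = set()
    for event in events:
        key = (event["user_id"], event["song_id"], event["timestamp"])
        if key in seen:  # duplicate delivery, e.g. a client retry
            continue
        seen.add(key)
        counted = event["play_duration"] >= MIN_COUNTED_SECONDS
        audit_log.append({**event, "counted": counted})  # logged either way
        if counted:
            counts[event["song_id"]] += 1
    return counts, audit_log


events = [
    {"user_id": "u1", "song_id": "s1", "play_duration": 45, "timestamp": 1000},
    {"user_id": "u1", "song_id": "s1", "play_duration": 45, "timestamp": 1000},  # retry
    {"user_id": "u2", "song_id": "s1", "play_duration": 12, "timestamp": 1005},  # skipped
    {"user_id": "u2", "song_id": "s2", "play_duration": 30, "timestamp": 1020},
]
counts, audit_log = count_streams(events)
print(dict(counts), len(audit_log))
```

Keeping short plays in the audit log, even though they earn nothing, is what makes the monthly payment report reproducible from raw events.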

Fraud detection: Bots that play songs on mute to inflate stream counts (and thus royalties) are a real problem. Spotify detects this via behavioral analysis: real users skip songs, change volume, and listen at varying times. Bot accounts play 24/7, never skip, and have no social connections.

Alternative Designs

Approach	Pros	Cons	When to Use
Custom protocol + third-party CDN (described above)	Gapless playback. Smart shuffle. DRM integrated. Efficient caching.	Custom protocol requires custom client. Complex client engineering.	Spotify, Apple Music, Amazon Music.
HLS/DASH audio streaming	Standard protocol. Works with any player. ABR built in.	ABR quality switching is jarring for audio. No gapless playback without custom client code. Over-engineered for audio (designed for video).	Podcast streaming, live radio, when using existing video player infrastructure.
Progressive download (full file)	Simple. Universal support. Client can cache entire song.	No seeking without full download. No DRM. High startup latency for large files.	SoundCloud (early days), self-hosted audio.
WebRTC for live audio	Ultra-low latency (< 500 ms). Peer-to-peer possible.	No seeking. No offline. Not designed for on-demand catalogs.	Live concerts, DJ sets, real-time collaboration.
Decentralized streaming (blockchain-based)	Direct artist-to-listener payment. No platform middleman.	No practical discovery. No playlists. Payment per stream is complex on-chain. Terrible UX.	Niche. Not practical for mainstream music streaming.

Scaling Math Verification

CDN and Streaming

Concurrent streams (peak):   20 million (estimated from 600M MAU, ~3% concurrent during peak)
Average bitrate:             160 kbps (weighted average across quality tiers)
Peak bandwidth:              20M * 160 kbps = 3.2 Tbps
CDN cache hit rate:          95%
Origin bandwidth:            3.2 Tbps * 0.05 = 160 Gbps

CDN cost:
  Daily data transfer:       5B streams * 5 MB avg = 25 PB/day
  At $0.01/GB (negotiated):  25 PB * 1,000,000 GB/PB * $0.01 = $250K/day = $7.5M/month

Catalog on CDN edge (10 TB cache):
  Songs cached:              10 TB / 5 MB avg = 2M songs
  Percentage of catalog:     2M / 100M = 2%
  Hit rate from top 2%:      ~75% (Pareto distribution)

Metadata Service

Song metadata records:       100 million
Record size:                 ~2 KB (title, artist, album, duration, genre, audio features)
Total metadata:              200 GB
Cache in Redis:              200 GB (fits in a Redis cluster with 10 nodes * 24 GB each)
Metadata queries/sec:        150,000 (one per stream start + search + browse)
Redis throughput:            100K ops/sec per node * 10 nodes = 1M ops/sec (massive headroom)

Recommendation Pipeline

Input data:                  45 TB (90 days of listening history)
Matrix factorization:        ~10,000 CPU-hours on Google Cloud
Output:                      600M users * 30 songs * 16 bytes = 290 GB
Generation time:             8 hours (batch job)
Cloud compute cost:          10,000 CPU-hours * $0.04/hr (preemptible) = $400 per run
Weekly cost:                 $400 (trivially cheap for a $12B/year revenue company)

Stream Counting

Streams per day:             5 billion
Stream events per second:    ~58,000
Event size:                  ~100 bytes
Kafka throughput:            58K * 100 = 5.8 MB/sec (single broker handles this)
Monthly aggregation:         150B stream records * 100 bytes = 15 TB
Storage for 12-month audit:  180 TB (S3 Glacier, $0.004/GB = $720/month)

Failure Analysis

Failure	Impact	Mitigation
CDN edge goes down	Users in that region experience buffering or playback failure.	Multi-CDN strategy (Fastly + Akamai + Google). DNS-level failover. Client retries from alternate CDN origin within 2 seconds.
DRM license server down	Users cannot start new offline downloads. Existing downloads continue working (keys cached locally for 30 days).	DRM server cluster with 3x redundancy. Pre-cache 30-day keys on device. Graceful degradation: stream online if offline playback keys expired.
Recommendation pipeline fails	Discover Weekly is not generated. Users see stale playlists from last week.	Re-run pipeline with higher priority. If not ready by Monday 5 AM, serve last week's playlist. Users mildly disappointed but not blocked.
Stream counting drops events	Under-report streams. Artists underpaid. Legal liability.	Kafka with 3x replication and acks=all. Consumer processes with exactly-once semantics (Kafka transactions). End-of-day reconciliation compares stream counts with CDN access logs.
Song file corrupted in storage	Playback of that song fails. Users hear audio artifacts or silence.	Checksums on every file. CDN returns Content-MD5 header. Client verifies. On mismatch, fetch from origin. Origin has 3x replicated storage (S3's 11 nines durability).
Collaborative playlist conflict	Songs appear in wrong order. Duplicate entries. Missing songs.	Operational transformation ensures consistent state regardless of operation order. Conflict resolution: last-write-wins for metadata changes, OT for structural changes (add/remove/reorder).
Subscription expires while offline	User has downloaded songs but cannot play them because DRM keys expired.	Client shows clear message: "Connect to internet to verify subscription." Grace period: allow offline playback for 48 hours after key expiration.

Level Expectations

Level	What the Interviewer Expects
Mid (L4)	S3 for song storage. CDN for delivery. Basic play/pause/skip API. Playlist CRUD. Search with Elasticsearch. Mentions offline as a requirement.
Senior (L5)	Gapless playback with pre-decoding and encoder padding trimming. Audio segmenting for seeking and CDN caching. DRM with device-bound keys and expiration. Smart shuffle algorithm (artist-aware spacing). Collaborative filtering for recommendations. Cache economics (audio vs. video). Stream counting with 30-second threshold.
Staff+ (L6)	Spotify Discover Weekly pipeline (collaborative filtering + NLP + audio features). DRM watermarking for piracy tracking. Royalty payment tracking and fraud detection (bot streams). CDN pre-positioning for major releases. Collaborative playlist OT for consistency. Audio codec selection rationale (Ogg Vorbis vs. AAC vs. Opus). Quantified CDN caching advantage of audio over video. Reference to Spotify's actual architecture and scale numbers.
