Skip to content

Design a Cloud File Sync Service

This question separates candidates who have actually thought about distributed systems from candidates who memorized a CRUD API template. The URL shortener is a database problem. This is a synchronization problem -- and synchronization is where things get genuinely hard.

The System

A cloud file sync service (think Dropbox, Google Drive, OneDrive) stores files in the cloud and keeps them synchronized across all of a user's devices. Edit a spreadsheet on your laptop, and within seconds it appears on your phone. Two people sharing a folder both see changes in near-real-time.

Dropbox has over 700 million registered users and stores exabytes of data. They migrated off AWS S3 to their own custom storage system (Magic Pocket) because at their scale, building and operating custom storage was cheaper by tens of millions of dollars per year. Before that migration, they were one of S3's largest customers. The engineering team that built the initial system that served 30 million users? Four people. The system that powered the $10B IPO was fundamentally the same architecture scaled up -- chunking, content-addressable storage, and metadata separation. These decisions made at the start carried the company for a decade.

Requirements

Functional

  • Upload files: Users can upload files of any size (1 KB to 50 GB) from any device
  • Download files: Users can download any file they have access to
  • Sync across devices: Changes on one device propagate to all other devices automatically
  • File sharing: Users can share files/folders with other users with view or edit permissions
  • Version history: Users can view and restore previous versions of a file

Non-Functional

  • Sync latency: Changes appear on other devices within 5 seconds for files under 10 MB on a stable connection
  • Upload efficiency: Editing a single paragraph in a 100 MB document should not re-upload 100 MB (delta sync)
  • Durability: 99.999999999% (eleven nines) -- losing a user's file is an extinction-level event for a storage company
  • Availability: 99.9% uptime -- users tolerate brief outages but not data loss
  • Scale: 500 million users, 100 million daily active, average 2 GB stored per user

Back-of-Envelope Math

Storage:
  500M users * 2 GB average = 1 exabyte total storage
  With deduplication (20-30% savings): ~700-800 PB effective
  S3 cost at $0.023/GB/month: $23M/month = $276M/year
  This is why Dropbox built Magic Pocket.

Upload traffic:
  100M DAU, assume 10% modify files daily = 10M file changes/day
  Average file size: 5 MB
  10M * 5 MB = 50 TB of new/changed data per day
  With delta sync (60% savings): ~20 TB actual upload per day

Metadata:
  500M users * 200 files average = 100 billion file metadata records
  Each record: ~500 bytes (path, size, timestamps, blocklist hash, permissions)
  Total metadata: ~50 TB
  This fits in a sharded database cluster.

Sync notifications:
  10M file changes/day = ~116 changes/sec average
  Each change notifies an average of 3 devices per user = 348 notifications/sec
  Plus shared folder notifications: ~500 notifications/sec total
  With burst (10x): ~5,000 notifications/sec

Chunk storage:
  Average file: 5 MB = ~1-2 chunks at 4 MB target chunk size
  100 billion files * 1.5 chunks = 150 billion chunks
  Block index entry: ~100 bytes per chunk
  Block index size: 15 TB

The Naive Design

┌──────────┐     ┌──────────────┐     ┌──────────────┐     ┌──────────┐
│  Client   │────>│  API Server  │────>│  PostgreSQL   │     │   S3     │
│  (laptop) │<────│              │<────│  (metadata)   │     │  (files) │
└──────────┘     └──────────────┘     └──────────────┘     └──────────┘

Upload:
  1. Client uploads entire file to API server
  2. Server stores file in S3
  3. Server writes metadata (name, size, S3 key) to Postgres
  4. Server returns success

Download:
  1. Client requests file
  2. Server generates presigned S3 URL
  3. Client downloads directly from S3

Sync:
  1. Client polls every 30 seconds: "anything new since timestamp X?"
  2. Server queries: SELECT * FROM files WHERE user_id = ? AND updated_at > ?
  3. Client downloads changed files

This works for a homework project. It does not work for Dropbox.

Where Does This Break First?

Editing a single cell in a 500 MB Excel file re-uploads the entire 500 MB. A user on a train with spotty WiFi uploads 400 MB, loses connection, and has to start the entire upload over. Two people editing the same shared document create a race condition where the last upload silently overwrites the other person's changes. And polling every 30 seconds means your coworker waits half a minute to see your changes.

Where It Breaks

Three things break simultaneously, and each one is a hard problem on its own:

1. Bandwidth waste. Without chunking and delta sync, every file modification re-uploads the entire file. A 1 KB edit in a 1 GB video re-uploads the full gigabyte. Multiply this by millions of users and you are burning bandwidth (and money) at an absurd rate.

2. No resumability. Without chunking, an interrupted upload of a large file means starting over from byte zero. On unreliable networks (mobile, public WiFi), large file uploads may never complete.

3. Conflict blindness. Without versioning and conflict detection, concurrent edits to the same file result in silent data loss. The last writer wins, and nobody knows the first writer's changes existed.

The Real Design

┌────────────────────────────────────────────────────────────────────┐
│                         Client (Desktop App)                       │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────────────────┐ │
│  │ File Watcher │  │ Chunking     │  │ Local Metadata DB         │ │
│  │ (FSEvents/   │──│ Engine       │  │ (SQLite: file states,     │ │
│  │  inotify)    │  │ (Rabin CDC)  │  │  chunk hashes, cursors)   │ │
│  └─────────────┘  └──────────────┘  └───────────────────────────┘ │
└───────────┬──────────────────────────────────────┬────────────────┘
            │ upload new chunks                     │ sync notifications
            v                                       v
     ┌──────────────┐                    ┌──────────────────┐
     │  Block Server │                    │ Notification     │
     │  (upload/     │                    │ Server           │
     │   download    │                    │ (WebSocket +     │
     │   chunks)     │                    │  long-poll        │
     └──────┬───────┘                    │  fallback)       │
            │                            └────────┬─────────┘
            v                                     │
     ┌──────────────┐    ┌──────────────┐         │
     │  S3 / Blob   │    │  Metadata    │<────────┘
     │  Store       │    │  Service     │
     │  (chunks by  │    │             │
     │   SHA-256)   │    └──────┬───────┘
     └──────────────┘           │
                                v
                       ┌──────────────┐     ┌──────────────┐
                       │  MySQL       │     │  Redis        │
                       │  (sharded    │     │  (cursors,    │
                       │   metadata)  │     │   sessions)   │
                       └──────────────┘     └──────────────┘

Upload Path (The Interesting Part)

  1. File watcher detects a change. The OS-level file watcher (FSEvents on macOS, inotify on Linux, ReadDirectoryChangesW on Windows) fires when a user saves a file.

  2. Client chunks the file using content-defined chunking (CDC). This is not fixed-size chunking. A Rabin fingerprint rolls over the file's bytes, and chunk boundaries are placed where the fingerprint meets a condition (e.g., lowest 22 bits are zero). Target chunk size: ~4 MB. This produces variable-size chunks whose boundaries are determined by content, not position.

  3. Client computes SHA-256 of each chunk. These hashes are the chunks' identities.

  4. Client sends the new blocklist (list of chunk hashes) to the metadata service. "Here are the hashes for the new version of report.docx."

  5. Metadata service compares against the stored blocklist. "You have 12 chunks. I already have 10 of them. Upload chunks 3 and 7."

  6. Client uploads only the new/changed chunks to the block server, which stores them in S3 (or equivalent blob store) keyed by their SHA-256 hash.

  7. Metadata service commits the new version. Updates the file's blocklist in the database, increments the file version, writes a journal entry with a new cursor.

  8. Notification server pushes the change to all other devices subscribed to this namespace (the user's files or the shared folder).

Download/Sync Path

  1. Other devices receive a notification: "file X changed, new cursor is 4827."
  2. Device fetches journal entries since its last cursor from the metadata service.
  3. For each changed file, device receives the new blocklist.
  4. Device checks which chunks it already has locally (from previous versions of this file, or from other files).
  5. Device downloads only the missing chunks from the block server.
  6. Device reconstructs the file locally from the blocklist of chunks.
  7. Device updates its local cursor to 4827.

Why Content-Defined Chunking Matters

This is the technical detail that separates a good answer from a great one.

Fixed-size chunking divides a file into equal blocks (e.g., every 4 MB). The problem: inserting a single byte at the beginning of the file shifts every subsequent chunk boundary. Every chunk looks different. The entire file must be re-uploaded.

Content-defined chunking uses a rolling hash (Rabin fingerprint) to determine boundaries based on the content itself. Inserting a byte at position X only affects the chunk containing position X. All other chunk boundaries remain stable because they depend on local content, not global position.

Result: editing one paragraph in a 100 MB document changes 1-2 chunks out of ~25. You upload ~8 MB instead of 100 MB. That is a 92% bandwidth saving for a typical edit.

Deep Dives

Cloud File Sync — Cloud File Sync High-Level Design

Deep Dive 1: Content-Addressable Storage and Deduplication

Every chunk is stored by its SHA-256 hash. This creates automatic deduplication as a free side effect.

Same file, different users: If 1,000 users all upload the same company-wide PDF, only one copy of each chunk is stored. All 1,000 users' metadata points to the same physical chunks. At large scale, cross-user deduplication typically saves 20-30% of raw storage.

Same file, different versions: When a user edits a file, most chunks remain identical to the previous version. Only changed chunks are new. Version history costs very little additional storage.

The dedup check flow:

Client: "I want to upload chunk with hash abc123"
Server: "I already have abc123. Skip."
Client: "I want to upload chunk with hash def456"  
Server: "That's new. Upload it."

This is why the upload conversation starts with hashes, not data. The client never uploads a chunk the server already has, regardless of which user or file originally created it.

The garbage collection problem: When a file is deleted or updated, old chunks may become orphaned (no file references them). You need either reference counting (increment on file create/update, decrement on delete) or periodic mark-and-sweep garbage collection to reclaim storage. Reference counting is simpler but breaks if a counter update is lost. Mark-and-sweep is more reliable but requires scanning all file metadata periodically.

Dropbox uses reference counting with periodic reconciliation sweeps as a safety net.

Deep Dive 2: Conflict Resolution

What happens when two users edit the same file on different devices before either sync completes?

Last-Writer-Wins (LWW): The most recent write overwrites the other. Simple to implement. Causes silent data loss. Unacceptable for a file sync service that people trust with their work.

Dropbox's actual approach: conflict copies. The server maintains a version number for each file. When a client uploads changes, it includes the parent version it based its edits on:

Client A: "Here's a new version of report.docx, based on version 5"
Server:   "Current version is 5. Accepted. New version is 6."

Client B: "Here's a new version of report.docx, based on version 5"
Server:   "Current version is 6 now, not 5. Conflict detected."

On conflict, the server keeps Client A's version as the canonical file and creates a conflict copy: report (Alice's conflicted copy 2026-04-18).docx. Both versions are preserved. The users resolve it manually.

Vector clocks for conflict detection: For more sophisticated conflict detection across multiple devices, vector clocks track the causal ordering of edits. Each device maintains a vector of logical timestamps. If neither device's vector dominates the other (some entries are greater in each), a true conflict exists. If one vector dominates, it is a simple fast-forward update. Amazon Dynamo and Riak use this approach.

I'd use version numbers (Dropbox's approach) in an interview -- it is simpler and maps directly to the metadata service's cursor system. Mention vector clocks to show you know the theoretical foundation, but do not over-design it.

Deep Dive 3: Sync Protocol and Notifications

The naive approach -- polling every 30 seconds -- creates up to 30 seconds of sync delay and wastes resources on empty polls. The real design uses a push-based notification system with cursor-based catchup.

Server-side journal: The metadata service maintains an append-only journal for each namespace (user's file space or shared folder). Each entry records: operation type (create/modify/delete), file path, new blocklist hash, timestamp, and a monotonically increasing cursor (sequence number).

Notification flow:

  1. File change is committed to the metadata database.
  2. Metadata service writes a journal entry with cursor N.
  3. Notification server pushes a lightweight message to all subscribed clients: { namespace_id, new_cursor: N }.
  4. Each client calls the metadata service: "Give me journal entries since cursor M" (M being the client's last processed cursor).
  5. Client processes each journal entry, downloads new chunks if needed, and updates its local cursor to N.

Why cursors instead of timestamps? Timestamps are unreliable across distributed systems (clock skew). Cursors are monotonically increasing integers -- they provide a total ordering of events within a namespace. A client at cursor 100 knows it has seen everything up to and including event 100, regardless of wall-clock time.

WebSocket + long-poll hybrid: Dropbox uses WebSockets as the primary notification channel, with periodic polling (every few minutes) as a fallback. On reconnect after a disconnect, the client uses cursor-based catchup to fetch everything it missed. The cursor makes this idempotent and crash-safe.

Notification fan-out at scale: A change to a file in a shared folder with 100 members requires notifying 100 clients, potentially on 100 different WebSocket connection servers. The notification service needs a routing layer (backed by Redis) that maps user IDs to connection server instances.

Alternative Designs

Alternative 1: Event Sourcing with Kafka

Instead of a custom journal, use Kafka as the event log. Each namespace gets a Kafka partition. File changes are published as events. Clients consume events from their last committed offset (equivalent to cursor).

Pros: mature ecosystem, built-in retention and replay, handles fan-out naturally with consumer groups.

Cons: operational complexity of running Kafka, higher latency than direct WebSocket push (~100ms vs ~10ms), partition count limits if you have millions of namespaces.

Alternative 2: Operational Transform / CRDT for Real-Time Collaboration

If the interviewer pushes toward Google Docs-style real-time co-editing (not just file sync), you'd need to move from file-level sync to operation-level sync using Operational Transform or CRDTs. This is a fundamentally different problem -- Dropbox syncs entire files (or chunks of files), while Google Docs syncs individual character insertions and deletions.

Mention this distinction to show you understand the boundary. Then stay on the file-sync side unless the interviewer explicitly asks you to cross it.

Aspect Cursor-Based Journal Kafka Event Log OT/CRDT
Sync granularity File/chunk level File/chunk level Character/operation level
Latency ~50ms (WebSocket push) ~100ms (consumer poll) ~20ms (real-time collab)
Conflict handling Conflict copies Conflict copies Automatic merge
Implementation effort Medium Medium Very high
Best for File sync (Dropbox) File sync at extreme scale Real-time editors (Docs)

Scaling Math Verification

Upload bandwidth (20 TB/day after delta sync):

  • 20 TB / 86,400 sec = ~231 MB/sec = ~1.85 Gbps sustained
  • A cluster of 10 block servers with 1 Gbps each handles this with headroom.
  • S3 handles arbitrary write throughput. This is not the bottleneck.

Metadata operations (10M file changes/day):

  • 10M / 86,400 = ~116 writes/sec to the metadata database
  • Plus reads for sync checks: ~10x = 1,160 reads/sec
  • A sharded MySQL cluster handles this easily. Dropbox uses sharded MySQL for metadata.

WebSocket connections (100M DAU, ~3 devices each = 300M potential connections):

  • Not all online simultaneously. Assume 10% concurrent = 30M connections.
  • Each WebSocket server handles ~50,000 concurrent connections.
  • 30M / 50,000 = 600 WebSocket servers.
  • This is a significant cluster but well within the realm of standard infrastructure.

Dedup savings:

  • 50 TB raw upload per day, 60% bandwidth saved through delta sync + dedup = 20 TB actual storage/day
  • Annual new storage: 20 TB * 365 = 7.3 PB/year
  • With erasure coding (1.5x overhead vs 3x for replication): 11 PB/year of raw disk

Failure Analysis

What breaks at 10x (1B users, 1B DAU)?

Component Current At 10x Breaks? Fix
S3 / blob storage ~800 PB ~8 EB Cost Build custom storage (Magic Pocket)
Metadata DB 50 TB, sharded 500 TB Maybe More shards, DynamoDB for some tables
WebSocket servers 600 servers 6,000 servers Yes Regional clusters, smarter routing
Block server cluster 10 servers 100 servers No Linear scaling
Notification fan-out 500/sec 5,000/sec No Kafka for fan-out
Dedup effectiveness 20-30% savings Higher (more overlap) No Gets better at scale

The first thing to break at 10x is the cost of S3. This is exactly what happened at Dropbox -- they built Magic Pocket because S3 was costing them tens of millions per year. The second thing is the WebSocket infrastructure: 6,000 servers for persistent connections requires careful regional deployment and connection draining during rollouts.

What if the metadata database goes down?

Users cannot sync, but they can continue working on local files. This is a key UX property of file sync: the local filesystem is the source of truth for the user. When the metadata service recovers, clients catch up using cursor-based sync. No data is lost because the client's local state is complete.

What if S3 goes down?

Existing synced files are still available on local devices. New uploads queue locally and retry. Downloads of files not yet synced to the local device fail. Cross-region replication (or Magic Pocket's multi-DC setup) provides durability even if an entire region goes down.

What's Expected at Each Level

Aspect Mid-Level Senior Staff+
File upload Upload whole file to server, store in S3 Chunking for resumability, presigned URLs Content-defined chunking with Rabin fingerprint, delta sync
Metadata vs content Single database for everything Separate metadata DB and blob store Explain why: different consistency, scaling, failure domains
Sync mechanism Polling Long polling or WebSocket push Cursor-based journal, WebSocket + polling hybrid, fan-out
Deduplication Not mentioned Mention content-addressable storage Block-level dedup, cross-user dedup, garbage collection
Conflict resolution "Last write wins" Detect conflicts, create conflict copies Version vectors, parent-version tracking, conflict copy UX
Durability "Store in S3" Replication, backups Erasure coding, bit-rot scrubbing, cross-DC replication
Security "Use HTTPS" Encryption at rest, presigned URL expiry Zero-knowledge trade-offs (breaks dedup, sharing, search)
Scale awareness "Shard the database" Identify what breaks first at 10x Build-vs-buy storage (Dropbox left S3), operational cost

The key signal interviewers look for: does the candidate understand that the hard problem is synchronization (chunking + delta sync + conflict resolution + notifications), not CRUD operations on files?


References from Our Courses


Red Team This Design

Ready to stress-test this architecture? The Attack companion tears apart every decision in this design — from hardware physics to security holes to what actually happens at 10x scale.

Attack: Design a Cloud File Sync Service →