
Why Large Files Break

TL;DR

Routing file bytes through your API server turns it into a dumb, expensive pipe. A 2GB upload ties up a server thread, burns bandwidth you pay for twice, and adds latency for zero value. The fix: your server hands out a permission slip (presigned URL), and the client uploads directly to object storage. Server never touches the bytes. This chapter teaches the patterns that make it work.


Your API Server Is Not a Pipe

Here's what happens when someone uploads a file through your REST API in the "obvious" way:

# The naive approach -- every byte flows through your server
import boto3
from flask import Flask, jsonify, request

app = Flask(__name__)
s3 = boto3.client("s3")

@app.route("/api/upload", methods=["POST"])
def upload_file():
    file = request.files["document"]
    # Your server receives every byte and holds the whole file in memory...
    data = file.read()
    # ...then re-uploads every byte to S3
    s3.put_object(Bucket="my-bucket", Key=f"docs/{file.filename}", Body=data)
    return jsonify({"status": "uploaded"}), 200

For a 5KB avatar, nobody notices. For a 2GB video, your server becomes a relay station doing zero useful work while holding a thread hostage.


The Proxy Problem Visualized

Bad proxy pattern showing every byte traveling through the API server resulting in 4GB transferred and server blocked for 3 minutes

Good direct upload pattern showing client uploading directly to S3 with presigned URL, server only handles two tiny JSON requests

The server goes from processing 2GB of traffic to processing two tiny JSON requests. That's the entire lesson of this chapter in one diagram.
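
To make the "permission slip" idea concrete, here is a minimal sketch of a signed, expiring upload URL built with the standard library. It is a simplified illustration, not the AWS Signature Version 4 scheme that real S3 presigned URLs use (in practice boto3's `generate_presigned_url` produces those for you). The secret, the host `storage.example.com`, and all function names are hypothetical, and object keys are assumed to contain only URL-safe characters.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values for illustration only.
SECRET_KEY = b"demo-secret-known-to-server-and-storage"
STORAGE_HOST = "https://storage.example.com"


def _signature(method: str, key: str, expires_at: int) -> str:
    """HMAC over the exact operation being authorized."""
    message = f"{method}\n{key}\n{expires_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_upload_url(key: str, ttl_seconds: int = 900,
                     now: Optional[float] = None) -> str:
    """API server side: sign a short-lived URL. No file bytes are involved."""
    now = time.time() if now is None else now
    expires_at = int(now) + ttl_seconds
    query = urlencode({"expires": expires_at,
                       "sig": _signature("PUT", key, expires_at)})
    return f"{STORAGE_HOST}/{key}?{query}"


def storage_accepts(method: str, url: str, now: Optional[float] = None) -> bool:
    """Storage side: check the signature and expiry before taking the upload."""
    now = time.time() if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    key = parsed.path.lstrip("/")
    expected = _signature(method, key, expires_at)
    return hmac.compare_digest(sig, expected) and now < expires_at
```

The server's work is a hash and a string format, which is why its share of the upload shrinks to a couple of small JSON responses. The URL authorizes exactly one method on exactly one key until it expires, so a leaked URL has a limited blast radius.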


The Costs of Proxying

| Cost | Proxied Upload (2GB) | Direct Upload (2GB) |
| --- | --- | --- |
| Server thread time | ~3 minutes blocked | ~5ms (two JSON calls) |
| Bandwidth bill | 2GB in + 2GB to S3 = 4GB | 0 (client talks to S3) |
| User latency | Double hop: user→server→S3 | Single hop: user→S3 |
| Concurrent uploads supported | Handful before thread pool exhaustion | Thousands (server does no heavy lifting) |

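The bandwidth cost is particularly brutal. The same bytes consume your server's network capacity twice: once on the way in from the client, and again on the way out to S3. Depending on your provider and network layout (NAT gateways, load balancers, cross-region links), you may be billed for both hops. With direct upload, none of those bytes pass through your servers.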
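
The figures in the table can be reproduced with some back-of-the-envelope arithmetic. The sketch below assumes decimal gigabytes, a client uplink of about 90 Mbps as the bottleneck (an assumed number chosen to match the "~3 minutes" above), and a hypothetical pool of 8 workers. The function names are illustrative.

```python
def proxied_upload_cost(size_gb: float, client_mbps: float) -> dict:
    """Estimate what one proxied upload costs the API server."""
    size_megabits = size_gb * 1000 * 8  # decimal GB -> megabits
    return {
        # Every byte arrives from the client, then leaves again for S3.
        "server_gb_transferred": size_gb * 2,
        # Assumes the client link is the bottleneck for the whole transfer.
        "worker_blocked_seconds": size_megabits / client_mbps,
    }


def uploads_per_hour(workers: int, blocked_seconds: float) -> float:
    """Upper bound on uploads a worker pool can relay if it does nothing else."""
    return workers * 3600 / blocked_seconds


cost = proxied_upload_cost(size_gb=2, client_mbps=90)
print(cost["server_gb_transferred"])               # GB through the server
print(round(cost["worker_blocked_seconds"] / 60))  # minutes a worker is pinned
print(round(uploads_per_hour(8, cost["worker_blocked_seconds"])))
```

With direct upload the same 2GB file contributes zero to both numbers, because the server only signs a URL and records metadata.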


Object Storage vs. Your Database

Not all data belongs in the same place. The decision is straightforward:

Storage decision flowchart for choosing between relational DB, document DB, object storage, and search index based on data characteristics

Object storage (S3, GCS, Azure Blob) is purpose-built for large binary data:

| Property | Object Storage (S3) | Relational DB (Postgres) |
| --- | --- | --- |
| Max object size | 5TB | ~1GB (BYTEA), painful above 100MB |
| Durability | 99.999999999% (11 nines) | Depends on your backup strategy |
| Cost per GB/month | $0.023 | $0.10 - $0.30 (EBS storage) |
| Designed for | Write once, read many | Frequent updates, joins, queries |
| Access pattern | GET by key | SQL queries across rows |
| Concurrent access | Millions of reads | Connection pool limits |

The 11 nines number

S3's durability of 99.999999999% means that if you store 10 million objects, you'd statistically expect to lose one every 10,000 years. That's not just a marketing number: by default, S3 stores each object redundantly across multiple facilities (Availability Zones) within a region.
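
The "one object every 10,000 years" figure follows directly from the durability percentage. A short worked calculation, treating the 11 nines as a per-object, per-year survival probability (the usual reading of the number), and using `Decimal` to avoid floating-point noise at this scale:

```python
from decimal import Decimal

durability = Decimal("0.99999999999")              # 11 nines
annual_loss_probability = Decimal(1) - durability  # chance one object is lost in a year
objects_stored = 10_000_000

# Expected number of objects lost per year across the whole bucket.
expected_losses_per_year = objects_stored * annual_loss_probability

# Average number of years between single-object losses.
years_per_expected_loss = 1 / expected_losses_per_year

print(annual_loss_probability)
print(expected_losses_per_year)
print(years_per_expected_loss)
```

The expected loss rate scales linearly with object count, so a bucket with 10 billion objects would expect roughly one loss every 10 years under the same assumption.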


The Rule of Thumb

If the data is larger than 10MB and you'll never run SQL queries on it -- it goes in object storage. Store a pointer (the S3 key) in your database.

# Your database row -- small, queryable metadata
{
    "id": "doc_8f3a",
    "user_id": "user_42",
    "filename": "quarterly-report.pdf",
    "content_type": "application/pdf",
    "size_bytes": 48_291_837,
    "s3_key": "documents/user_42/doc_8f3a/quarterly-report.pdf",
    "uploaded_at": "2025-03-15T10:30:00Z",
    "status": "processed"
}

# The actual file -- large, binary, stored in S3
# s3://my-bucket/documents/user_42/doc_8f3a/quarterly-report.pdf

This is the metadata-in-DB, bytes-in-storage pattern. You'll see it everywhere: profile photos, videos, documents, backups, ML model artifacts.
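
As a rough sketch of the pattern, the example below stores the metadata row in an in-memory SQLite database and resolves a document ID to its S3 key. The table name, column layout, and the `s3_key_for` helper are assumptions made for illustration. A production schema would likely live in Postgres and carry more constraints.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE documents (
        id           TEXT PRIMARY KEY,
        user_id      TEXT NOT NULL,
        filename     TEXT NOT NULL,
        content_type TEXT NOT NULL,
        size_bytes   INTEGER NOT NULL,
        s3_key       TEXT NOT NULL UNIQUE,  -- pointer to the bytes, not the bytes
        uploaded_at  TEXT NOT NULL,
        status       TEXT NOT NULL
    )
""")
conn.execute(
    "INSERT INTO documents VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    ("doc_8f3a", "user_42", "quarterly-report.pdf", "application/pdf",
     48_291_837, "documents/user_42/doc_8f3a/quarterly-report.pdf",
     "2025-03-15T10:30:00Z", "processed"),
)


def s3_key_for(db: sqlite3.Connection, doc_id: str, user_id: str) -> Optional[str]:
    """Look up where the bytes live, scoped to the owning user."""
    row = db.execute(
        "SELECT s3_key FROM documents WHERE id = ? AND user_id = ?",
        (doc_id, user_id),
    ).fetchone()
    return row[0] if row else None


# Metadata stays cheap to query without ever reading the file itself.
total_bytes = conn.execute(
    "SELECT SUM(size_bytes) FROM documents WHERE user_id = ?", ("user_42",)
).fetchone()[0]
```

The database answers questions like "how much storage does this user consume?" in milliseconds, and the key it returns is what the server would sign when handing out a download or upload URL.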


Why Direct Upload Matters for Global Users

When your API server is in us-east-1 and your user is in Sydney, proxying the upload means every byte crosses the Pacific Ocean to reach your server and then makes a second hop from your server to S3. If the bucket sits in a region near the user, the bytes cross the ocean twice.

With direct upload, the Sydney user sends bytes either straight to a bucket in a nearby region (ap-southeast-2) or, with S3 Transfer Acceleration, to the nearest AWS edge location, which forwards them to the bucket over AWS's own network. Either way, the bytes never touch your application server.

| Upload Path | Sydney User Upload Time (2GB) |
| --- | --- |
| Sydney → us-east-1 API → us-east-1 S3 | ~4 minutes |
| Sydney → ap-southeast-2 S3 (direct) | ~90 seconds |
| Sydney → nearest edge (Transfer Acceleration) | ~60 seconds |
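
The two direct paths in the table differ mainly in which hostname the client is pointed at. The sketch below builds both, using S3's virtual-hosted-style regional endpoint and the Transfer Acceleration endpoint. The helper name and the bucket names are hypothetical, and the bucket is assumed to already have acceleration enabled where that option is used.

```python
def direct_upload_host(bucket: str, region: str, accelerate: bool = False) -> str:
    """Return the S3 hostname a client should upload to."""
    if accelerate:
        # Transfer Acceleration does not support bucket names containing dots.
        if "." in bucket:
            raise ValueError("accelerated buckets cannot contain '.' in the name")
        # One global hostname; DNS steers the client to a nearby edge location.
        return f"{bucket}.s3-accelerate.amazonaws.com"
    # Regional endpoint: the client talks to the bucket's home region directly.
    return f"{bucket}.s3.{region}.amazonaws.com"


print(direct_upload_host("uploads-sydney", "ap-southeast-2"))
print(direct_upload_host("uploads-global", "us-east-1", accelerate=True))
```

A regional bucket keeps the data close to the user but means managing buckets per region. Acceleration keeps a single bucket and pays a per-GB premium for the faster path.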

Dropbox uses direct-to-storage uploads for all file syncs, routing clients to the nearest data center rather than proxying through a central API.


The Architecture You're Building This Chapter

Here's the complete picture. Each of the next four lessons tackles one piece:

Chapter overview showing the four lesson topics: presigned URLs, chunked uploads, deduplication and processing, and storage tiers


What Goes Wrong When You Skip This

The failure modes of proxied uploads stack up fast:

Production horror stories

  • Thread pool exhaustion: 20 concurrent 1GB uploads pin all Gunicorn workers. Health checks fail. Load balancer marks the server unhealthy. All traffic reroutes. Cascading failure.
  • Memory pressure: request.files["video"].read() loads the entire file into memory. 8 concurrent 2GB uploads = 16GB RAM. OOMKilled.
  • Timeout cascade: Nginx's proxy_read_timeout (60s by default) fires when your app goes that long without sending any response data because it is still relaying the file to S3. Nginx kills the connection. The client retries. Now you have duplicate partial uploads and orphaned S3 objects.
  • Cost surprise: A file-sharing app proxied 10TB/day through c5.xlarge instances. The bandwidth bill alone was $900/month -- for acting as a relay.
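
The memory-pressure failure comes from holding the whole file at once. As a rough illustration, the sketch below relays a stream in fixed-size chunks so that each upload holds at most one chunk in memory. The 8 MiB chunk size and the function names are assumptions. Note that this only bounds memory: a proxied upload still pins a worker and still sends every byte through the server twice, so it is a mitigation rather than the fix.

```python
import io
from typing import BinaryIO

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB, an assumed value


def relay_in_chunks(src: BinaryIO, dst: BinaryIO, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy src to dst while holding at most one chunk in memory."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def peak_buffer_bytes(concurrent_uploads: int, bytes_held_per_upload: int) -> int:
    """Worst-case bytes buffered across all in-flight uploads."""
    return concurrent_uploads * bytes_held_per_upload


# Small in-memory demonstration with a 1 KiB chunk size.
source = io.BytesIO(b"x" * (3 * 1024 + 5))
sink = io.BytesIO()
copied = relay_in_chunks(source, sink, chunk_size=1024)

read_all = peak_buffer_bytes(8, 2 * 1000**3)  # 8 uploads x 2GB via .read()
chunked = peak_buffer_bytes(8, CHUNK_SIZE)    # 8 uploads x one 8 MiB chunk
```

Comparing the two `peak_buffer_bytes` results shows the gap: 16GB when every request calls `.read()` on a 2GB file versus 64 MiB when relaying chunk by chunk.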

Where This Pattern Shows Up in Interviews

Any system design question involving user-generated content will test these concepts:

| Question | Large File Pattern Needed |
| --- | --- |
| Design a file sharing service | Presigned URLs, chunked uploads, dedup |
| Design a video streaming platform | Chunked uploads, processing pipeline, CDN |
| Design an image hosting service | Presigned URLs, processing pipeline, storage tiers |
| Design a collaborative document editor | Direct upload, delta sync, block-level dedup |
| Design a backup system | Chunked uploads, dedup, cold storage |

The interviewer isn't testing whether you know S3 API calls. They're testing whether you understand why the server shouldn't touch the bytes and what patterns enable that.


Key Takeaways

| Concept | Details |
| --- | --- |
| Proxy problem | Routing files through your server wastes bandwidth, threads, and money |
| Direct upload | Client uploads straight to object storage; server only handles metadata |
| Object storage rule | >10MB + no SQL queries = blob storage, not database |
| 11 nines | S3 durability: 99.999999999% -- lose 1 object per 10M every 10,000 years |
| Metadata pattern | Store the S3 key in your DB, the bytes in S3 |
| Global latency | Direct upload eliminates the double-hop for geographically distant users |