Async, Caching, and Concurrency

TL;DR

Not every API call completes instantly. Long-running operations use 202 Accepted with a status polling endpoint. ETags enable optimistic concurrency (preventing lost updates) and HTTP caching (avoiding redundant downloads). These are production-critical patterns that separate toy APIs from real ones.

[Figure: Async operation polling pattern]


Asynchronous Operations

The Problem: Long-Running Requests

Some operations take seconds, minutes, or even hours to complete. Think about:

  • Video processing: User uploads a video, server needs to transcode it into multiple resolutions
  • Report generation: User requests a CSV export of 10 million records
  • Bulk operations: Admin triggers a mass email to 500,000 users
  • Payment processing: Credit card charge needs to go through fraud detection, bank approval, and settlement

If you make the client wait synchronously, the HTTP connection may time out at the client, a proxy, or a load balancer. The user will think the request is broken and retry, potentially triggering the operation twice.

The Pattern: 202 Accepted + Status Polling

The solution is a two-step dance:

Step 1: Accept the request and return immediately

POST /reports
Content-Type: application/json

{
  "type": "sales_summary",
  "date_range": { "from": "2024-01-01", "to": "2024-12-31" }
}

HTTP/1.1 202 Accepted
Location: /jobs/report-abc123
Content-Type: application/json

{
  "job_id": "report-abc123",
  "status": "accepted",
  "status_url": "/jobs/report-abc123",
  "estimated_completion": "2024-03-15T14:35:00Z"
}

Key elements:

  • 202 Accepted: not 200 OK or 201 Created. The request was accepted for processing, not completed.
  • Location header: points to a status endpoint the client can poll.
  • Estimated completion: optional but helpful for UX (progress bars, "check back in 5 minutes").
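As a rough server-side sketch of Step 1, the handler below records the job and returns immediately without doing the work. The in-memory `JOBS` dict, the `accept_report_request` name, and the five-minute estimate are illustrative assumptions, not part of any particular framework.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Hypothetical in-memory job store; a real service would use a queue and a database.
JOBS = {}


def accept_report_request(payload, now=None):
    """Register the job and return (status, headers, body) without running it."""
    now = now or datetime.now(timezone.utc)
    job_id = f"report-{uuid.uuid4().hex[:6]}"
    status_url = f"/jobs/{job_id}"
    JOBS[job_id] = {"status": "accepted", "request": payload}
    body = {
        "job_id": job_id,
        "status": "accepted",
        "status_url": status_url,
        # Assumed five-minute estimate, purely for illustration.
        "estimated_completion": (now + timedelta(minutes=5)).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    headers = {"Location": status_url, "Content-Type": "application/json"}
    return 202, headers, body
```

A background worker (not shown) would later pick the job up from the store and update its status.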

Step 2: Client polls the status endpoint

GET /jobs/report-abc123

While in progress:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "job_id": "report-abc123",
  "status": "processing",
  "progress": 65,
  "started_at": "2024-03-15T14:30:00Z"
}

When complete:

HTTP/1.1 303 See Other
Location: /reports/final-xyz789

{
  "job_id": "report-abc123",
  "status": "completed",
  "result_url": "/reports/final-xyz789"
}

The 303 See Other with a Location header redirects the client to the actual result; most HTTP clients follow it automatically with a GET. Alternatively, some APIs return 200 with the result embedded directly.

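If the operation fails, the status request itself still succeeds, so the response is 200 OK with the failure described in the body:

HTTP/1.1 200 OK
Content-Type: application/json

{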
  "job_id": "report-abc123",
  "status": "failed",
  "error": {
    "code": "DATA_TOO_LARGE",
    "message": "Report exceeds maximum size. Try narrowing the date range.",
    "retryable": true
  }
}
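On the client side, the polling loop can be sketched as follows. This is a minimal illustration that assumes a hypothetical `fetch_status` callable returning `(http_status, headers, body)` and not following redirects on its own; the delays and attempt limit are arbitrary defaults.

```python
import time


def poll_job(fetch_status, status_url, initial_delay=1.0, max_delay=30.0,
             max_attempts=20, sleep=time.sleep):
    """Poll until the job completes or fails, doubling the wait between polls."""
    delay = initial_delay
    for _ in range(max_attempts):
        http_status, headers, body = fetch_status(status_url)
        if http_status == 303:
            # Completed: the Location header points at the finished result.
            return {"status": "completed", "result_url": headers["Location"]}
        if body.get("status") == "failed":
            return {"status": "failed", "error": body.get("error")}
        sleep(delay)
        delay = min(delay * 2, max_delay)  # exponential backoff with a cap
    raise TimeoutError(f"job at {status_url} did not finish after {max_attempts} polls")
```

Backing off between polls keeps a slow job from generating a steady stream of wasted requests.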

Alternatives to Polling

Polling works everywhere but wastes requests while nothing has changed. Depending on the client, a push-based option may fit better:

| Approach | How It Works | Best For |
|---|---|---|
| Polling | Client periodically calls GET on status URL | Simple, works everywhere |
| Webhooks | Server calls a client-provided URL when done | Server-to-server, Twilio/Stripe style |
| WebSockets | Server pushes status updates over persistent connection | Real-time UIs, progress bars |
| SSE (Server-Sent Events) | Server streams status updates over HTTP | Browser-friendly progress updates |

Interview Tip

Mentioning the 202 + polling pattern for operations like "generate a report" or "process a video upload" shows you've built production systems. Most candidates only think in synchronous request-response.

Real-World Example: AWS S3 Multipart Upload

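AWS S3's multipart upload applies the same idea of splitting one long operation into several short calls. The paths below are simplified for illustration; the real S3 API expresses these steps as query parameters on the object key (?uploads, ?partNumber={n}&uploadId={id}, ?uploadId={id}):

  1. POST /uploads → initiates the upload, returns an upload_id
  2. PUT /uploads/{upload_id}/parts/{part_number} → uploads one chunk
  3. POST /uploads/{upload_id}/complete → finalizes the upload
  4. The server assembles the uploaded parts into the final object

Each step is its own API call, so the client can retry an individual part without restarting the whole upload, and the overall operation is resilient to network failures.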
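The per-part retry behaviour can be sketched like this. It is not the AWS SDK: `put_part` is a hypothetical callable standing in for the "upload one chunk" request, assumed to raise `ConnectionError` on a transient failure.

```python
def upload_in_parts(data, part_size, put_part, max_retries=3):
    """Split data into parts and upload each, retrying only the part that failed.

    put_part is a hypothetical callable: (part_number, chunk) -> part identifier.
    """
    receipts = []
    for offset in range(0, len(data), part_size):
        part_number = offset // part_size + 1  # part numbers start at 1
        chunk = data[offset:offset + part_size]
        for attempt in range(1, max_retries + 1):
            try:
                receipts.append((part_number, put_part(part_number, chunk)))
                break
            except ConnectionError:
                if attempt == max_retries:
                    raise  # give up on this part after the last attempt
    return receipts
```

The returned list of part numbers and identifiers is what a client would send with the final "complete" call.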


ETags and Optimistic Concurrency

The Problem: Lost Updates

Imagine two editors working on the same document through an API:

  1. Alice: GET /documents/1 → sees version with "Hello"
  2. Bob: GET /documents/1 → sees version with "Hello"
  3. Alice: PUT /documents/1 → changes to "Hello World"
  4. Bob: PUT /documents/1 → changes to "Hello Bob"

Bob's update overwrites Alice's change without knowing it existed. This is the lost update problem.

The Solution: ETags (Entity Tags)

An ETag is an opaque string that represents a specific version of a resource. Think of it as a fingerprint — it changes whenever the resource changes.

GET /documents/1

HTTP/1.1 200 OK
ETag: "a1b2c3d4"
Content-Type: application/json

{
  "id": 1,
  "title": "My Document",
  "content": "Hello"
}

The ETag "a1b2c3d4" is typically a hash of the resource content or a version number. The client caches this ETag.

Optimistic Concurrency with If-Match

When updating, the client sends the ETag it received using the If-Match header:

PUT /documents/1
If-Match: "a1b2c3d4"
Content-Type: application/json

{
  "title": "My Document",
  "content": "Hello World"
}

The server checks: does the current ETag match "a1b2c3d4"?

If yes — the resource hasn't changed since the client last read it. Safe to update.

HTTP/1.1 200 OK
ETag: "e5f6g7h8"

{
  "id": 1,
  "title": "My Document",
  "content": "Hello World"
}

If no — someone else modified it. Return 412 Precondition Failed.

HTTP/1.1 412 Precondition Failed
Content-Type: application/json

{
  "error": {
    "code": "CONFLICT",
    "message": "The resource was modified by another client. Please re-fetch and retry.",
    "current_etag": "x9y0z1a2"
  }
}

Now the client knows it has stale data. It can re-fetch, show the user the conflict, and retry.
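A minimal server-side sketch of the If-Match check is shown below. The dict-based `store`, the hashing scheme, and the choice to answer 428 Precondition Required when the header is missing are assumptions made for illustration; a real server would also need to make the compare-and-write step atomic.

```python
import hashlib
import json


def compute_etag(resource):
    """Strong ETag: a quoted hash of the resource's canonical JSON form."""
    canonical = json.dumps(resource, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return '"' + hashlib.sha256(canonical).hexdigest()[:16] + '"'


def put_document(store, doc_id, new_fields, if_match):
    """Replace the document only if the caller's ETag matches the current one."""
    current_etag = compute_etag(store[doc_id])
    if if_match is None:
        # Assumed policy: refuse unconditional writes outright.
        return 428, {}, {"error": {"code": "PRECONDITION_REQUIRED"}}
    if if_match != current_etag:
        return 412, {}, {"error": {"code": "CONFLICT", "current_etag": current_etag}}
    updated = {"id": doc_id, **new_fields}
    store[doc_id] = updated
    return 200, {"ETag": compute_etag(updated)}, updated
```

Replaying the Alice and Bob scenario against this function, Alice's write succeeds and Bob's write, sent with the ETag both of them originally read, comes back as 412.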

Why "Optimistic"?

It's called optimistic concurrency because it assumes conflicts are rare. Instead of locking the resource before editing (pessimistic), it lets everyone read and write freely, then detects conflicts at write time.

This is perfect for APIs because:

  • No locks: no risk of deadlocks or forgotten locks
  • High throughput: reads are never blocked
  • Stateless: the server doesn't track who's editing what
  • Conflict detection: you find out immediately if something went wrong

ETag Types

| Type | How It's Generated | Use Case |
|---|---|---|
| Strong ETag | Hash of entire resource content ("a1b2c3") | Byte-for-byte identity, safe for Range requests |
| Weak ETag | Semantic version (W/"v42") | Equivalent content, may differ in formatting |

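Most APIs use strong ETags for simplicity. Strong ETags are also what If-Match requires: it compares tags with the strong comparison function, so a weak ETag never satisfies it, whereas If-None-Match uses weak comparison.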
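The two comparison functions defined by the HTTP specification are small enough to sketch directly. The function names here are illustrative; the tags are assumed to be passed exactly as they appear in the header, including quotes and any W/ prefix.

```python
def _opaque(etag):
    """Drop the weak prefix, leaving only the quoted opaque value."""
    return etag[2:] if etag.startswith("W/") else etag


def strong_match(a, b):
    """Strong comparison: neither tag is weak and the values are identical."""
    return not a.startswith("W/") and not b.startswith("W/") and a == b


def weak_match(a, b):
    """Weak comparison: the opaque values are identical, weak or not."""
    return _opaque(a) == _opaque(b)
```

For example, `W/"v42"` and `"v42"` match weakly but not strongly.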


HTTP Caching

The Problem: Redundant Data Transfer

A mobile app showing event details might call GET /events/123 every time the user opens the screen. If the event hasn't changed, the server is doing work and sending bytes for nothing.

Cache-Control Headers

The server tells clients how to cache responses using the Cache-Control header:

GET /events/123

HTTP/1.1 200 OK
Cache-Control: max-age=600, private
Content-Type: application/json
ETag: "abc123"

{ "id": 123, "name": "Taylor Swift Concert", ... }

| Directive | Meaning |
|---|---|
| max-age=600 | Cache this response for 600 seconds (10 minutes) |
| private | Only the requesting client can cache it (not CDNs/proxies) |
| public | CDNs and proxies can cache it too |
| no-store | Never cache this response at all |
| no-cache | You can cache it, but must revalidate with the server before using it |

Common Gotcha

no-cache does NOT mean "don't cache." It means "always check with the server first." Use no-store if you truly want no caching (e.g., for sensitive data).

Conditional Requests with If-None-Match

After the cached response expires, the client can check if the resource actually changed using the ETag:

GET /events/123
If-None-Match: "abc123"

If the resource hasn't changed:

HTTP/1.1 304 Not Modified
ETag: "abc123"
Cache-Control: max-age=600, private

No response body — the client uses its cached copy. This saves bandwidth and server processing time.

If the resource has changed:

HTTP/1.1 200 OK
ETag: "def456"
Cache-Control: max-age=600, private

{ "id": 123, "name": "Taylor Swift Concert - SOLD OUT", ... }

New data with a new ETag.
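The server side of this exchange might look like the sketch below. The `get_event` name and its return shape are assumptions, and the header parsing is deliberately naive (it splits on commas and ignores the possibility of commas inside a tag).

```python
def _opaque(tag):
    """Strip whitespace and any weak prefix; If-None-Match compares weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def get_event(event, current_etag, if_none_match=None):
    """Return (status, headers, body); the body is None on a 304."""
    headers = {"ETag": current_etag, "Cache-Control": "max-age=600, private"}
    if if_none_match is not None:
        candidates = [_opaque(tag) for tag in if_none_match.split(",")]
        if "*" in candidates or _opaque(current_etag) in candidates:
            return 304, headers, None  # client's cached copy is still valid
    full_headers = dict(headers)
    full_headers["Content-Type"] = "application/json"
    return 200, full_headers, event
```

Note that the 304 branch still returns the ETag and Cache-Control headers, which lets the client restart its freshness timer.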

Caching Strategy by Resource Type

| Resource Type | Cache Strategy | Why |
|---|---|---|
| Static assets (images, CSS) | public, max-age=31536000 | Rarely change, safe to cache aggressively |
| User-specific data | private, max-age=300 | Only for this user, moderate freshness |
| Real-time data (stock prices) | no-store | Stale data is dangerous |
| Event listings | public, max-age=60 | Changes sometimes, CDN-cacheable |
| Auth tokens | no-store | Never cache sensitive credentials |

Interview Tip

Mentioning Cache-Control headers and ETags shows you think about performance at the HTTP layer, not just application logic. This is especially relevant for CDN-heavy architectures.


Content Negotiation

When clients and servers need to agree on data format, they use content negotiation headers.

How It Works

The client says what it can accept:

GET /events/123
Accept: application/json, application/xml;q=0.9

The q value (0-1) indicates preference. JSON is preferred (q=1.0 by default), XML is acceptable (q=0.9).

The server responds in the best matching format:

HTTP/1.1 200 OK
Content-Type: application/json

{ "id": 123, "name": "Concert" }

When Negotiation Fails

| Status Code | Meaning |
|---|---|
| 406 Not Acceptable | Server can't produce any format the client accepts |
| 415 Unsupported Media Type | Server doesn't understand the format the client sent |

POST /events
Content-Type: application/xml    ← Server only accepts JSON

HTTP/1.1 415 Unsupported Media Type
Content-Type: application/json

{
  "error": "This API only accepts application/json"
}
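The selection logic behind the Accept header can be sketched as follows. This is a simplified illustration: the function names are made up, and it ranks only by q value rather than applying the full specificity rules of the HTTP specification.

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # default preference when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equal q values keep the client's original order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_media_type(accept_header, supported):
    """Pick the best supported type, or None (the 406 Not Acceptable case)."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return supported[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "application/"
            for candidate in supported:
                if candidate.startswith(prefix):
                    return candidate
            continue
        if media_type in supported:
            return media_type
    return None
```

A `None` result is where a server would answer 406; the 415 case is the mirror image, checked against the request's Content-Type instead.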

In Practice

Most modern APIs only support JSON, making content negotiation less relevant than it once was. But it still matters for:

  • APIs that support both JSON and XML (enterprise/government systems)
  • APIs that serve different representations (HTML for browsers, JSON for API clients)
  • Binary resources (images in different formats: JPEG, PNG, WebP)

JSON Patch vs JSON Merge Patch

When we talked about PATCH in Chapter 2, we said it does "partial updates." But how does the client describe those partial updates? There are two standard formats.

JSON Merge Patch (RFC 7396)

The simpler approach. Send a JSON object with only the fields you want to change:

PATCH /users/123
Content-Type: application/merge-patch+json

{
  "email": "new@example.com",
  "phone": null
}

This sets email to the new value and deletes phone (null = remove). Fields not included are unchanged.

Limitation: You can't set a field to a literal null, because null always means "remove." If your data model uses null values meaningfully, merge patch won't work. Arrays are also replaced wholesale; there is no way to add or remove a single element.
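The merge algorithm described in RFC 7396 is short enough to sketch in full. The `merge_patch` name is illustrative; this version returns a new object instead of modifying the target in place.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch, following the algorithm in RFC 7396."""
    if not isinstance(patch, dict):
        return patch  # non-object patches (including arrays) replace the target
    if not isinstance(target, dict):
        target = {}
    result = dict(target)
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "remove this member"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result
```

Applied to a user with an `email` and a `phone`, the patch from the example above yields a user with the new `email` and no `phone` member at all.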

JSON Patch (RFC 6902)

The more powerful approach. Send an array of operations:

PATCH /users/123
Content-Type: application/json-patch+json

[
  { "op": "test", "path": "/version", "value": 5 },
  { "op": "replace", "path": "/email", "value": "new@example.com" },
  { "op": "remove", "path": "/phone" },
  { "op": "add", "path": "/addresses/1", "value": { "city": "NYC" } }
]

Operations: add, remove, replace, move, copy, test.

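The test operation is especially useful. It verifies that a value is what the client expects, and because operations are applied in order and the patch is all-or-nothing, a failed test rejects the entire patch. This works much like optimistic concurrency.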
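To make the all-or-nothing behaviour concrete, here is a reduced sketch that supports only add, remove, replace, and test. The names `apply_patch` and `PatchError` are illustrative, move and copy are omitted, and patching the document root is not supported.

```python
import copy


class PatchError(Exception):
    """Raised when an operation cannot be applied or a test fails."""


def _split_pointer(path):
    """Turn a JSON Pointer such as "/addresses/1" into ["addresses", "1"]."""
    if not path.startswith("/"):
        raise PatchError(f"unsupported path: {path!r}")
    return [t.replace("~1", "/").replace("~0", "~") for t in path[1:].split("/")]


def _locate(doc, path, appending=False):
    """Return (container, key) for the last token of the path."""
    tokens = _split_pointer(path)
    container = doc
    for token in tokens[:-1]:
        container = container[int(token)] if isinstance(container, list) else container[token]
    last = tokens[-1]
    if isinstance(container, list):
        index = len(container) if (appending and last == "-") else int(last)
        if index < 0 or index > len(container):
            raise PatchError(f"index out of range in {path!r}")
        return container, index
    return container, last


def apply_patch(document, operations):
    """Apply the operations to a deep copy; the input is never modified."""
    doc = copy.deepcopy(document)
    for op in operations:
        kind, path = op["op"], op["path"]
        try:
            container, key = _locate(doc, path, appending=(kind == "add"))
            if kind == "add":
                if isinstance(container, list):
                    container.insert(key, op["value"])
                else:
                    container[key] = op["value"]
            elif kind == "remove":
                del container[key]
            elif kind == "replace":
                container[key]  # lookup fails if the target does not exist yet
                container[key] = op["value"]
            elif kind == "test":
                if container[key] != op["value"]:
                    raise PatchError(f"test failed at {path!r}")
            else:
                raise PatchError(f"unsupported op: {kind!r}")
        except (KeyError, IndexError, ValueError, TypeError) as exc:
            raise PatchError(f"cannot apply {kind!r} at {path!r}") from exc
    return doc
```

Because the work happens on a copy, a failure partway through leaves the stored document untouched, which is the atomicity the format expects.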

Which to Use?

| Format | Simplicity | Power | Null Handling | Use When |
|---|---|---|---|---|
| Merge Patch | Very simple | Limited | null = delete | Simple field updates |
| JSON Patch | More complex | Full | Explicit operations | Complex mutations, arrays, conditions |

For interviews, merge patch is usually sufficient. Mention JSON Patch if the interviewer asks about complex partial updates or array manipulation.


Don't Mirror Your Database

This is a principle from Microsoft's API design guide that's worth calling out explicitly.

The anti-pattern:

# Database tables: users, user_addresses, user_preferences
GET /users
GET /user_addresses?user_id=123
GET /user_preferences?user_id=123

Why it's wrong: Your API is a leaky abstraction of your database schema. If you rename a table or split it, every client breaks. You're also exposing internal structure that attackers can exploit.

The fix: Design the API around business entities, not database tables:

GET /users/123
{
  "id": 123,
  "name": "Jane Doe",
  "address": { "city": "NYC", "zip": "10001" },
  "preferences": { "theme": "dark", "notifications": true }
}

The API is an abstraction layer between clients and your database. You're free to change your storage implementation without breaking the API contract.
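One way to picture that abstraction layer is a small mapping function between storage rows and the public resource. The row shapes and column names below (`first_name`, `postal_code`, `pref_key`, and so on) are invented for illustration.

```python
def build_user_resource(user_row, address_row, preference_rows):
    """Compose the public /users/{id} representation from three storage rows.

    All column names here are hypothetical; renaming them only changes this
    function, never the response shape clients depend on.
    """
    return {
        "id": user_row["id"],
        "name": f'{user_row["first_name"]} {user_row["last_name"]}',
        "address": {"city": address_row["city"], "zip": address_row["postal_code"]},
        "preferences": {row["pref_key"]: row["pref_value"] for row in preference_rows},
    }
```

If the preferences table were later split or moved to another store, only this function would need to change.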


OpenAPI / Swagger

In production, APIs are documented using the OpenAPI Specification (formerly Swagger). It's a machine-readable YAML/JSON file that describes your entire API:

openapi: 3.0.0
info:
  title: Ticketmaster API
  version: 1.0.0
paths:
  /events:
    get:
      summary: List events
      parameters:
        - name: city
          in: query
          schema:
            type: string
      responses:
        '200':
          description: List of events
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/Event'
components:
  schemas:
    Event:  # minimal schema so the $ref above resolves; fields follow this chapter's examples
      type: object
      properties:
        id:
          type: integer
        name:
          type: string

Why it matters:

  • Auto-generated documentation: tools like Swagger UI create interactive docs from the spec
  • Client SDK generation: generate TypeScript, Python, Java clients automatically
  • Contract-first development: teams agree on the API spec before writing code
  • Testing: validate requests/responses against the spec

You don't need to write OpenAPI specs in interviews, but mentioning it shows you know how APIs are documented and maintained in real organizations.

Interview Tip

If asked "how would you document this API?", saying "I'd define an OpenAPI spec that teams can use to auto-generate client SDKs and interactive documentation" is a strong answer.


Key Takeaways

| Pattern | When to Use | Headers / Status Codes |
|---|---|---|
| Async operations | Video processing, report generation, bulk tasks | 202 Accepted → poll → 303 See Other |
| Optimistic concurrency | Collaborative editing, preventing lost updates | If-Match → 412 Precondition Failed |
| HTTP caching | Reducing redundant data transfer | Cache-Control → If-None-Match → 304 Not Modified |
| Content negotiation | Supporting multiple response formats | Accept header → 406 / 415 |
| JSON Patch | Complex partial updates | PATCH with application/json-patch+json |
| OpenAPI | API documentation and contracts | N/A (it's a spec, not a runtime pattern) |

Interview Expectations: Junior vs. Senior

  • Junior/Mid-level: Designs synchronous APIs entirely. Mentions caching via Redis but might not explain HTTP caching headers (Cache-Control with max-age, ETag).
  • Senior/Staff: Proactively makes heavy/slow operations asynchronous (HTTP 202 Accepted + Polling/Webhooks). Uses Optimistic Concurrency Control (ETags) or Pessimistic Locking to handle write conflicts. Discusses CDN caching strategies for read-heavy public APIs.