Async, Caching, and Concurrency

TL;DR

Not every API call completes instantly. Long-running operations use 202 Accepted with a status polling endpoint. ETags enable optimistic concurrency (preventing lost updates) and HTTP caching (avoiding redundant downloads). These are production-critical patterns that separate toy APIs from real ones.

Async operation polling pattern

Asynchronous Operations

The Problem: Long-Running Requests

Some operations take seconds, minutes, or even hours to complete. Think about:

Video processing: User uploads a video, server needs to transcode it into multiple resolutions
Report generation: User requests a CSV export of 10 million records
Bulk operations: Admin triggers a mass email to 500,000 users
Payment processing: Credit card charge needs to go through fraud detection, bank approval, and settlement

If you make the client wait synchronously, the HTTP connection will time out, the user will think it's broken, and they'll retry — potentially triggering the operation twice.

The Pattern: 202 Accepted + Status Polling

The solution is a two-step dance:

Step 1: Accept the request and return immediately

POST /reports
{
  "type": "sales_summary",
  "date_range": { "from": "2024-01-01", "to": "2024-12-31" }
}

HTTP/1.1 202 Accepted
Location: /jobs/report-abc123
Content-Type: application/json

{
  "job_id": "report-abc123",
  "status": "accepted",
  "status_url": "/jobs/report-abc123",
  "estimated_completion": "2024-03-15T14:35:00Z"
}

Key elements: - 202 Accepted — not 200 OK or 201 Created. The request was accepted for processing, not completed. - Location header — points to a status endpoint the client can poll. - Estimated completion — optional but helpful for UX (progress bars, "check back in 5 minutes").

Step 2: Client polls the status endpoint

GET /jobs/report-abc123

While in progress:

HTTP/1.1 200 OK
{
  "job_id": "report-abc123",
  "status": "processing",
  "progress": 65,
  "started_at": "2024-03-15T14:30:00Z"
}

When complete:

HTTP/1.1 303 See Other
Location: /reports/final-xyz789

{
  "job_id": "report-abc123",
  "status": "completed",
  "result_url": "/reports/final-xyz789"
}

The 303 See Other with a Location header redirects the client to the actual result. Alternatively, some APIs return 200 with the result embedded directly.

If the operation fails:

HTTP/1.1 200 OK
{
  "job_id": "report-abc123",
  "status": "failed",
  "error": {
    "code": "DATA_TOO_LARGE",
    "message": "Report exceeds maximum size. Try narrowing the date range.",
    "retryable": true
  }
}

Alternatives to Polling

Polling works but wastes requests. Better options exist:

Approach	How It Works	Best For
Polling	Client periodically calls GET on status URL	Simple, works everywhere
Webhooks	Server calls a client-provided URL when done	Server-to-server, Twilio/Stripe style
WebSockets	Server pushes status updates over persistent connection	Real-time UIs, progress bars
SSE	Server streams status updates over HTTP	Browser-friendly progress updates

Interview Tip

Mentioning the 202 + polling pattern for operations like "generate a report" or "process a video upload" shows you've built production systems. Most candidates only think in synchronous request-response.

Real-World Example: AWS S3 Multipart Upload

AWS S3's multipart upload is a great example of async API design:

POST /uploads → initiates upload, returns upload_id
PUT /uploads/{upload_id}/parts/{part_number} → upload each chunk
POST /uploads/{upload_id}/complete → finalize the upload
Server processes and assembles parts asynchronously

Each step is its own API call, the client can retry individual parts, and the overall operation is resilient to network failures.

ETags and Optimistic Concurrency

The Problem: Lost Updates

Imagine two editors working on the same document through an API:

Alice: GET /documents/1 → sees version with "Hello"
Bob: GET /documents/1 → sees version with "Hello"
Alice: PUT /documents/1 → changes to "Hello World"
Bob: PUT /documents/1 → changes to "Hello Bob"

Bob's update overwrites Alice's change without knowing it existed. This is the lost update problem.

The Solution: ETags (Entity Tags)

An ETag is an opaque string that represents a specific version of a resource. Think of it as a fingerprint — it changes whenever the resource changes.

GET /documents/1

HTTP/1.1 200 OK
ETag: "a1b2c3d4"
Content-Type: application/json

{
  "id": 1,
  "title": "My Document",
  "content": "Hello"
}

The ETag "a1b2c3d4" is typically a hash of the resource content or a version number. The client caches this ETag.

Optimistic Concurrency with If-Match

When updating, the client sends the ETag it received using the If-Match header:

PUT /documents/1
If-Match: "a1b2c3d4"
Content-Type: application/json

{
  "title": "My Document",
  "content": "Hello World"
}

The server checks: does the current ETag match "a1b2c3d4"?

If yes — the resource hasn't changed since the client last read it. Safe to update.

HTTP/1.1 200 OK
ETag: "e5f6g7h8"

{
  "id": 1,
  "title": "My Document",
  "content": "Hello World"
}

If no — someone else modified it. Return 412 Precondition Failed.

HTTP/1.1 412 Precondition Failed
Content-Type: application/json

{
  "error": {
    "code": "CONFLICT",
    "message": "The resource was modified by another client. Please re-fetch and retry.",
    "current_etag": "x9y0z1a2"
  }
}

Now the client knows it has stale data. It can re-fetch, show the user the conflict, and retry.

Why "Optimistic"?

It's called optimistic concurrency because it assumes conflicts are rare. Instead of locking the resource before editing (pessimistic), it lets everyone read and write freely, then detects conflicts at write time.

This is perfect for APIs because: - No locks — no risk of deadlocks or forgotten locks - High throughput — reads are never blocked - Stateless — the server doesn't track who's editing what - Conflict detection — you find out immediately if something went wrong

ETag Types

Type	How It's Generated	Use Case
Strong ETag	Hash of entire resource content (`"a1b2c3"`)	Byte-for-byte identity, safe for Range requests
Weak ETag	Semantic version (`W/"v42"`)	Equivalent content, may differ in formatting

Most APIs use strong ETags for simplicity.

HTTP Caching

The Problem: Redundant Data Transfer

A mobile app showing event details might call GET /events/123 every time the user opens the screen. If the event hasn't changed, the server is doing work and sending bytes for nothing.

Cache-Control Headers

The server tells clients how to cache responses using the Cache-Control header:

GET /events/123

HTTP/1.1 200 OK
Cache-Control: max-age=600, private
Content-Type: application/json
ETag: "abc123"

{ "id": 123, "name": "Taylor Swift Concert", ... }

Directive	Meaning
`max-age=600`	Cache this response for 600 seconds (10 minutes)
`private`	Only the requesting client can cache it (not CDNs/proxies)
`public`	CDNs and proxies can cache it too
`no-store`	Never cache this response at all
`no-cache`	You can cache it, but must revalidate with the server before using it

Common Gotcha

no-cache does NOT mean "don't cache." It means "always check with the server first." Use no-store if you truly want no caching (e.g., for sensitive data).

Conditional Requests with If-None-Match

After the cached response expires, the client can check if the resource actually changed using the ETag:

GET /events/123
If-None-Match: "abc123"

If the resource hasn't changed:

HTTP/1.1 304 Not Modified
ETag: "abc123"
Cache-Control: max-age=600, private

No response body — the client uses its cached copy. This saves bandwidth and server processing time.

If the resource has changed:

HTTP/1.1 200 OK
ETag: "def456"
Cache-Control: max-age=600, private

{ "id": 123, "name": "Taylor Swift Concert - SOLD OUT", ... }

New data with a new ETag.

Caching Strategy by Resource Type

Resource Type	Cache Strategy	Why
Static assets (images, CSS)	`public, max-age=31536000`	Rarely change, safe to cache aggressively
User-specific data	`private, max-age=300`	Only for this user, moderate freshness
Real-time data (stock prices)	`no-store`	Stale data is dangerous
Event listings	`public, max-age=60`	Changes sometimes, CDN-cacheable
Auth tokens	`no-store`	Never cache sensitive credentials

Interview Tip

Mentioning Cache-Control headers and ETags shows you think about performance at the HTTP layer, not just application logic. This is especially relevant for CDN-heavy architectures.

Content Negotiation

When clients and servers need to agree on data format, they use content negotiation headers.

How It Works

The client says what it can accept:

GET /events/123
Accept: application/json, application/xml;q=0.9

The q value (0-1) indicates preference. JSON is preferred (q=1.0 by default), XML is acceptable (q=0.9).

The server responds in the best matching format:

HTTP/1.1 200 OK
Content-Type: application/json

{ "id": 123, "name": "Concert" }

When Negotiation Fails

Status Code	Meaning
406 Not Acceptable	Server can't produce any format the client accepts
415 Unsupported Media Type	Server doesn't understand the format the client sent

POST /events
Content-Type: application/xml    ← Server only accepts JSON

HTTP/1.1 415 Unsupported Media Type
{
  "error": "This API only accepts application/json"
}

In Practice

Most modern APIs only support JSON, making content negotiation less relevant than it once was. But it still matters for:

APIs that support both JSON and XML (enterprise/government systems)
APIs that serve different representations (HTML for browsers, JSON for APIs)
Binary resources (images in different formats: JPEG, PNG, WebP)

JSON Patch vs JSON Merge Patch

When we talked about PATCH in Chapter 2, we said it does "partial updates." But how does the client describe those partial updates? There are two standard formats.

JSON Merge Patch (RFC 7396)

The simpler approach. Send a JSON object with only the fields you want to change:

PATCH /users/123
Content-Type: application/merge-patch+json

{
  "email": "new@example.com",
  "phone": null
}

This sets email to the new value and deletes phone (null = remove). Fields not included are unchanged.

Limitation: You can't set a field to null without deleting it, because null means "remove." If your data model uses null values meaningfully, merge patch won't work.

JSON Patch (RFC 6902)

The more powerful approach. Send an array of operations:

PATCH /users/123
Content-Type: application/json-patch+json

[
  { "op": "replace", "path": "/email", "value": "new@example.com" },
  { "op": "remove", "path": "/phone" },
  { "op": "add", "path": "/addresses/1", "value": { "city": "NYC" } },
  { "op": "test", "path": "/version", "value": 5 }
]

Operations: add, remove, replace, move, copy, test.

The test operation is especially useful — it checks a precondition before applying changes, similar to optimistic concurrency.

Which to Use?

Format	Simplicity	Power	Null Handling	Use When
Merge Patch	Very simple	Limited	null = delete	Simple field updates
JSON Patch	More complex	Full	Explicit operations	Complex mutations, arrays, conditions

For interviews, merge patch is usually sufficient. Mention JSON Patch if the interviewer asks about complex partial updates or array manipulation.

Don't Mirror Your Database

This is a principle from Microsoft's API design guide that's worth calling out explicitly.

The anti-pattern:

# Database tables: users, user_addresses, user_preferences
GET /users
GET /user_addresses?user_id=123
GET /user_preferences?user_id=123

Why it's wrong: Your API is a leaky abstraction of your database schema. If you rename a table or split it, every client breaks. You're also exposing internal structure that attackers can exploit.

The fix: Design the API around business entities, not database tables:

GET /users/123
{
  "id": 123,
  "name": "Jane Doe",
  "address": { "city": "NYC", "zip": "10001" },
  "preferences": { "theme": "dark", "notifications": true }
}

The API is an abstraction layer between clients and your database. You're free to change your storage implementation without breaking the API contract.

OpenAPI / Swagger

In production, APIs are documented using the OpenAPI Specification (formerly Swagger). It's a machine-readable YAML/JSON file that describes your entire API:

openapi: 3.0.0
info:
  title: Ticketmaster API
  version: 1.0.0
paths:
  /events:
    get:
      summary: List events
      parameters:
        - name: city
          in: query
          schema:
            type: string
      responses:
        '200':
          description: List of events
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/Event'

Why it matters: - Auto-generated documentation — tools like Swagger UI create interactive docs from the spec - Client SDK generation — generate TypeScript, Python, Java clients automatically - Contract-first development — teams agree on the API spec before writing code - Testing — validate requests/responses against the spec

You don't need to write OpenAPI specs in interviews, but mentioning it shows you know how APIs are documented and maintained in real organizations.

Interview Tip

If asked "how would you document this API?", saying "I'd define an OpenAPI spec that teams can use to auto-generate client SDKs and interactive documentation" is a strong answer.

Key Takeaways

Pattern	When to Use	Status Code
Async operations	Video processing, report generation, bulk tasks	202 Accepted → poll → 303 See Other
Optimistic concurrency	Collaborative editing, preventing lost updates	If-Match → 412 Precondition Failed
HTTP caching	Reducing redundant data transfer	Cache-Control → If-None-Match → 304 Not Modified
Content negotiation	Supporting multiple response formats	Accept header → 406 / 415
JSON Patch	Complex partial updates	PATCH with application/json-patch+json
OpenAPI	API documentation and contracts	N/A — it's a spec, not a runtime pattern

Interview Expectations: Junior vs. Senior

Junior/Mid-level: Designs synchronous APIs entirely. Mentions caching via Redis but might not explain cache-control headers (ETag, max-age).
Senior/Staff: Proactively makes heavy/slow operations asynchronous (HTTP 202 Accepted + Polling/Webhooks). Uses Optimistic Concurrency Control (ETags) or Pessimistic Locking to handle write conflicts. Discusses CDN caching strategies for read-heavy public APIs.