The Problem with Polling
TL;DR
HTTP is request-response: the server literally cannot speak first. Polling fakes real-time by hammering the server with "anything new?" requests. It works, but the math is brutal — 1M users polling every 5 seconds is 200K req/s of mostly empty responses. Long polling improves things by holding the connection open, but you're still paying one pending request per client. When you need true real-time, you need push.
HTTP's Fundamental Limitation
Here's the thing most people forget: HTTP was designed for documents, not conversations.
The protocol is strictly request-response. The client speaks, the server answers, and the exchange is over. There is no mechanism for the server to wake up and say "hey, something changed." It's like a walkie-talkie where only one side has the talk button.
Client: "Got anything new?"
Server: "Nope."
Client: "Got anything new?"
Server: "Nope."
Client: "Got anything new?"
Server: "Yes! Here's a message from 4.8 seconds ago."
That delay between the event happening and the client discovering it? It can be as long as the polling interval, and it's the core tradeoff of every polling strategy.
Simple Polling — The Brute Force Approach
The simplest solution: set a timer, ask the server on every tick.
```javascript
// The most expensive setInterval you'll ever write
setInterval(async () => {
  const res = await fetch('/api/notifications');
  const data = await res.json();
  if (data.length > 0) {
    renderNotifications(data);
  }
}, 5000); // every 5 seconds
```

It works. Now let's talk about why it doesn't scale.
The Math That Kills Polling
Let's say you're building a chat app. You have 1 million connected users, each polling every 5 seconds.
| Metric | Value |
|---|---|
| Users | 1,000,000 |
| Poll interval | 5 seconds |
| Requests per second | 200,000 req/s |
| Responses with actual data | ~1-5% |
| Wasted requests | 190,000-198,000 req/s |
| Bandwidth per empty response | ~200 bytes (headers alone) |
| Wasted bandwidth | ~38 MB/s |
That's 200K queries per second against your database, 95%+ of which return nothing. You're paying for servers, bandwidth, and database IOPS to answer "no" two hundred thousand times a second.
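The arithmetic in the table can be reproduced with a short sketch. The function name `polling_cost` and its parameters are illustrative, and the 5% hit rate and 200-byte empty response are the same rough assumptions the table uses, not measured values.

```python
def polling_cost(users, interval_s, hit_rate, empty_response_bytes):
    """Estimate steady-state load for simple polling (illustrative only)."""
    requests_per_s = users / interval_s
    # Every poll that finds nothing new counts as wasted work.
    wasted_per_s = requests_per_s * (1 - hit_rate)
    wasted_bytes_per_s = wasted_per_s * empty_response_bytes
    return {
        "requests_per_s": requests_per_s,
        "wasted_per_s": wasted_per_s,
        "wasted_mb_per_s": wasted_bytes_per_s / 1_000_000,
    }


cost = polling_cost(users=1_000_000, interval_s=5, hit_rate=0.05,
                    empty_response_bytes=200)
print(cost)
```

Plugging in a 1% hit rate instead gives the upper end of the wasted-request range in the table.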
The Hidden Cost
Each poll request isn't free even when empty. HTTP headers alone are 200-800 bytes. TLS handshake if connections aren't reused. Load balancer processing. Auth token verification. Database query (even if it returns zero rows). Multiply by 200K/s and you have a very expensive way to say "nothing happened."
And the kicker? You still have up to 5 seconds of latency. A message sent at t=0.1s won't be seen until the next poll at t=5.0s. Shorten the interval to reduce latency and you multiply the load proportionally.
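To make that tradeoff concrete, here is a small sketch that tabulates load against latency for a few intervals. It assumes events arrive uniformly at random between polls, so the average wait is half the interval, and it ignores network and processing time. `tradeoff` is a hypothetical helper name.

```python
def tradeoff(users, intervals_s):
    """Return one row per interval: request rate plus average and worst-case wait."""
    rows = []
    for interval in intervals_s:
        rows.append({
            "interval_s": interval,
            "requests_per_s": users / interval,
            "avg_latency_s": interval / 2,  # assumes uniformly timed events
            "max_latency_s": interval,      # event lands just after a poll
        })
    return rows


for row in tradeoff(1_000_000, [10, 5, 1]):
    print(row)
```

Under these assumptions, dropping from a 5-second to a 1-second interval cuts the worst-case wait fivefold and raises the load from 200K to 1M req/s.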
Long Polling — A Clever Hack
Long polling flips the model: instead of returning immediately, the server holds the request open until there's data or a timeout.
```python
# Server-side long polling endpoint (Flask-style)
import time

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/api/messages/poll')
def long_poll():
    timeout = 30  # seconds
    start = time.time()
    while time.time() - start < timeout:
        # `db` is the application's data-access layer, defined elsewhere
        messages = db.get_new_messages(since=request.args['since'])
        if messages:
            return jsonify(messages)
        time.sleep(0.5)  # check every 500ms
    return jsonify([])  # timeout, client should reconnect
```
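```javascript
// Client-side: reconnect loop
async function longPoll(since) {
  while (true) {
    try {
      const res = await fetch(`/api/messages/poll?since=${encodeURIComponent(since)}`);
      if (!res.ok) throw new Error(`HTTP ${res.status}`);
      const messages = await res.json();
      if (messages.length > 0) {
        renderMessages(messages);
        since = messages[messages.length - 1].timestamp;
      }
      // Data or an empty timeout response: loop and reconnect immediately
    } catch (err) {
      // On failure, pause before retrying so a struggling server isn't hammered
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
}
```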
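The server example above re-queries the database every 500 ms for each held request. One alternative, sketched here under assumptions, is to park each waiting request on an in-process queue and wake it only when a message is published. This uses only Python's standard `asyncio`. The `Mailbox` class and its methods are hypothetical names, and the single-process, in-memory queue stands in for whatever pub/sub layer a real deployment would use.

```python
import asyncio


class Mailbox:
    """In-memory, single-process stand-in for one client's message feed."""

    def __init__(self):
        self._queue = asyncio.Queue()

    def publish(self, message):
        # Wakes a waiting long-poll request, if there is one.
        self._queue.put_nowait(message)

    async def long_poll(self, timeout_s):
        """Wait until a message arrives or the timeout expires."""
        try:
            first = await asyncio.wait_for(self._queue.get(), timeout=timeout_s)
        except asyncio.TimeoutError:
            return []  # timeout: the client should reconnect
        messages = [first]
        # Drain anything else that is already queued.
        while not self._queue.empty():
            messages.append(self._queue.get_nowait())
        return messages


async def demo():
    box = Mailbox()
    # Publish shortly after the poll starts waiting.
    asyncio.get_running_loop().call_later(0.05, box.publish, "hello")
    delivered = await box.long_poll(timeout_s=1.0)
    timed_out = await box.long_poll(timeout_s=0.05)
    return delivered, timed_out


delivered, timed_out = asyncio.run(demo())
print(delivered, timed_out)
```

The first poll returns as soon as the message is published, and the second returns an empty list once its timeout elapses. The held request still exists for the full wait, so the per-client connection cost described in this section does not go away.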

Why It's Better
- Near-instant delivery: when data arrives, the response fires immediately
- Fewer empty responses: most responses carry actual data
- Simple: works with any HTTP infrastructure, no special protocols
Why It Still Hurts
| Problem | Impact |
|---|---|
| One held connection per client | 1M users = 1M pending HTTP requests |
| Server thread/process per connection | Blocks a worker in thread-per-request servers |
| Thundering herd | If server restarts, all clients reconnect simultaneously |
| Load balancer timeouts | Many LBs kill connections at 30-60s, causing spurious reconnects |
| No multiplexing | Each "channel" you want to listen to needs its own long-poll |
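One common mitigation for the thundering herd row is to randomize reconnect delays so clients spread out after a restart. The sketch below uses exponential backoff with full jitter. `reconnect_delay` and its default values are illustrative assumptions, not taken from any particular client library.

```python
import random


def reconnect_delay(attempt, base_s=1.0, cap_s=30.0, rng=random):
    """Pick a random delay in [0, min(cap_s, base_s * 2**attempt)] seconds."""
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng.uniform(0, ceiling)


# Delays for the first five failed reconnect attempts of one client.
delays = [reconnect_delay(attempt) for attempt in range(5)]
print([round(d, 2) for d in delays])
```

Because each client draws its own random delay, reconnects after a restart arrive spread over the window instead of in a single spike.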
Interview Tip
When an interviewer says "how would you make this real-time?" don't jump straight to WebSockets. Start with polling, explain why it breaks at scale, walk through long polling, and then motivate the move to push protocols. This shows you understand the tradeoff space, not just the final answer.
When Polling Is Actually Fine
Polling gets a bad reputation, but it's the right call more often than people think.
Polling wins when:
- Low frequency updates: a dashboard that refreshes every 60 seconds doesn't need push infrastructure
- Small user base: 500 users polling every 10s is 50 req/s — any server handles that
- Stateless simplicity: no connection state to manage, no reconnection logic, no sticky sessions
- Cache-friendly data: if the response is the same for all users, slap a CDN in front and polling costs almost nothing
- Backend jobs: checking whether a report has finished generating; poll with exponential backoff, starting at 1s
```python
import asyncio

# Exponential backoff makes simple polling surprisingly effective
intervals = [1, 2, 4, 8, 16, 30]  # seconds

async def poll_with_backoff(job_id):
    # check_job and get_result are application-specific coroutines
    for interval in intervals:
        status = await check_job(job_id)
        if status == 'complete':
            return await get_result(job_id)
        await asyncio.sleep(interval)
    raise TimeoutError("Job took too long")
```
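The hard-coded interval list above follows a doubling schedule with a cap. A hedged sketch of how such a schedule could be generated, and how long it waits in total, is shown here. `backoff_schedule` is a hypothetical helper.

```python
def backoff_schedule(base_s=1, cap_s=30, steps=6):
    """Doubling delays starting at base_s, each clamped to cap_s."""
    return [min(cap_s, base_s * 2 ** n) for n in range(steps)]


schedule = backoff_schedule()
total_wait_s = sum(schedule)
print(schedule, total_wait_s)
```

With these defaults the schedule matches the list above and sleeps for 61 seconds in total, so the job gets six status checks across roughly a minute before the timeout is raised.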
When You Need Push
The moment any of these are true, polling starts costing more than it's worth:
| Signal | Example | Why Polling Fails |
|---|---|---|
| Sub-second latency required | Chat messages, live cursors | Poll interval directly adds latency |
| High connection count | >10K concurrent users | 10K users at a 5s interval is already 2K req/s of mostly empty responses |
| High event frequency | Stock tickers, game state | Poll interval can't keep up |
| Bidirectional communication | Collaborative editing | Polling only pulls; client-to-server updates need separate requests |
| Battery/bandwidth constraints | Mobile apps | Constant polling drains battery |
Slack's early architecture started with long polling and migrated to WebSockets as they scaled past tens of thousands of concurrent connections per workspace. The polling infrastructure was consuming more compute than the actual message processing.
The Spectrum of Real-Time
Think of it as a gradient, not a binary choice:

Simple polling → Long polling → SSE → WebSockets → WebRTC

Each step to the right buys you lower latency and more capability but adds operational complexity. The next lesson covers SSE, WebSockets, and WebRTC — the actual push protocols that solve the problems we've outlined here.
Key Takeaways
| Concept | Remember |
|---|---|
| HTTP can't push | Server literally cannot initiate communication |
| Simple polling math | Users / interval = req/s, 95%+ are wasted |
| Long polling | Holds connection until data or timeout; better but still 1 request per client |
| Polling is fine for | Low-frequency, small-scale, cache-friendly scenarios |
| Push is needed for | Sub-second latency, high concurrency, bidirectional communication |
| Exponential backoff | Makes simple polling viable for job status checks |
Common Interview Mistake
Don't say "polling doesn't scale." It scales fine for the right use cases. The real answer is: "polling's per-request overhead makes it cost-prohibitive when you need low-latency delivery to many concurrent users." Precision matters.