Design a Social Feed System
TL;DR
A social feed assembles a personalized timeline of posts from people you follow, sorted by relevance or recency. The core architectural decision is fan-out on write (pre-compute feeds when a post is created) vs. fan-out on read (assemble the feed when a user opens the app). Neither works alone. The "celebrity problem" -- where a user with 500 million followers creates a post and you need to write 500 million feed entries -- is the defining constraint. Twitter, Facebook, and Instagram all landed on a hybrid approach: fan-out on write for normal users, fan-out on read for celebrities. Understanding why the hybrid exists, and where the boundary is, separates a textbook answer from a real one.
The System
When you open Instagram or Twitter, the first thing you see is a feed -- a chronologically or algorithmically ordered list of posts from people you follow, mixed with recommended content. Building this feed at scale is one of the hardest engineering problems in social media.
Twitter (now X) processes approximately 500 million tweets per day across 400 million monthly active users. The median user follows ~200 accounts, but the distribution is extremely skewed: most users follow 50-300 accounts, while some follow 5,000+. On the author side, the average user has a few hundred followers, but celebrities can have 100+ million. Lady Gaga has 84 million followers on Twitter. When she tweets, that post needs to appear in 84 million feeds. Facebook's News Feed serves 2+ billion users, each with a personalized feed assembled from thousands of potential stories. Instagram processes over 100 million photos uploaded per day, each needing to land in the feeds of the uploader's followers.
Requirements
Functional
- Post creation: A user creates a post (text, image, video link) that should appear in their followers' feeds
- Feed retrieval: A user opens the app and sees a feed of recent posts from people they follow, ordered by time (or relevance)
- Pagination: Feed supports infinite scroll with cursor-based pagination (not page numbers)
- Follow/unfollow: Users can follow/unfollow others, which immediately affects their feed content
- Interactions: Users can like, comment, and share posts. These interactions may influence feed ranking but do not change who sees the post
Non-Functional
- Feed latency: Feed loads in under 200ms at p99 (users abandon apps that feel slow)
- Post visibility: A new post appears in followers' feeds within 5 seconds (fan-out on write) or at next feed refresh (fan-out on read)
- Throughput: 500M posts/day = ~5,800 posts/sec. Each post fans out to an average of 200 followers = 1.16M feed writes/sec
- Feed freshness: The feed should reflect recent posts. A feed that is 10 minutes stale is fine for algorithmic feeds; a chronological feed should be within 5 seconds
- Availability: 99.99% uptime. A down feed = a down app
Back-of-Envelope Math
Post volume:
500M posts/day = 5,800 posts/sec
Peak (2x): ~11,600 posts/sec
Fan-out writes (fan-out on write):
Average followers per user: 200
5,800 posts/sec * 200 followers = 1,160,000 feed writes/sec
Peak: 2.32M feed writes/sec
Celebrity post fan-out:
Lady Gaga posts: 84M followers
At 100K writes/sec per Redis shard: 84M / 100K = 840 seconds = 14 minutes
Just for ONE post from ONE celebrity. This is why pure fan-out on write fails.
Feed storage (per user):
Feed = list of post IDs (not full posts)
Each post ID: 8 bytes
Feed size: 500 post IDs * 8 bytes = 4 KB per user
500M users * 4 KB = 2 TB total feed storage in Redis
Feed read:
Active users reading feed: 200M daily active users
Average feed reads per user per day: 10
2B feed reads/day = 23,148 reads/sec
Peak: ~46K reads/sec
Each read: fetch 20 post IDs from Redis + hydrate with post data
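The estimates above can be reproduced in a few lines. This is a minimal sketch: the variable names are illustrative, the inputs are the figures quoted in this section, and 100K writes/sec is treated as the total fan-out write rate, as in the celebrity calculation above.

```python
SECONDS_PER_DAY = 86_400


def per_second(per_day: float) -> float:
    """Convert a daily volume into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


posts_per_sec = per_second(500e6)               # ~5,787 posts/sec
feed_writes_per_sec = posts_per_sec * 200       # ~1.16M feed writes/sec
celebrity_fanout_sec = 84e6 / 100e3             # 840 seconds (14 minutes)
feed_bytes_per_user = 500 * 8                   # 4,000 bytes (~4 KB)
total_feed_bytes = 500e6 * feed_bytes_per_user  # 2e12 bytes (2 TB)
feed_reads_per_sec = per_second(200e6 * 10)     # ~23,148 reads/sec

print(f"posts/sec:        {posts_per_sec:,.0f}")
print(f"feed writes/sec:  {feed_writes_per_sec:,.0f}")
print(f"celebrity fanout: {celebrity_fanout_sec / 60:.0f} min")
print(f"feed storage:     {total_feed_bytes / 1e12:.1f} TB")
print(f"feed reads/sec:   {feed_reads_per_sec:,.0f}")
```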
The Naive Design
┌──────────┐ ┌──────────────┐ ┌──────────────┐
│ Client │────>│ API Server │────>│ Database │
│ │<────│ │<────│ (MySQL) │
└──────────┘ └──────────────┘ └──────────────┘
-- User A creates a post:
INSERT INTO posts (user_id, content, created_at) VALUES (A, '...', NOW());
-- User B opens their feed:
SELECT p.* FROM posts p
JOIN follows f ON f.following_id = p.user_id
WHERE f.follower_id = B
ORDER BY p.created_at DESC
LIMIT 20;
This is pure fan-out on read. When User B opens their feed, the database joins the follows table with the posts table to find all recent posts from people B follows. Simple, correct, no pre-computation needed.
Where Does This Break First?
The JOIN. If User B follows 500 accounts and each account has posted 1,000 times, the database scans 500,000 rows, sorts them by time, and returns the top 20. At 46K reads/sec with this query, the database dies. The query plan cannot be optimized because it requires reading from N different user partitions and merging results.
Where It Breaks
Problem 1: Fan-out on read is too slow at scale. The feed query joins follows with posts for every user on every feed load. With 500 following and 1,000 posts each, that is 500K rows sorted per query. At 46K queries/sec, the database is doing 23 billion row operations per second. No amount of indexing saves you.
Problem 2: Fan-out on write is too slow for celebrities. If you pre-compute feeds (write each post to every follower's feed list when it is created), a celebrity post with 84M followers takes 14 minutes to fan out at 100K writes/sec. During those 14 minutes, 84M users do not see the post. And if 5 celebrities tweet in the same minute, you have 5 * 84M = 420M pending writes clogging your write pipeline while normal users' posts get delayed.
Problem 3: Memory for pre-computed feeds. Storing 500M users * 500 post IDs * 8 bytes = 2 TB. This fits in Redis, but just barely. If you want to store more post IDs per user (for deeper scrolling) or add metadata (timestamps, interaction counts), memory requirements grow rapidly.
Problem 4: Feed consistency on follow/unfollow. User B follows User A. B's pre-computed feed does not include A's recent posts. Should you backfill A's posts into B's feed? If yes, how far back? If you backfill 1,000 posts for every follow action, and follow/unfollow happens 100K times/sec, you are writing 100M post IDs/sec just for follow operations.
The Real Design
┌─────────────────────────────────────┐
│ Post Service │
│ (accepts new posts, stores in DB) │
└──────────────┬──────────────────────┘
│
v
┌─────────────────────────────────────┐
│ Fan-out Service │
│ ┌──────────────────────────────┐ │
│ │ Is author a celebrity? │ │
│ │ YES: skip fan-out (read path)│ │
│ │ NO: fan-out to followers │ │
│ └──────────────────────────────┘ │
└──────────────┬──────────────────────┘
│ (normal users only)
v
┌─────────────────────────────────────┐
│ Redis Feed Cache │
│ user:B:feed -> [post_id7, post_id6, │
│ post_id5, post_id4] │
└─────────────────────────────────────┘
Feed Read Path:
┌──────────┐ ┌──────────────┐
│ Client │────>│ Feed Service │
└──────────┘ └──────┬───────┘
│
┌────────────┼────────────────┐
│ │ │
┌────────v──┐ ┌─────v─────┐ ┌───────v──────┐
│ Redis │ │ Celebrity │ │ Post │
│ Feed │ │ Post │ │ Service │
│ Cache │ │ Fetch │ │ (hydrate) │
│ (IDs) │ │ (on read) │ │ │
└───────────┘ └───────────┘ └──────────────┘
│ │ │
└────────────┼────────────────┘
│ merge + sort
v
┌──────────────┐
│ Merged Feed │
│ (hydrated) │
└──────────────┘
Fan-Out on Write vs. Read vs. Hybrid
Fan-out on write (push model)
When a user creates a post, the system writes the post ID to every follower's feed list in Redis.
User A posts (A has 200 followers):
LPUSH user:B:feed post_id_123
LTRIM user:B:feed 0 499 # keep only the latest 500 in B's feed
LPUSH user:C:feed post_id_123
LTRIM user:C:feed 0 499
LPUSH user:D:feed post_id_123
LTRIM user:D:feed 0 499
... (repeat for all 200 followers)
Pros: Feed reads are instant (just LRANGE on Redis). Post visibility is limited only by fan-out time. Reads are O(1) -- fetch a pre-computed list.
Cons: Celebrity problem. Write amplification (1 post = N writes). Wasted writes for inactive users (you push to feeds that never get read).
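The push model can be sketched without Redis. In this minimal in-memory version, a capped `deque` stands in for the LPUSH + LTRIM pair; `feeds`, `fan_out_on_write`, and `read_feed` are hypothetical names chosen for illustration.

```python
from collections import defaultdict, deque

FEED_CAP = 500  # same cap as LTRIM 0 499

# One capped feed per user, newest post ID at the left (like LPUSH).
feeds: dict[str, deque] = defaultdict(lambda: deque(maxlen=FEED_CAP))


def fan_out_on_write(post_id: int, follower_ids: list[str]) -> int:
    """Push post_id onto every follower's feed; return the number of writes."""
    for follower in follower_ids:
        # A full deque silently drops its oldest (rightmost) entry.
        feeds[follower].appendleft(post_id)
    return len(follower_ids)


def read_feed(user_id: str, limit: int = 20) -> list[int]:
    """Return the newest `limit` post IDs from the pre-computed feed."""
    return list(feeds[user_id])[:limit]
```

The return value of `fan_out_on_write` makes the write amplification visible: one post costs as many writes as the author has followers.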
Fan-out on read (pull model)
When a user opens their feed, the system queries the database for recent posts from all followed accounts and merges them.
User B opens feed (B follows 200 accounts):
For each followed account:
recent_posts = cache.get("user:{followed_id}:posts") or db.query(...)
Merge and sort by timestamp
Return top 20
Pros: No write amplification. No wasted computation for inactive users. Celebrity posts are no different from normal posts.
Cons: Feed reads are slow (must fetch and merge from 200+ sources). Read latency scales with number of followed accounts.
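The pull model is essentially a k-way merge. A minimal sketch, assuming each author's recent posts are already available as a newest-first list of `(created_at, post_id)` tuples; `posts_by_author` is hypothetical sample data standing in for the per-author cache.

```python
import heapq
from itertools import islice

# Hypothetical per-author post lists, newest first: (created_at, post_id)
posts_by_author = {
    "A": [(105, 5), (101, 1)],
    "C": [(104, 4), (102, 2)],
    "D": [(103, 3)],
}


def fan_out_on_read(followed: list[str], limit: int = 20) -> list[int]:
    """Merge each followed author's newest-first list and keep the top `limit`."""
    streams = (posts_by_author.get(author, []) for author in followed)
    # heapq.merge is lazy; with reverse=True the inputs must be sorted descending.
    merged = heapq.merge(*streams, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]
```

The merge itself is cheap. The cost in a real system is fetching one list per followed account, which is why read latency grows with the number of accounts followed.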
The hybrid approach (what everyone uses)
Celebrity threshold: 500,000 followers (configurable)
Post created by user with < 500K followers:
-> Fan-out on write to all followers' feeds
Post created by user with >= 500K followers:
-> Do NOT fan-out
-> Store in celebrity posts table
-> Followers fetch celebrity posts at feed-read time
When User B reads their feed:
1. Fetch pre-computed feed from Redis (posts from non-celebrity followees): O(1)
2. Fetch recent posts from celebrities B follows (maybe 5-10 celebrities): O(K) where K is number of celebrities followed
3. Merge both lists by timestamp
4. Return top 20
Fetching celebrity posts at read time is fast because each user follows only a handful of celebrities (typically 5-20). Fetching and merging up to 20 post lists is trivial compared to fetching 200.
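Both paths can be combined in one toy model. This is a sketch under stated assumptions, not how any particular platform implements it: `HybridFeed` and its methods are hypothetical, everything lives in memory, and the threshold is a constructor argument so the example can run with tiny follower counts.

```python
import heapq
from collections import defaultdict
from itertools import islice

CELEBRITY_THRESHOLD = 500_000


class HybridFeed:
    def __init__(self, threshold: int = CELEBRITY_THRESHOLD):
        self.threshold = threshold
        self.followers = defaultdict(set)      # author -> follower IDs
        self.pushed = defaultdict(list)        # user -> [(ts, post_id)] written at post time
        self.author_posts = defaultdict(list)  # celebrity -> [(ts, post_id)] pulled at read time

    def follow(self, follower: str, author: str) -> None:
        self.followers[author].add(follower)

    def is_celebrity(self, author: str) -> bool:
        return len(self.followers[author]) >= self.threshold

    def post(self, author: str, ts: int, post_id: int) -> int:
        """Store a post and return the number of feed writes it caused."""
        if self.is_celebrity(author):
            self.author_posts[author].append((ts, post_id))  # no fan-out
            return 0
        for follower in self.followers[author]:
            self.pushed[follower].append((ts, post_id))
        return len(self.followers[author])

    def read(self, user: str, limit: int = 20) -> list[int]:
        """Merge the pre-computed feed with posts from followed celebrities."""
        celebs = [a for a, fs in self.followers.items()
                  if user in fs and self.is_celebrity(a)]
        streams = [sorted(self.pushed[user], reverse=True)]
        streams += [sorted(self.author_posts[a], reverse=True) for a in celebs]
        merged = heapq.merge(*streams, reverse=True)
        return [post_id for _, post_id in islice(merged, limit)]
```

With `threshold=2`, an author with two followers is treated as a celebrity: posting costs zero feed writes, and the post still appears when a follower reads.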
The Celebrity Problem in Detail
Why 500M followers makes fan-out on write impossible:
A hypothetical account with 500M followers posts:
Fan-out to 500M followers at 100K writes/sec = 5,000 seconds = ~83 minutes
During those 83 minutes:
- Many of the account's followers do not see the post (their feeds are stale)
- The write pipeline is saturated, delaying fan-out for ALL other posts
- If the account posts again before the first fan-out finishes, you now have
2 * 500M = 1B pending writes
- If 5 such accounts post in an hour: 5 * 500M = 2.5B pending writes
The threshold between "normal user" and "celebrity" is a business decision, not a technical one. Large social platforms typically set a threshold around 500K-1M followers for switching from fan-out-on-write to fan-out-on-read. Below this threshold, fan-out on write completes in about 10 seconds or less (1M followers / 100K writes/sec = 10 seconds). Above it, the fan-out time is unacceptable.
Inactive user optimization: Even for non-celebrity accounts, fanning out to inactive users is wasteful. If User B has not opened the app in 6 months, pushing post IDs to their Redis feed wastes memory and write bandwidth. Solution: only fan out to users who have been active in the last 7-14 days. When an inactive user returns, generate their feed on demand (fan-out on read for the first load), then resume fan-out on write.
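Both ideas in this section reduce to small helpers. A minimal sketch: `fanout_seconds` reproduces the threshold arithmetic, and `active_followers` applies the inactivity filter. The 14-day window and the `last_seen` mapping are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

ACTIVE_WINDOW = timedelta(days=14)  # upper end of the 7-14 day range above


def fanout_seconds(followers: int, writes_per_sec: int = 100_000) -> float:
    """Time to push one post to every follower at a fixed write rate."""
    return followers / writes_per_sec


def active_followers(last_seen: dict[str, datetime], now: datetime) -> list[str]:
    """Keep only followers seen within ACTIVE_WINDOW; the rest are skipped at fan-out."""
    return [user for user, seen in last_seen.items() if now - seen <= ACTIVE_WINDOW]


now = datetime(2024, 1, 15, tzinfo=timezone.utc)
last_seen = {
    "B": now - timedelta(days=3),    # active: receives the push
    "C": now - timedelta(days=180),  # inactive: feed rebuilt on return
}
print(active_followers(last_seen, now))
print(fanout_seconds(1_000_000), fanout_seconds(500_000_000))
```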
Cursor-Based Pagination
Feeds must support infinite scroll. Offset-based pagination (LIMIT 20 OFFSET 40) breaks for two reasons:
- If new posts are inserted between page loads, items shift. The user sees duplicates or misses posts.
- OFFSET 10000 requires scanning and discarding 10,000 rows. Slow for deep pages.
Cursor-based pagination uses a stable pointer (typically the ID or timestamp of the last item on the current page):
GET /feed?cursor=post_id_456&limit=20
Response:
{
"posts": [...20 posts...],
"next_cursor": "post_id_436",
"has_more": true
}
The cursor is opaque to the client. Internally, it maps to:
SELECT * FROM feed_items
WHERE user_id = B AND post_id < 456 -- numeric ID decoded from the cursor
ORDER BY post_id DESC
LIMIT 20;
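For Redis (pre-computed feeds): a list read with LRANGE needs an index derived from the cursor position, and that index shifts whenever new posts are pushed to the head, so it has the same duplicate problem as offsets. A more stable option is to store feeds as sorted sets with scores equal to post timestamps, and use ZREVRANGEBYSCORE with the cursor as an exclusive max score.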
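The cursor contract can be shown in memory. In this minimal sketch, base64 is just one possible way to make the cursor opaque, and `get_page` is a hypothetical handler that works on an ascending list of post IDs rather than a real store.

```python
import base64
from bisect import bisect_left
from typing import Optional


def encode_cursor(post_id: int) -> str:
    return base64.urlsafe_b64encode(str(post_id).encode()).decode()


def decode_cursor(cursor: str) -> int:
    return int(base64.urlsafe_b64decode(cursor.encode()).decode())


def get_page(feed_ids_asc: list[int], cursor: Optional[str], limit: int = 20) -> dict:
    """Return up to `limit` IDs strictly older than the cursor, newest first.

    feed_ids_asc is the user's feed sorted ascending (oldest first).
    """
    if cursor is None:
        end = len(feed_ids_asc)
    else:
        end = bisect_left(feed_ids_asc, decode_cursor(cursor))
    start = max(0, end - limit)
    page = feed_ids_asc[start:end][::-1]
    has_more = start > 0
    next_cursor = encode_cursor(page[-1]) if page and has_more else None
    return {"posts": page, "next_cursor": next_cursor, "has_more": has_more}
```

Because the cursor names a post rather than a position, a post that arrives between two page loads does not shift the second page.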
Feed = Post IDs, Not Full Posts
The feed cache stores only post IDs, not full post objects. This is critical for two reasons:
- Memory: A post object with text, media URLs, author info, and engagement counts can be 2-5 KB. A post ID is 8 bytes. Storing IDs uses 250-600x less memory.
- Consistency: When a post is edited or deleted, you update it in one place (the posts table). If feeds stored full post copies, you would need to update every feed that contains the post -- essentially a second fan-out.
Hydration: When the feed service returns results, it fetches the full post objects for the 20 post IDs in the current page. This is a multi-get from the posts cache (Redis or Memcached):
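post_ids = redis.lrange("user:B:feed", 0, 19) # 20 IDs, newest first
keys = [f"post:{post_id}" for post_id in post_ids] # same key format used when populating the cache
posts = dict(zip(post_ids, post_cache.mget(keys))) # hydrate from cache
# Fill any cache misses from the database
missing = [post_id for post_id, post in posts.items() if post is None]
if missing:
    for post_id, post in db.multi_get(missing).items(): # assumed to return {post_id: post}
        posts[post_id] = post
        post_cache.set(f"post:{post_id}", post, ttl=3600) # populate cache for misses
# Preserve feed order and drop IDs that no longer resolve to a post
feed = [posts[post_id] for post_id in post_ids if posts[post_id] is not None]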
Deep Dives

Deep Dive 1: Feed Ranking vs. Chronological
Early Twitter showed a purely chronological feed. Modern feeds use algorithmic ranking -- posts are scored and reordered based on predicted engagement.
Chronological feed: Simple. Sort by created_at DESC. Users see everything in order. Problem: if you follow 500 accounts, the feed is dominated by high-volume posters. Important posts from close friends get buried.
Ranked feed: A machine learning model scores each candidate post:
score = w1 * affinity(user, author)
+ w2 * recency(post.created_at)
+ w3 * engagement_velocity(post)
+ w4 * content_type_preference(user, post.type)
- Affinity: How often you interact with this author (likes, comments, DMs). Higher affinity = higher score.
- Recency: Exponential decay. A post from 5 minutes ago scores higher than one from 5 hours ago.
- Engagement velocity: Posts getting lots of likes/comments quickly probably have something interesting going on.
- Content type preference: If you always watch videos but never read text posts, videos score higher.
Architecture for ranking: The feed service fetches 500 candidate post IDs (from pre-computed feed + celebrity posts), scores each with the ranking model (a lightweight model that runs in < 10ms for 500 posts), sorts by score, and returns the top 20. The ranking model runs per-request, not at fan-out time, because the score depends on the reading user's preferences.
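The scoring formula above can be written as a small function. This sketch uses a hand-weighted linear score as a stand-in for the trained model; the weights, the 6-hour half-life, and the assumption that every feature is normalized to 0..1 are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    post_id: int
    age_seconds: float
    affinity: float             # reader-to-author interaction strength, 0..1
    engagement_velocity: float  # normalized likes/comments per minute, 0..1
    type_preference: float      # reader's preference for this content type, 0..1


WEIGHTS = (0.4, 0.3, 0.2, 0.1)  # w1..w4, illustrative values only
HALF_LIFE_SECONDS = 6 * 3600    # recency halves every 6 hours (assumption)


def recency(age_seconds: float) -> float:
    """Exponential decay: 1.0 for a brand-new post, 0.5 after one half-life."""
    return 0.5 ** (age_seconds / HALF_LIFE_SECONDS)


def score(c: Candidate) -> float:
    w1, w2, w3, w4 = WEIGHTS
    return (w1 * c.affinity
            + w2 * recency(c.age_seconds)
            + w3 * c.engagement_velocity
            + w4 * c.type_preference)


def rank(candidates: list[Candidate], limit: int = 20) -> list[Candidate]:
    """Score every candidate for this reader and keep the top `limit`."""
    return sorted(candidates, key=score, reverse=True)[:limit]
```

With these weights, a five-hour-old post from a high-affinity author outranks a five-minute-old post from an author the reader never interacts with. That is the behavior the chronological feed cannot provide.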
Ranking latency budget: The entire feed request must complete in 200ms. Breakdown:
- Fetch post IDs from Redis: 1-2ms
- Fetch celebrity posts: 5-10ms
- Hydrate 500 posts from cache: 5-10ms
- Score and rank 500 posts: 5-10ms
- Serialize and return: 1ms
- Total: ~25ms for p50, 100ms for p99. Well within budget.
Deep Dive 2: Feed Graph and Storage Layer
The social graph (who follows whom) is the backbone of the feed system. Its storage and query efficiency directly impact fan-out performance.
Adjacency list storage (in database):
CREATE TABLE follows (
follower_id BIGINT,
following_id BIGINT,
created_at TIMESTAMP,
PRIMARY KEY (follower_id, following_id),
INDEX idx_following (following_id, follower_id) -- for "who follows me?"
);
Two indices: one for "who do I follow?" (follower_id lookup) and one for "who follows me?" (following_id lookup). The second is needed for fan-out on write -- when User A posts, look up all of A's followers.
At scale (500M users, 100B follow edges): The follows table is too large for a single database. Shard by following_id (so all followers of a user are on the same shard). This makes fan-out on write efficient -- you query one shard to get all followers of the posting user.
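The sharding trade-off can be illustrated with a toy model. This is a sketch only: modulo hashing, `NUM_SHARDS = 8`, and the in-memory `shards` list are assumptions for the example, not a recommendation for production placement.

```python
from collections import defaultdict

NUM_SHARDS = 8  # illustrative


def shard_for(user_id: int) -> int:
    return user_id % NUM_SHARDS


# Each shard maps following_id -> set of follower IDs.
shards = [defaultdict(set) for _ in range(NUM_SHARDS)]


def add_follow(follower_id: int, following_id: int) -> None:
    # The edge lives on the shard of the account being followed.
    shards[shard_for(following_id)][following_id].add(follower_id)


def followers_of(user_id: int) -> set[int]:
    """Fan-out lookup: touches exactly one shard."""
    return set(shards[shard_for(user_id)].get(user_id, set()))


def following_of(user_id: int) -> set[int]:
    """Reverse lookup: must scatter across every shard and gather the results."""
    return {followed
            for shard in shards
            for followed, followers in shard.items()
            if user_id in followers}
```

The fan-out path stays single-shard, while "who do I follow?" becomes a scatter-gather. Note also that every follower of a very popular account lands on one shard, which is another reason to keep celebrities off the push path.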
Redis for the feed cache:
Key: user:{user_id}:feed
Value: Sorted Set (score = post_timestamp, member = post_id)
or List (LPUSH for new posts, LTRIM to cap at 500)
-- Add post to feed:
ZADD user:B:feed 1713456789 post_id_123
-- Read feed (latest 20):
ZREVRANGE user:B:feed 0 19
-- Cursor pagination (posts strictly before timestamp T; the "(" prefix makes the bound exclusive):
ZREVRANGEBYSCORE user:B:feed (T -inf LIMIT 0 20
Sorted sets are better than lists for feeds because they support score-based range queries (for cursor pagination and deduplication). The score is the post's creation timestamp, which naturally orders posts chronologically.
Deep Dive 3: Handling Deletes and Edits
When a post is deleted, it must disappear from every follower's feed. With fan-out on write, the post ID is in potentially millions of Redis feed lists.
Option 1: Lazy deletion (what Facebook does)
Do not remove the post ID from feed lists. Mark the post as deleted in the posts table. When a user's feed is hydrated, filter out deleted posts. The feed has "holes" (IDs whose posts are flagged deleted or return null during hydration), which the feed service drops before responding.
feed_ids = redis.zrevrange("user:B:feed", 0, 24) # fetch 25 (5 extra)
posts = post_cache.mget(feed_ids)
visible = [p for p in posts if p is not None and not p.deleted]
return visible[:20] # return first 20 visible
Over-fetch by 20-25% to account for deleted posts. Simple, no write amplification on delete.
Option 2: Active deletion (expensive but clean)
When a post is deleted, enqueue a fan-out deletion job that removes the post ID from every follower's feed. Same cost as the original fan-out -- if the user has 1M followers, you need 1M Redis operations.
I would use lazy deletion for most systems. The storage overhead of keeping deleted post IDs in feeds is negligible (8 bytes per ID), and the read-time filtering adds microseconds to feed hydration.
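A fixed over-fetch can still return a short page when many posts in the window are deleted. One way to handle that is to keep fetching batches until the page is full. A minimal sketch: `load_posts` is a hypothetical batch loader that returns a dict of `post_id -> post` and omits IDs it cannot find.

```python
from typing import Callable


def visible_page(feed_ids: list[int],
                 load_posts: Callable[[list[int]], dict],
                 limit: int = 20,
                 overfetch: float = 1.25) -> list[dict]:
    """Return up to `limit` non-deleted posts, scanning feed_ids newest first."""
    visible: list[dict] = []
    offset = 0
    batch = max(1, int(limit * overfetch))  # e.g. 25 IDs for a page of 20
    while len(visible) < limit and offset < len(feed_ids):
        ids = feed_ids[offset:offset + batch]
        posts = load_posts(ids)
        for post_id in ids:
            post = posts.get(post_id)
            # Skip holes: missing posts and posts flagged as deleted.
            if post is not None and not post.get("deleted", False):
                visible.append(post)
                if len(visible) == limit:
                    break
        offset += len(ids)
    return visible
```

In the common case the first batch fills the page and the loop runs once, so the extra cost only appears for feeds with unusually many deleted posts.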
Alternative Designs
Alternative 1: Pure Pull (Google Buzz Model)
No pre-computation at all. On each feed load, query each followed user's recent posts, merge, and sort. Use aggressive caching of each user's post list.
Alternative 2: Pure Push (Early Twitter)
Fan-out every post to every follower's feed, including celebrities. Accept the delayed visibility for celebrity posts.
Alternative 3: Graph Database (Neo4j/TAO)
Store the social graph in a purpose-built graph database. Feed generation becomes a graph traversal: "find all posts from nodes reachable via a FOLLOWS edge within 1 hop, sorted by time." Facebook's TAO (The Associations and Objects) cache is essentially this -- a graph-aware cache layer.
| Aspect | Hybrid (Push+Pull) | Pure Pull | Pure Push | Graph Database |
|---|---|---|---|---|
| Feed read latency | 5-50ms | 100-500ms | 1-5ms | 20-100ms |
| Celebrity post delay | 0 (read at access time) | 0 | Minutes to hours | 0 |
| Write amplification | Low-Medium | None | Very High | None |
| Memory usage | 2 TB (feeds in Redis) | 500 GB (post caches) | 5+ TB (all feeds) | 1 TB (graph) |
| Implementation | Complex (two paths) | Simple | Simple | Complex (graph queries) |
| Inactive user cost | Low (skip inactive) | Zero | High (push to all) | Zero |
The hybrid approach is the right answer for any social feed at scale. Pure push worked for early Twitter (< 100M users, before large celebrity accounts). Pure pull works for small-scale platforms. Graph databases work well for friend-of-friend recommendations but are not optimized for high-throughput feed reads.
Scaling Math Verification
Feed writes (fan-out on write for non-celebrities):
- 5,800 posts/sec * 200 average followers = 1.16M Redis writes/sec
- Exclude inactive users (60% of followers): 1.16M * 0.4 = 464K writes/sec
- Redis cluster with 10 shards: 46K writes/sec per shard. Well within 100K+/shard capacity.
Feed reads:
- 46K feed reads/sec
- Each read: ZREVRANGE + 20 post hydrations (multi-get)
- Redis reads: 46K + 46K * 20 = 966K ops/sec. With 10 Redis shards: 97K ops/sec per shard. That is close to the ~100K/shard figure used above, so add shards for headroom in practice.
Celebrity read path:
- Average user follows 5-10 celebrities
- 46K feed reads/sec * 10 celebrity post fetches = 460K celebrity post lookups/sec
- Celebrity posts cached in Redis (separate cluster): 460K/sec across 5 shards = 92K/shard. Workable, but close to the ~100K/shard figure, so add shards for headroom.
Memory:
- 200M active users * 500 post IDs * 8 bytes = 800 GB for feed cache
- Post cache (recent 10M posts * 2 KB) = 20 GB
- Total Redis memory: ~820 GB across clusters. Large but within reason for Redis Cluster.
Failure Analysis
| Component | Current capacity | At 10x (58K posts/sec) | Breaks? | Fix |
|---|---|---|---|---|
| Fan-out writes | 464K Redis writes/sec | 4.64M writes/sec | Yes | More shards, lower celebrity threshold |
| Redis feed cache | 820 GB | 8.2 TB | Yes | Tiered storage (hot feeds in Redis, cold on SSD) |
| Feed reads | 966K ops/sec | 9.66M ops/sec | Yes | More shards, L1 local cache on feed servers |
| Post hydration cache | 20 GB | 200 GB | No | Larger Redis cluster |
| Celebrity post reads | 460K/sec | 4.6M/sec | Yes | Pre-compute celebrity feeds per time bucket |
| Social graph DB | 100B edges | 1T edges | Yes | Shard by user ID, graph database (TAO-style) |
The first thing to break at 10x is Redis feed cache memory (8.2 TB). The fix is tiered storage: keep feeds for users active in the last 24 hours in Redis (hot tier), and store older/inactive feeds on SSD-backed cache (warm tier). When an inactive user returns, promote their feed from warm to hot tier.
The second bottleneck is fan-out write throughput. At 4.64M writes/sec, you need 50+ Redis shards. Alternatively, lower the celebrity threshold from 500K to 100K followers, which reduces the number of users getting fan-out on write and shifts more work to the read path.
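The shard counts can be checked with a small helper. In this sketch, `shards_needed` is a hypothetical function, the 100K ops/sec per shard figure is the one used throughout this section, and the 70% target utilization is an assumption added to leave headroom.

```python
import math


def shards_needed(ops_per_sec: float,
                  per_shard_capacity: float = 100_000,
                  target_utilization: float = 0.7) -> int:
    """Shards required to serve ops_per_sec while staying under the target utilization."""
    return math.ceil(ops_per_sec / (per_shard_capacity * target_utilization))


current_writes = 464_000    # fan-out writes/sec today
writes_at_10x = 4_640_000   # fan-out writes/sec at 10x

print(shards_needed(current_writes))                         # shards today, with headroom
print(shards_needed(writes_at_10x, target_utilization=1.0))  # bare minimum at 10x
print(shards_needed(writes_at_10x))                          # at 10x, with headroom
```

Running shards at full capacity gives a floor of 47 at 10x. Leaving headroom pushes the count higher, which is consistent with the "50+" estimate above.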
What's Expected at Each Level
| Aspect | Mid-Level | Senior | Staff+ |
|---|---|---|---|
| Fan-out strategy | Knows push or pull, picks one | Explains push vs pull trade-offs, picks push | Hybrid with celebrity threshold, explains why and where boundary is |
| Celebrity problem | Not mentioned | "Big accounts slow down writes" | Quantifies: 500M followers = 83 min fan-out, explains fix |
| Pagination | Offset-based | Cursor-based, explains why offsets break | Cursor with sorted set scores, handles deletes in cursor window |
| Storage model | "Store posts in feed" | "Store post IDs, hydrate on read" | Memory math, tiered storage, inactive user optimization |
| Feed ranking | Chronological only | Mentions algorithmic ranking | Ranking model architecture, latency budget, scoring features |
| Consistency | Not discussed | Eventual consistency acceptable | Lazy deletion, follow/unfollow backfill strategies |
| Caching | "Cache feeds in Redis" | Sorted set with ZREVRANGE | Multi-tier (Redis feeds + post cache + L1 local), warming strategy |
| Real-world reference | "Like Twitter" | Mentions fan-out on write | Twitter's hybrid approach, Facebook's TAO, inactive user skip |
The single most important signal at any level: do you immediately identify that the feed system has two fundamentally different workloads (normal users and celebrities) that require different strategies? Trying to force one approach on both is the classic mistake. The hybrid design is not a compromise -- it is the only architecture that works.