Hybrid Push Pull Pattern for Celebrity Hotspot Mitigation
Hybrid fanout combines push and pull by maintaining a per-author mode flag based on follower count or write rate. Normal users (below a threshold, typically a few thousand followers) get push-based fanout: each post is written to all follower timelines immediately. Celebrity accounts that exceed the threshold are excluded from push; their posts are written only to an author-specific feed, pulled at read time, and merged with the reader's precomputed timeline. This prevents write storms while preserving fast cached reads for the majority.
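The write path described above can be sketched as follows. This is a minimal in-memory illustration, not a production design: `FanoutStore`, `PUSH_THRESHOLD`, and the decision to also keep an author feed for push-mode users are all assumptions made for the example.

```python
from collections import defaultdict

PUSH_THRESHOLD = 5_000  # assumed follower cutoff; real systems tune this


class FanoutStore:
    """Hypothetical in-memory stand-in for timeline caches and author feeds."""

    def __init__(self):
        self.followers = defaultdict(set)      # author -> follower ids
        self.timelines = defaultdict(list)     # user -> pushed (ts, author, post_id)
        self.author_feeds = defaultdict(list)  # author -> own (ts, author, post_id)

    def fanout_mode(self, author):
        # Per-author mode flag, derived here from follower count only.
        if len(self.followers[author]) > PUSH_THRESHOLD:
            return "pull"
        return "push"

    def publish(self, author, post_id, ts):
        """Store a post and return the number of timeline (fanout) writes."""
        entry = (ts, author, post_id)
        # Assumption: every author keeps an author feed, not only celebrities.
        self.author_feeds[author].append(entry)
        if self.fanout_mode(author) == "pull":
            return 0  # celebrity: no fanout, readers pull from the author feed
        for follower in self.followers[author]:
            self.timelines[follower].append(entry)
        return len(self.followers[author])
```

With this sketch, an author with three followers produces three timeline writes per post, while an author above the cutoff produces none.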
The threshold acts as a circuit breaker. When user A crosses 5,000 followers, the system flips A's fanout mode to pull-only. A's next post is written once to A's author feed instead of fanning out to 5,000+ caches. When follower B requests their home feed, the read path detects that B follows celebrity A, fetches A's recent posts from the author feed, merges them with B's precomputed (pushed) content from non-celebrity followees, ranks the combined set, and returns the result. This trades a small read latency increase (a few extra fetches and a merge step) for massive write cost savings.
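A rough sketch of that read-time merge is shown below. It assumes each source is already sorted newest first and that "ranking" is simply reverse-chronological order; real systems typically apply a richer ranking step. The function name `read_home_feed` and the `(ts, author, post_id)` tuple shape are hypothetical.

```python
import heapq
from itertools import islice


def read_home_feed(pushed, celebrity_feeds, limit=20):
    """Merge a precomputed timeline with celebrity author feeds.

    pushed: list of (ts, author, post_id), sorted newest first.
    celebrity_feeds: one such list per followed celebrity, each newest first.
    Returns the `limit` newest entries across all sources.
    """
    merged = heapq.merge(
        pushed, *celebrity_feeds, key=lambda entry: entry[0], reverse=True
    )
    # heapq.merge is lazy, so only `limit` entries are actually consumed.
    return list(islice(merged, limit))
```

Because the merge is lazy, the cost per request grows with the page size and the number of celebrity followees, not with the total length of each feed.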
Twitter-style platforms report handling 6,000 writes per second and 300,000 reads per second using this hybrid approach. Without it, a single celebrity post to 10 million followers would generate 10 million fanout writes, overwhelming queues and violating a 5-second propagation SLA. With the hybrid, that same post triggers one write, and the merge cost is spread over time and across the read fleet as followers load their feeds.
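The arithmetic behind that comparison can be made explicit. The helper names below are hypothetical, and the 5,000-follower cutoff is the illustrative value used elsewhere in this section.

```python
def writes_per_post(follower_count, threshold=None):
    """Storage writes for one post; threshold=None models pure push."""
    if threshold is not None and follower_count > threshold:
        return 1  # hybrid, celebrity: a single author-feed write
    return follower_count  # push: one write per follower timeline


def required_write_rate(writes, sla_seconds):
    """Sustained writes per second needed to finish within the SLA."""
    return writes / sla_seconds


pure_push = writes_per_post(10_000_000)                # 10,000,000 writes
hybrid = writes_per_post(10_000_000, threshold=5_000)  # 1 write
burst_rate = required_write_rate(pure_push, 5)         # 2,000,000 writes/s
```

Under pure push, a single post to 10 million followers would need about 2 million timeline writes per second for five seconds to meet the SLA, which is the burst the hybrid approach avoids.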
Key challenges include dynamic threshold tuning (too low wastes push benefits, too high allows hotspots), merge complexity (maintaining stable cursors across pushed and pulled sources), and consistency (celebrity posts may appear slightly delayed compared to pushed content). Monitoring follower count distributions and write rate histograms guides threshold placement; typically the top 1% of accounts by follower count are switched to pull.
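One way to turn a follower count distribution into a cutoff is to pick the value that only the top fraction of accounts exceeds. The sketch below assumes a simple in-memory list of follower counts; `threshold_for_top_fraction` is a hypothetical helper, and a real system would likely compute this from sampled or pre-aggregated histograms.

```python
def threshold_for_top_fraction(follower_counts, top_fraction=0.01):
    """Return a cutoff that roughly the top `top_fraction` of accounts exceed.

    Accounts with strictly more followers than the returned value would be
    switched to pull. Ties at the cutoff stay in push mode.
    """
    if not follower_counts:
        raise ValueError("follower_counts must not be empty")
    ranked = sorted(follower_counts, reverse=True)
    k = int(len(ranked) * top_fraction)   # number of accounts to move to pull
    k = min(max(k, 0), len(ranked) - 1)   # clamp to a valid index
    return ranked[k]
```

For 100 accounts with follower counts 1 through 100 and `top_fraction=0.01`, this returns 99, so only the single largest account exceeds the cutoff.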
💡 Key Takeaways
•Dynamic threshold commonly set at a few thousand followers; accounts exceeding threshold switch from push to pull, preventing millions of writes per post while preserving fast reads for 99% of users
•Write cost reduction: celebrity post to 10 million followers drops from 10 million fanout writes to 1 author feed write; read cost increases by merge latency (typically 10 to 50 ms)
•Merge complexity at read time: system fetches precomputed pushed timeline plus recent posts from each celebrity followee, ranks combined set, and maintains stable pagination cursors across sources
•Threshold tuning tradeoff: setting it too low (e.g., 1,000 followers) increases read latency unnecessarily; setting it too high (e.g., 50,000) allows write hotspots; monitor p99 out-degree and write queue depth to calibrate
•Consistency relaxation: celebrity posts may appear slightly delayed in feeds compared to pushed content due to cache timing; eventual consistency within seconds is acceptable for most social products
•Production example: a Twitter-style system with 6,000 writes per second and 300,000 reads per second uses hybrid fanout for the top 1% of accounts by follower count, avoiding queue saturation and meeting a 5-second propagation SLA
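The stable-cursor requirement from the merge bullet can be illustrated with a position-independent cursor. The sketch below assumes items are `(ts, post_id)` tuples, each source is sorted newest first, and the cursor is simply the last item served; `feed_page` is a hypothetical name. It rescans from the head of each source for simplicity, whereas a real store would seek directly to the cursor.

```python
import heapq
from itertools import islice


def feed_page(sources, cursor=None, page_size=20):
    """Return (items, next_cursor) merged from several newest-first sources.

    The cursor is the (ts, post_id) of the last item served. The next page
    holds only items that sort strictly older than it, so the same cursor
    applies to pushed and pulled sources alike.
    """
    merged = heapq.merge(*sources, reverse=True)  # tuples order by ts, then id
    if cursor is not None:
        merged = (item for item in merged if item < cursor)
    page = list(islice(merged, page_size))
    next_cursor = page[-1] if page else None
    return page, next_cursor
```

Because the cursor encodes a point in the ordering rather than an offset, a celebrity post that arrives between two page requests does not shift or duplicate items on the second page.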
📌 Examples
User with 4,000 followers posts: system applies push fanout, writing to 4,000 follower caches in under 2 seconds; followers read from cache with 1 ms latency
Celebrity with 12 million followers posts: system writes once to celebrity's author feed; when follower requests timeline, read path fetches celebrity's recent 20 posts, merges with precomputed timeline from other followees, ranks, and returns in 15 ms (10 ms merge overhead)
Threshold boundary: account grows from 4,500 to 5,100 followers; system detects threshold crossing, flips fanout mode to pull, and subsequent posts skip push entirely, saving 5,100 writes per post
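The threshold-boundary example can be sketched as a stored mode flag that is re-evaluated whenever the follower count changes. `AuthorState` and `on_follower_count_change` are hypothetical names, and the 5,000 cutoff is the illustrative value from the text.

```python
from dataclasses import dataclass


@dataclass
class AuthorState:
    follower_count: int = 0
    mode: str = "push"  # "push" or "pull"


def on_follower_count_change(state, new_count, threshold=5_000):
    """Record the new count, update the mode flag, and report whether it flipped."""
    state.follower_count = new_count
    new_mode = "pull" if new_count > threshold else "push"
    changed = new_mode != state.mode
    state.mode = new_mode
    return changed


state = AuthorState(follower_count=4_500)
flipped = on_follower_count_change(state, 5_100)  # crosses the cutoff
```

This simple rule also flips back to push if the count drops below the cutoff again; how to treat accounts that hover near the boundary is a design choice the text does not address.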