Data Processing Patterns • Newsfeed/Timeline Generation (Fan-out Patterns)
Ranking, Personalization, and Merge Strategies
Separating delivery from ranking is foundational in production timeline systems. The delivery layer (push or pull) provides a rolling window of candidate post IDs, typically 500 to 800 items per user. Ranking runs on top of this delivered set, applying filters (privacy, mutes, language), scoring by recency and affinity, and running Machine Learning (ML) models to predict engagement. Final output is a ranked page of 20 to 50 posts optimized for relevance rather than strict chronological order.
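The delivery/ranking split above can be sketched as a two-stage read path. The function below takes the filter and scoring steps as injected callbacks so either stage can be tuned independently; the names and interface are illustrative, not a real production API:

```python
def get_feed_page(user_id, candidates, passes_filters, predict_engagement, page_size=30):
    """Two-stage read path (hypothetical interface):
    delivery hands over a candidate set (~500-800 posts from cache);
    ranking filters (privacy, mutes, language), scores, and reorders,
    returning a page of the top posts."""
    visible = [p for p in candidates if passes_filters(p, user_id)]
    scored = sorted(visible, key=lambda p: predict_engagement(p, user_id), reverse=True)
    return scored[:page_size]
```

Because the candidate set and the scoring model are passed in, the delivery window size and the ranking model's depth can be changed without touching each other.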
Merge strategies differ by pattern. In push systems, ranking operates on a precomputed list fetched from cache in 1 millisecond; the ranking step adds 10 to 50 milliseconds for scoring and reordering. In pull systems, merge happens after fetching recent posts from each followee: the system must interleave and deduplicate streams, apply filters, then rank. Hybrid systems merge precomputed (pushed) content with celebrity (pulled) content before ranking, requiring stable cursors that encode position across both sources to enable pagination without duplicates or skips.
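A minimal sketch of the hybrid merge-then-rank step, assuming simple candidate records and a hypothetical recency-plus-affinity score (the weights and decay are illustrative, not a production formula):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    post_id: int
    author_id: int
    created_at: int  # epoch seconds

def merge_candidates(pushed, pulled):
    """Interleave precomputed (pushed) and celebrity (pulled) candidates,
    deduplicating by post_id; the first occurrence wins."""
    seen = {}
    for c in list(pushed) + list(pulled):
        if c.post_id not in seen:
            seen[c.post_id] = c
    return list(seen.values())

def rank(candidates, affinity):
    """Toy scoring: hyperbolic recency decay plus per-author affinity."""
    now = max(c.created_at for c in candidates)
    def score(c):
        recency = 1.0 / (1 + (now - c.created_at) / 3600)
        return recency + affinity.get(c.author_id, 0.0)
    return sorted(candidates, key=score, reverse=True)
```

Deduplicating before ranking matters because a celebrity's post may also have been pushed to some followers' precomputed lists.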
Deeper personalization increases read latency. Expanding candidate sets from 500 to 5,000 items or adding feature-rich ML models can push latency beyond 2-second SLOs. Production systems balance depth and speed by prefiltering with lightweight heuristics (e.g., remove posts older than 7 days, apply block and mute lists), then running heavier ML scoring on the reduced set. Diversity rules interleave sources (friends, topics, ads) and prevent over-representation of any single author.
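The heuristic prefilter described above might look like the following; the field names and interface are hypothetical, and the 7-day cutoff follows the text:

```python
import time

SEVEN_DAYS = 7 * 24 * 3600

def prefilter(candidates, blocked, muted, now=None):
    """Cheap heuristic pass run before ML scoring: drop stale posts and
    posts from blocked or muted authors so the expensive model only
    scores a reduced candidate set."""
    now = now if now is not None else time.time()
    return [
        c for c in candidates
        if now - c["created_at"] <= SEVEN_DAYS
        and c["author_id"] not in blocked
        and c["author_id"] not in muted
    ]
```

Because these checks are set lookups and a subtraction, they run in microseconds per item, which is what makes it affordable to apply them to thousands of candidates before the heavier scoring stage.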
Cursor stability is critical for pagination. Opaque cursors encode a watermark (timestamp plus post ID) that marks the user's position in the ranked feed. Concurrent inserts (new posts arriving while user scrolls) must not cause duplicates or gaps. Idempotent merge logic and monotonic sequence numbers per user timeline ensure that cursors remain valid even under eventual consistency and retries.
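One way to implement an opaque cursor is to base64-encode the (timestamp, post ID) watermark and page by comparing (timestamp, post ID) tuples, so ties on timestamp are broken deterministically by post ID. This is a sketch under those assumptions, not any specific system's wire format:

```python
import base64
import json

def encode_cursor(timestamp, post_id):
    """Opaque cursor: base64-encoded (timestamp, post_id) watermark."""
    raw = json.dumps({"ts": timestamp, "id": post_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(cursor):
    data = json.loads(base64.urlsafe_b64decode(cursor))
    return data["ts"], data["id"]

def page_before(posts, cursor, page_size=20):
    """Return the next page strictly older than the cursor watermark.
    Comparing (ts, id) tuples keeps pagination deterministic even when
    two posts share a timestamp, and new posts arriving above the
    watermark cannot shift this page (no duplicates or gaps)."""
    ts, pid = decode_cursor(cursor)
    older = [p for p in posts if (p["ts"], p["id"]) < (ts, pid)]
    older.sort(key=lambda p: (p["ts"], p["id"]), reverse=True)
    return older[:page_size]
```

The strict inequality against the watermark is what makes the operation idempotent under retries: re-issuing the same cursor always yields the same page boundary.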
💡 Key Takeaways
• Delivery provides a candidate set of 500 to 800 post IDs; the ranking layer filters, scores, and reorders on top; separation allows independent optimization of delivery speed and ranking depth
• Prefiltering with lightweight heuristics (privacy checks, mute lists, recency cutoffs) reduces the candidate set before expensive Machine Learning (ML) scoring, cutting ranking latency from 200 ms to under 50 ms
• Merge complexity in hybrid systems: interleave precomputed pushed content with pulled celebrity posts, rank the combined set, and maintain stable pagination cursors encoding a timestamp-plus-post-ID watermark
• Deeper personalization tradeoff: expanding the candidate set from 500 to 5,000 or adding feature-rich models improves relevance but can exceed a 2-second Service Level Objective (SLO); production systems limit candidate size and model complexity
• Diversity rules enforce interleaving: prevent a single author from dominating the feed, mix friend posts with topic suggestions and ads, and apply fairness constraints to improve user experience
• Cursor stability requires idempotent merge and monotonic sequence numbers; concurrent inserts during pagination must not produce duplicates or skipped posts under eventual consistency
📌 Examples
Facebook Multifeed prefilters 10,000 candidates down to 500 using recency and privacy checks in 20 ms, then applies ML ranking with affinity scores in 80 ms, returning the top 50 posts within the 2-second SLO
Hybrid system merges 400 pushed post IDs with 50 pulled celebrity post IDs, applies unified ranking scoring all 450, and returns top 30; opaque cursor encodes (timestamp: 1678901234, postID: 78901) for stable pagination
User scrolls feed and requests next page; cursor (timestamp: 1678900000, postID: 78800) is passed to ranking service, which fetches candidates newer than cursor, scores, and returns next 20 posts without duplicates
Production system limits candidate expansion to 1,000 items per user to stay under 50 ms ranking budget; users following 5,000 accounts see top 1,000 recent posts only, trading completeness for latency