
Session Feature Computation: Real-Time Updates Within Latency Constraints

Core Challenge
Session features must update within milliseconds of user actions and be available for the next search request. This requires streaming computation and storage optimized for both high write throughput and low-latency reads.

What Session Features Capture

Session features summarize in-session behavior: clicked item IDs (last 10-20), clicked categories, query history, dwell time on viewed items, and add-to-cart events. These are aggregated into feature vectors, e.g. session_category_dist = [0.4, 0.3, 0.2, 0.1], the interest distribution across the top clicked categories. Also computed: session embeddings (average of clicked item embeddings), recency-weighted click scores, and cross features (query-click similarity).
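The aggregation above can be sketched as follows. This is a minimal illustration, not a production implementation: the function name, the lookup dicts (`item_embeddings`, `item_categories`), and the decay constant are all assumptions for the example.

```python
from collections import Counter

def session_features(clicks, item_embeddings, item_categories, top_k=4, decay=0.8):
    """Aggregate raw in-session clicks into feature vectors.

    clicks: ordered list of clicked item IDs (assumes at least one click).
    item_embeddings / item_categories: hypothetical item ID -> embedding / category lookups.
    """
    recent = clicks[-20:]  # keep only the last 10-20 clicks

    # Category interest distribution over the top-k clicked categories.
    counts = Counter(item_categories[i] for i in recent)
    total = sum(counts.values())
    category_dist = [n / total for _, n in counts.most_common(top_k)]

    # Session embedding: average of the clicked items' embeddings.
    dims = len(item_embeddings[recent[0]])
    embedding = [sum(item_embeddings[i][d] for i in recent) / len(recent)
                 for d in range(dims)]

    # Recency-weighted click scores: newer clicks count more.
    weights = {i: decay ** (len(recent) - 1 - pos) for pos, i in enumerate(recent)}

    return {"category_dist": category_dist, "embedding": embedding,
            "click_weights": weights}
```

A cross feature like query-click similarity would then be a dot product between the query embedding and this session embedding.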

Update Path: Click to Feature

When a user clicks item X: (1) the click event hits the event stream (Kafka or similar) with sub-100ms latency. (2) A stream processor updates session state: appends X to the click list, recomputes the session embedding, updates the category distribution. (3) The updated session is stored in a low-latency key-value store keyed by session ID. (4) The next search request fetches session features in 1-3ms. Total time from click to searchable feature: 100-500ms. The user's next search (typically 2-10 seconds later) sees the updated personalization.
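Steps (2) and (3) can be sketched as a stream-processor callback. This is an illustrative sketch: the event field names (`session_id`, `item_id`, `item_embedding`) are assumptions, and a plain dict stands in for the low-latency KV store.

```python
import time

SESSION_TTL_S = 30 * 60  # sessions expire after 30 minutes
session_store = {}  # stand-in for the low-latency KV store, keyed by session ID

def on_click_event(event):
    """Stream-processor callback for one click event (steps 2-3 above).

    `event` is a dict with session_id, item_id, and item_embedding
    (hypothetical field names for this example).
    """
    sid = event["session_id"]
    state = session_store.get(sid, {"clicks": [], "click_embs": []})

    # Append the click, keeping only the last 20 items and their embeddings.
    state["clicks"] = (state["clicks"] + [event["item_id"]])[-20:]
    state["click_embs"] = (state["click_embs"] + [event["item_embedding"]])[-20:]

    # Recompute the session embedding as the mean of recent click embeddings.
    state["embedding"] = [sum(col) / len(col) for col in zip(*state["click_embs"])]

    state["updated_at"] = time.time()  # lets a reaper enforce SESSION_TTL_S
    session_store[sid] = state  # step (3): write back to the session store
```

In a real pipeline this function would be the body of a Kafka consumer loop, and the write-back would go to the session store described below.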

Latency Budget Breakdown

For a 200ms end-to-end search: retrieval gets 50ms, ranking gets 80ms, personalization gets 20-30ms, network overhead 40ms. Within personalization: session feature fetch 3ms, long-term profile fetch 3ms, feature combination 5ms, score adjustment 10ms. Every component must stay within budget. If session store latency spikes to 50ms, personalization blows its budget and either times out (skipping personalization) or delays the entire response.
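The "times out, skipping personalization" behavior can be sketched as a hard deadline on the session-store read. The function name and budget constant are assumptions; `fetch_fn` stands in for whatever client call reads the session store.

```python
import concurrent.futures

PERSONALIZATION_BUDGET_S = 0.025  # the 20-30ms slice of the 200ms search budget
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=8)

def personalize_or_skip(fetch_fn, session_id):
    """Run the session-feature fetch (hypothetical fetch_fn) under a deadline.

    Returns the features, or None if the store is too slow -- the search
    then proceeds un-personalized instead of delaying the whole response.
    """
    future = _pool.submit(fetch_fn, session_id)
    try:
        return future.result(timeout=PERSONALIZATION_BUDGET_S)
    except concurrent.futures.TimeoutError:
        future.cancel()  # best effort; the worker thread may still finish
        return None      # degrade gracefully: skip personalization
```

Degrading to un-personalized results is the right failure mode here: one slow dependency should cost one feature, not the whole response.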

Storage Trade-offs

Session data is ephemeral (expires after 30 min to 24 hours) but must be fast. In-memory stores provide <1ms reads but lose data on restart. Distributed key-value stores provide durability but add 2-5ms latency. Hybrid approach: write-through to both; serve from memory when available, fall back to persistent store. Session loss is acceptable (user gets un-personalized results for one search) but shouldn't happen frequently.
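The hybrid write-through pattern can be sketched with two backends behind one interface. The class name is an assumption, and plain dicts stand in for the in-memory tier and the durable KV client.

```python
class HybridSessionStore:
    """Write-through to both an in-memory tier and a durable KV store.

    Reads prefer memory (<1ms path) and fall back to the persistent
    store (2-5ms path). Both backends are stand-ins (dicts) here.
    """

    def __init__(self, persistent_store):
        self._memory = {}                # volatile, lost on restart
        self._persistent = persistent_store  # e.g. a durable KV client

    def put(self, session_id, features):
        # Write-through: both tiers get every update.
        self._memory[session_id] = features
        self._persistent[session_id] = features

    def get(self, session_id):
        features = self._memory.get(session_id)
        if features is None:  # e.g. after a restart wiped the memory tier
            features = self._persistent.get(session_id)
            if features is not None:
                self._memory[session_id] = features  # repopulate the cache
        return features  # None => session lost: serve un-personalized results
```

Returning None on a total miss is deliberate: as noted above, losing a session costs one un-personalized search, which is an acceptable failure mode.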

💡 Key Takeaways
- Session features: last 10-20 clicked items, category distribution, session embedding (average of click embeddings)
- Update path: click → event stream → stream processor → session store in 100-500ms total
- Latency budget: personalization gets 20-30ms within a 200ms search; session fetch must be 3ms or less
- Storage trade-off: in-memory (<1ms but volatile) vs distributed KV (2-5ms but durable); hybrid approach common
- Session loss is acceptable (one un-personalized search) but latency spikes break the entire search budget
📌 Interview Tips
1. Walk through the update path: click → event stream (100ms) → processor → store → next search sees it
2. Give the latency breakdown: retrieval 50ms, ranking 80ms, personalization 20-30ms, network 40ms = 200ms total
3. Explain the session embedding: average of item embeddings for the last N clicked items