What is Real-Time Search Personalization?

Definition
Real-time search personalization adapts search results to individual users based on their recent behavior within the current session, not just their historical profile. The key difference from batch personalization: features are computed and applied within milliseconds of user actions.
Why Session Context Matters
A user searching for "python" could want the snake, the programming language, or Monty Python. Historical preferences help, but the current session is usually the stronger signal. If they just clicked a coding tutorial, "python" most likely means programming. If they came from a pet store page, it most likely means the snake. Real-time personalization uses these in-session signals to disambiguate intent within 10-50ms of the search.
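As a rough illustration of the "python" example, the sketch below picks a query sense from the categories of pages the user touched earlier in the session. The `SENSE_CATEGORIES` mapping, the category names, and the `disambiguate` function are hypothetical; a production system would more likely use learned models than a hand-written table.

```python
from collections import Counter

# Hypothetical mapping from each sense of "python" to the page
# categories that hint at it. Names are illustrative only.
SENSE_CATEGORIES = {
    "programming": {"coding", "software"},
    "snake": {"pets", "reptiles"},
    "comedy": {"film", "tv"},
}


def disambiguate(session_categories, default="programming"):
    """Return the sense whose categories appear most often in the session.

    session_categories: categories of recently clicked or viewed pages.
    default: sense to use when the session carries no usable signal
             (for example, one taken from the historical profile).
    """
    counts = Counter()
    for category in session_categories:
        for sense, categories in SENSE_CATEGORIES.items():
            if category in categories:
                counts[sense] += 1
    if not counts:
        return default
    return counts.most_common(1)[0][0]


print(disambiguate(["coding"]))            # user just clicked a coding tutorial
print(disambiguate(["pets", "reptiles"]))  # user came from a pet store page
```

With no session events, the function falls back to `default`, which is where a historical preference would plug in.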
Batch vs Real-Time Personalization
Batch personalization: Pre-computes user preferences overnight or hourly. Stores a static user profile (interests, categories, price ranges). Fast to serve but reflects who the user was hours ago, not who they are now.
Real-time personalization: Updates the user's context with every click, view, and search within the session. Captures intent shifts (started browsing electronics, now looking at gifts). Requires streaming infrastructure to compute features in <50ms.
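To make the contrast concrete, here is a minimal sketch of a session context that is updated on every event, assuming a small in-memory sliding window. The `SessionContext` class and its method names are hypothetical, and a real deployment would keep this state in a streaming system or low-latency store rather than a Python object.

```python
from collections import Counter, deque


class SessionContext:
    """Keeps only the most recent session events (a sliding window)."""

    def __init__(self, max_events=5):
        # deque(maxlen=...) silently drops the oldest event when full
        self.recent = deque(maxlen=max_events)

    def record(self, event_type, category):
        """Record a click, view, or search as soon as it happens."""
        self.recent.append((event_type, category))

    def category_weights(self):
        """Share of the recent events that fall in each category."""
        if not self.recent:
            return {}
        counts = Counter(category for _, category in self.recent)
        total = len(self.recent)
        return {category: n / total for category, n in counts.items()}


ctx = SessionContext(max_events=3)
ctx.record("click", "electronics")
ctx.record("view", "electronics")
print(ctx.category_weights())  # only electronics so far

# Intent shift: the user starts looking at gifts.
for _ in range(3):
    ctx.record("click", "gifts")
print(ctx.category_weights())  # window now holds only gift events
```

A batch profile computed hours earlier would still say "electronics" at the end of this session; the sliding window has already moved on to gifts.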
The Latency Challenge
Search has strict latency budgets, typically a total response time under 200ms. Within that, personalization gets roughly 20-30ms. In that window you must fetch user context, compute personalized features, blend them into the ranking score, and return results. The architecture uses pre-computed embeddings (user and item vectors stored for fast lookup) combined with real-time session features (last 5 clicks, current query). Heavy computation happens offline; real-time only does lightweight lookups and score adjustments.
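The "lightweight lookups and score adjustments" step might look something like the sketch below, assuming the candidate list and base scores come from the main ranker and the vectors were computed offline. The function names, the `weight` value, and the convention that `user_vec` is `None` when the context fetch missed its budget are all assumptions made for illustration.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rerank(candidates, user_vec, item_vecs, weight=0.3):
    """Adjust base ranking scores with a cheap personalization boost.

    candidates: list of (item_id, base_score) from the main ranker.
    user_vec:   pre-computed user vector, or None if it could not be
                fetched within the personalization budget.
    item_vecs:  dict of item_id -> pre-computed item vector.
    weight:     how strongly the boost shifts the base score (assumed).
    """
    if user_vec is None:
        # Budget exceeded: serve unpersonalized results instead of waiting.
        return list(candidates)
    adjusted = []
    for item_id, base_score in candidates:
        item_vec = item_vecs.get(item_id)
        boost = dot(user_vec, item_vec) if item_vec is not None else 0.0
        adjusted.append((item_id, base_score + weight * boost))
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)


candidates = [("a", 1.0), ("b", 0.9)]
item_vecs = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
print(rerank(candidates, [0.0, 1.0], item_vecs))  # "b" moves ahead of "a"
print(rerank(candidates, None, item_vecs))        # fallback: original order
```

The per-request work is one dictionary lookup and one dot product per candidate, which is the kind of cost that can plausibly fit in a budget of a few tens of milliseconds. Falling back to the unpersonalized order keeps a slow context fetch from blowing the total response time.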
💡 Key Insight: Real-time personalization doesn't replace historical profiles; it blends short-term session signals with long-term preferences. A user's lifetime preference for premium brands still matters, but their current session's budget-shopping behavior should shift results toward deals.
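One simple way to express this blend is a weighted average of the long-term and session preference scores. The sketch below assumes preferences are stored as plain dictionaries of scores, and the `session_weight` of 0.6 is an arbitrary illustrative value, not a recommended setting.

```python
def blend_profiles(long_term, session, session_weight=0.6):
    """Weighted average of long-term and session preference scores.

    long_term:      dict of preference -> score from the historical profile.
    session:        dict of preference -> score from the current session.
    session_weight: share given to the session signal (assumed value).
    """
    keys = set(long_term) | set(session)
    return {
        key: (1 - session_weight) * long_term.get(key, 0.0)
        + session_weight * session.get(key, 0.0)
        for key in keys
    }


long_term = {"premium": 0.9, "deals": 0.1}  # lifetime preference for premium brands
session = {"deals": 1.0}                    # current session is budget shopping
blended = blend_profiles(long_term, session)
print(blended)
```

In this example the blended score for "deals" ends up higher than for "premium", so results shift toward deals, while the premium preference stays non-zero and still influences ranking.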

💡 Key Takeaways

✓ Real-time personalization adapts results based on current session behavior, not just historical profiles

✓ Session context disambiguates intent: "python" means different things depending on what the user just clicked

✓ Batch personalization reflects who the user was hours ago; real-time captures intent shifts within the session

✓ Strict latency budget: personalization gets 20-30ms within a 200ms total search response

✓ Architecture blends pre-computed embeddings (fast lookup) with real-time session features (last N clicks)

📌 Interview Tips

1. Explain the python example: same query, different intent based on session context (coding vs pet store)

2. Contrast batch vs real-time: batch is an hours-old static profile, real-time captures intent shifts

3. Mention the latency constraint: 20-30ms for personalization within 200ms total response
