
How Session-Based Models Work

SESSION AS A SEQUENCE

A session is a sequence of user actions: page views, clicks, searches, add-to-cart events. Session-based models treat this sequence like a sentence and predict what comes next. If the last five actions were viewing a laptop, a laptop case, a laptop stand, a mouse, and a keyboard, the model predicts more accessories or peripherals. The key insight is that recent actions reveal current intent better than lifetime purchase history.
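As a rough illustration, the task can be framed exactly like next-word prediction; the item names below are placeholders, not a real catalog:

```python
# A minimal sketch of the next-item prediction framing.
# Item names are hypothetical placeholders, not real catalog IDs.
session = ["laptop", "laptop_case", "laptop_stand", "mouse", "keyboard"]

# Training examples are (prefix, next_item) pairs, like next-word prediction:
examples = [(session[:i], session[i]) for i in range(1, len(session))]
# e.g. (["laptop"], "laptop_case"), (["laptop", "laptop_case"], "laptop_stand"), ...
```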

ARCHITECTURE PATTERNS

Common architectures use recurrent neural networks or transformers to encode the action sequence. Each action becomes a vector (embedding), and the model processes these vectors in order to produce a session embedding representing current intent. This session embedding is compared against item embeddings to rank candidates. Inference runs on every new action, adding 10 to 30 ms of latency per request.
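A minimal sketch of this pattern, assuming PyTorch and a GRU encoder; the dimensions, class name, and dot-product scoring are illustrative choices rather than a specific production design:

```python
# Sketch: encode a session's action sequence into a session embedding,
# then score candidate items by dot product against item embeddings.
import torch
import torch.nn as nn

class SessionEncoder(nn.Module):
    def __init__(self, num_items: int, dim: int = 64):
        super().__init__()
        self.item_emb = nn.Embedding(num_items, dim)   # one vector per item
        self.gru = nn.GRU(dim, dim, batch_first=True)  # encodes the action sequence in order

    def forward(self, action_ids: torch.Tensor) -> torch.Tensor:
        # action_ids: (batch, seq_len) item IDs for the session so far
        vectors = self.item_emb(action_ids)            # (batch, seq_len, dim)
        _, hidden = self.gru(vectors)                  # final hidden state = session embedding
        return hidden.squeeze(0)                       # (batch, dim)

    def score(self, session_emb: torch.Tensor) -> torch.Tensor:
        # Rank candidates: similarity between session embedding and every item embedding
        return session_emb @ self.item_emb.weight.T    # (batch, num_items)
```

In this setup, only the encoder forward pass happens per request; item embeddings can be precomputed and served from an approximate nearest-neighbor index to stay within the latency budget.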

FEATURE ENGINEERING

Beyond the action sequence, models incorporate context features: time since last action (users who pause 5 minutes might be comparing prices), action type weights (purchases are stronger signals than views), recency decay (actions 2 minutes ago matter more than 20 minutes ago), and category patterns (3 electronics views in a row versus scattered browsing).
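An illustrative sketch of these context features; the specific weights and decay rate below are hypothetical choices, not values from the text:

```python
# Sketch: per-action context features (type weight, recency decay) plus a
# simple category-pattern signal. Weights and decay rate are illustrative.
import math
import time

ACTION_WEIGHTS = {"view": 1.0, "add_to_cart": 3.0, "purchase": 5.0}  # stronger actions weigh more

def context_features(actions, now=None):
    """actions: list of dicts like {"item": ..., "type": "view", "ts": unix_seconds, "category": ...}"""
    now = now or time.time()
    feats = []
    for a in actions:
        age_min = (now - a["ts"]) / 60.0
        feats.append({
            "item": a["item"],
            "type_weight": ACTION_WEIGHTS.get(a["type"], 1.0),
            "recency_decay": math.exp(-age_min / 10.0),  # 2-minute-old actions outweigh 20-minute-old ones
        })
    # Category pattern: share of actions in the most common category (focused vs. scattered browsing)
    cats = [a["category"] for a in actions]
    top_share = max(cats.count(c) for c in set(cats)) / len(cats) if cats else 0.0
    return feats, top_share
```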

⚠️ Trade-off: Longer sequence context improves accuracy but increases latency and memory. Most systems use the last 20 to 50 actions as a practical limit.

COMBINING WITH HISTORICAL PROFILES

Production systems blend session signals with long-term preferences. A typical approach weights them: score = 0.6 × session_score + 0.4 × historical_score. Early in a session (few actions), the historical score dominates. As the session grows, the session signal takes over.
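A sketch of that blending rule, with the session weight ramping up as actions accumulate; the 0.6/0.4 split mirrors the example above, while the ramp schedule itself is an illustrative assumption:

```python
# Sketch: blend session and historical scores, shifting weight toward the
# session as it accumulates actions. Ramp schedule is an illustrative choice.
def blended_score(session_score: float, historical_score: float, num_actions: int) -> float:
    session_weight = min(0.6, 0.1 * num_actions)   # few actions: trust history; cap at 0.6 after ~6 actions
    return session_weight * session_score + (1 - session_weight) * historical_score
```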

💡 Key Takeaways
Sessions are action sequences; models predict next item like predicting the next word in a sentence
RNNs or transformers encode actions into session embeddings compared against item embeddings
Context features: time between actions, action types, recency decay, category patterns
Sequence length limited to 20-50 actions for latency and memory reasons
Blend session and historical scores: 0.6 × session + 0.4 × historical, shifting as session grows
📌 Interview Tips
1. Walk through a sequence: laptop → case → stand → mouse → keyboard predicts peripherals
2. Explain the latency budget: 10-30 ms per inference on every new action
3. Discuss early vs late session: historical dominates with 2 actions, session dominates with 15