
How Session-Based Models Work

SESSION AS A SEQUENCE

A session is a sequence of user actions: page views, clicks, searches, add-to-cart events. Session-based models treat this sequence like a sentence and predict what comes next. If the last five actions were viewing a laptop, a laptop case, a laptop stand, a mouse, and a keyboard, the model predicts more accessories or peripherals. The key insight is that recent actions reveal current intent better than lifetime purchase history.
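As a rough illustration, the task can be framed exactly like next-word prediction; the item names below are placeholders, not a real catalog:

```python
# A minimal sketch of the next-item prediction framing.
# Item names are hypothetical placeholders, not real catalog IDs.
session = ["laptop", "laptop_case", "laptop_stand", "mouse", "keyboard"]

# Training examples are (prefix, next_item) pairs, like next-word prediction:
examples = [(session[:i], session[i]) for i in range(1, len(session))]
# e.g. (["laptop"], "laptop_case"), (["laptop", "laptop_case"], "laptop_stand"), ...
```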

ARCHITECTURE PATTERNS

Common architectures use recurrent neural networks or transformers to encode the action sequence. Each action becomes a vector (embedding), and the model processes these vectors in order to produce a session embedding representing current intent. This session embedding is compared against item embeddings to rank candidates. Inference runs on every new action, adding 10 to 30 ms of latency per request.
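A minimal sketch of this pattern, assuming PyTorch and a GRU encoder; the dimensions, class name, and dot-product scoring are illustrative choices rather than a specific production design:

```python
# Sketch: encode a session's action sequence into a session embedding,
# then score candidate items by dot product against item embeddings.
import torch
import torch.nn as nn

class SessionEncoder(nn.Module):
    def __init__(self, num_items: int, dim: int = 64):
        super().__init__()
        self.item_emb = nn.Embedding(num_items, dim)   # one vector per item
        self.gru = nn.GRU(dim, dim, batch_first=True)  # encodes the action sequence in order

    def forward(self, action_ids: torch.Tensor) -> torch.Tensor:
        # action_ids: (batch, seq_len) item IDs for the session so far
        vectors = self.item_emb(action_ids)            # (batch, seq_len, dim)
        _, hidden = self.gru(vectors)                  # final hidden state = session embedding
        return hidden.squeeze(0)                       # (batch, dim)

    def score(self, session_emb: torch.Tensor) -> torch.Tensor:
        # Rank candidates: similarity between session embedding and every item embedding
        return session_emb @ self.item_emb.weight.T    # (batch, num_items)
```

In this setup, only the encoder forward pass happens per request; item embeddings can be precomputed and served from an approximate nearest-neighbor index to stay within the latency budget.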

FEATURE ENGINEERING

Beyond the action sequence, models incorporate context features: time since last action (users who pause 5 minutes might be comparing prices), action type weights (purchases are stronger signals than views), recency decay (actions 2 minutes ago matter more than 20 minutes ago), and category patterns (3 electronics views in a row versus scattered browsing).
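An illustrative sketch of these context features; the specific weights and decay rate below are hypothetical choices, not values from the text:

```python
# Sketch: per-action context features (type weight, recency decay) plus a
# simple category-pattern signal. Weights and decay rate are illustrative.
import math
import time

ACTION_WEIGHTS = {"view": 1.0, "add_to_cart": 3.0, "purchase": 5.0}  # stronger actions weigh more

def context_features(actions, now=None):
    """actions: list of dicts like {"item": ..., "type": "view", "ts": unix_seconds, "category": ...}"""
    now = now or time.time()
    feats = []
    for a in actions:
        age_min = (now - a["ts"]) / 60.0
        feats.append({
            "item": a["item"],
            "type_weight": ACTION_WEIGHTS.get(a["type"], 1.0),
            "recency_decay": math.exp(-age_min / 10.0),  # 2-minute-old actions outweigh 20-minute-old ones
        })
    # Category pattern: share of actions in the most common category (focused vs. scattered browsing)
    cats = [a["category"] for a in actions]
    top_share = max(cats.count(c) for c in set(cats)) / len(cats) if cats else 0.0
    return feats, top_share
```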

⚠️ Trade-off: Longer sequence context improves accuracy but increases latency and memory. Most systems use the last 20 to 50 actions as a practical limit.

COMBINING WITH HISTORICAL PROFILES

Production systems blend session signals with long-term preferences. A typical approach weights them: score = 0.6 × session_score + 0.4 × historical_score. Early in a session (few actions), the historical score dominates. As the session grows, the session signal takes over.
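A sketch of that blending rule, with the session weight ramping up as actions accumulate; the 0.6/0.4 split mirrors the example above, while the ramp schedule itself is an illustrative assumption:

```python
# Sketch: blend session and historical scores, shifting weight toward the
# session as it accumulates actions. Ramp schedule is an illustrative choice.
def blended_score(session_score: float, historical_score: float, num_actions: int) -> float:
    session_weight = min(0.6, 0.1 * num_actions)   # few actions: trust history; cap at 0.6 after ~6 actions
    return session_weight * session_score + (1 - session_weight) * historical_score
```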

💡 Key Takeaways
Sessions are action sequences; models predict next item like predicting the next word in a sentence
RNNs or transformers encode actions into session embeddings compared against item embeddings
Context features: time between actions, action types, recency decay, category patterns
Sequence length limited to 20-50 actions for latency and memory reasons
Blend session and historical scores: 0.6 × session + 0.4 × historical, shifting as session grows
📌 Interview Tips
1. Walk through a sequence: laptop → case → stand → mouse → keyboard predicts peripherals
2. Explain the latency budget: 10-30 ms per inference on every new action
3. Discuss early vs late session: historical dominates with 2 actions, session dominates with 15