Progressive Profiling and Identity Resolution for User Cold Start
Two techniques accelerate user cold start: collecting lightweight preference signals during onboarding (progressive profiling) and linking fragmented sessions across devices into a unified identity graph (identity resolution). The goal is to cut the time to the first good recommendation from days to minutes while minimizing the friction that hurts signup conversion.
Progressive profiling asks new users for a small number of seed preferences through low-friction interactions rather than lengthy surveys. Spotify prompts users to tap 3 to 5 favorite artists or genres and immediately generates playlists from content embeddings, before any listening occurs. Netflix might show a grid of popular titles and ask users to like or dislike 5 to 10 of them to initialize taste clusters. The key trade-off is onboarding friction versus personalization speed: asking for 10 preferences can reduce signup completion by 10 to 20%, but it cuts the time to useful recommendations from multiple sessions to under 60 seconds. Optimal designs use one-tap interactions (thumbs up/down, a single choice from a visual grid) and ask further questions adaptively, only when uncertainty about the user's tastes is high.
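The seed-preference step can be sketched as follows. This is a minimal illustration, not any vendor's actual method: it assumes each artist already has a content embedding, averages the embeddings of the artists a user taps, and ranks the remaining artists by cosine similarity. The artist names, the 3-dimensional vectors, and the functions `seed_profile` and `recommend` are all hypothetical.

```python
import math

# Hypothetical 3-dimensional content embeddings (assumed for illustration);
# production systems learn much larger vectors from audio, text, or metadata.
ARTIST_EMBEDDINGS = {
    "artist_rock_a": [0.9, 0.1, 0.0],
    "artist_rock_b": [0.8, 0.2, 0.1],
    "artist_jazz_a": [0.1, 0.9, 0.1],
    "artist_pop_a": [0.2, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def seed_profile(selected):
    """Average the embeddings of the artists tapped during onboarding."""
    if not selected:
        raise ValueError("at least one seed preference is required")
    vectors = [ARTIST_EMBEDDINGS[name] for name in selected]
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def recommend(selected, k=2):
    """Rank artists the user did not select by similarity to the seed profile."""
    profile = seed_profile(selected)
    candidates = [name for name in ARTIST_EMBEDDINGS if name not in selected]
    candidates.sort(key=lambda name: cosine(profile, ARTIST_EMBEDDINGS[name]), reverse=True)
    return candidates[:k]


top_pick = recommend(["artist_rock_a"], k=1)
```

With a single rock artist selected, the nearest remaining artist in this toy data is the other rock artist. No listening history is involved, which is why this kind of initialization can run within the onboarding flow itself.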
Identity resolution tackles the problem of user histories fragmented across devices, browsers, and logged-out sessions. Without linking, a user who browses on mobile, then on desktop, then in a different browser appears as three separate cold-start users, each receiving suboptimal recommendations. Systems build identity graphs from deterministic keys (login email, account ID) plus privacy-compliant probabilistic signals (device fingerprints, behavioral patterns, IP address). A robust graph maintains unified profiles with recency-weighted features, deduplicates exposures (the same item is not shown twice across devices), and enables correct attribution for conversions and engagement.
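One way to picture the deterministic part of an identity graph is a union-find structure keyed on account and device identifiers. The sketch below is an assumption-laden illustration: the `IdentityGraph` class, the `unified_profiles` function, the identifier strings, and the 7-day half-life are hypothetical, and probabilistic linking (fingerprints, behavioral signals) is left out entirely.

```python
from collections import defaultdict


class IdentityGraph:
    """Union-find over identifiers; linked identifiers share one root."""

    def __init__(self):
        self.parent = {}

    def find(self, node):
        self.parent.setdefault(node, node)
        while self.parent[node] != node:
            # Path halving keeps lookups fast as the graph grows.
            self.parent[node] = self.parent[self.parent[node]]
            node = self.parent[node]
        return node

    def link(self, a, b):
        """Record a deterministic match, e.g. a login seen on a device."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def unified_profiles(graph, events, now_days, half_life_days=7.0):
    """Aggregate (device_id, category, day) events per resolved identity.

    Each event is weighted by 0.5 ** (age / half_life), so recent
    behavior counts more than old behavior.
    """
    profiles = defaultdict(lambda: defaultdict(float))
    for device_id, category, day in events:
        weight = 0.5 ** ((now_days - day) / half_life_days)
        profiles[graph.find(device_id)][category] += weight
    return profiles


graph = IdentityGraph()
graph.link("account:42", "device:mobile")    # hypothetical login on mobile
graph.link("account:42", "device:desktop")   # same account on desktop

events = [
    ("device:mobile", "camping", 0),   # one half-life old
    ("device:desktop", "camping", 7),  # today
    ("device:tablet", "books", 7),     # never logged in, stays separate
]
profiles = unified_profiles(graph, events, now_days=7)
```

In this toy data the mobile and desktop events collapse into one profile whose camping weight is 0.5 + 1.0, while the unlinked tablet remains its own cold-start identity until a deterministic or probabilistic signal connects it.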
Session-based models complement identity graphs by personalizing within a single session using only recent behavior. After just 2 to 3 interactions (searches, clicks, plays), the system updates a short-term intent model to adapt recommendations in real time. This is critical for anonymous users who cannot be linked: a logged-out Amazon visitor initially sees recommendations based on general priors, but after viewing three products in the camping category, subsequent recommendations shift to outdoor gear within the same session. These models typically use recurrent neural networks or transformers over the last 10 to 20 actions, with inference latency budgets of roughly 20 to 50 ms.
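The within-session shift can be illustrated with a much simpler stand-in for an RNN or transformer: exponentially decayed category counts over a bounded window of recent actions, blended with a popularity prior. Everything here is hypothetical, including the `SessionIntent` class, the prior values, the decay factor of 0.8, and the window of 20 actions.

```python
from collections import defaultdict, deque


class SessionIntent:
    """Short-term intent from recent actions, blended with a global prior.

    A deliberately simple substitute for a sequence model: it keeps only
    the last `max_actions` categories and weights newer actions more.
    """

    def __init__(self, prior, max_actions=20, decay=0.8, prior_weight=1.0):
        self.prior = prior
        self.actions = deque(maxlen=max_actions)  # oldest actions fall off
        self.decay = decay
        self.prior_weight = prior_weight

    def observe(self, category):
        self.actions.append(category)

    def scores(self):
        scores = defaultdict(float)
        for category, probability in self.prior.items():
            scores[category] += self.prior_weight * probability
        # The most recent action has age 0 and therefore weight 1.0.
        for age, category in enumerate(reversed(self.actions)):
            scores[category] += self.decay ** age
        total = sum(scores.values())
        return {category: value / total for category, value in scores.items()}

    def top_category(self):
        current = self.scores()
        return max(current, key=current.get)


# Hypothetical popularity prior shown to an anonymous visitor.
session = SessionIntent({"electronics": 0.5, "books": 0.3, "camping": 0.2})
before = session.top_category()

for _ in range(3):  # three camping product views in the same session
    session.observe("camping")
after = session.top_category()
```

Before any interaction the prior dominates; after three camping views the decayed session signal outweighs it. A learned sequence model would capture order and item-level detail that this sketch ignores, but the update pattern (observe, rescore, rerank) is the same.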
💡 Key Takeaways
• Progressive profiling uses lightweight onboarding interactions (selecting 3 to 5 artists, liking 5 to 10 titles) to generate initial embeddings, reducing the time to the first good recommendation from days to under 60 seconds
• Onboarding friction trade-off: asking for 10 preferences can drop signup completion by 10 to 20% but dramatically improves early-session quality; optimal designs use one-tap choices and adaptive questioning
• Identity resolution links devices, browsers, and sessions into unified user profiles using deterministic keys (login email, account ID) and privacy-safe probabilistic signals (device fingerprints, behavioral consistency)
• Session-based models personalize in real time and begin adapting after as few as 2 to 3 interactions, which is critical for anonymous users; they are typically implemented with RNNs or transformers over the last 10 to 20 actions, with inference in under 50 ms
• Unified identity graphs enable deduplicated exposures across devices (the same item is not shown twice), correct attribution for conversions, and recency-weighted feature aggregation for personalization
📌 Examples
Spotify onboarding: a new user taps 5 favorite artists, and the system generates 3 personalized playlists via artist embedding similarity before any plays, reducing cold start from hours to seconds
Netflix taste profiling: a new account rates 10 titles from a visual grid during signup, which initializes a collaborative neighborhood within a taste cluster, so the first homepage shows personalized rows immediately
Amazon identity graph: links a mobile app session, a desktop browser, and a tablet using the login email plus probabilistic device signals, and maintains a unified browsing and purchase history with recency weighting