
Failure Modes: Cold Start and Popularity Bias

Core Concept
Collaborative filtering has two fundamental failure modes: cold start (new users/items with no interactions) and popularity bias (head items dominate recommendations). Both stem from the same root cause: CF only learns from observed interactions.

Cold Start Problem

User cold start: New users have no interaction history. CF cannot place them in similarity space. Common mitigations: ask for explicit preferences during onboarding, use demographic features to bootstrap, or fall back to popularity-based recommendations until 5-10 interactions are collected.
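A minimal sketch of this routing logic; the 5-interaction threshold, the `onboarding_interests` store, and the popularity fallback are illustrative assumptions, not a specific system's API:

```python
import numpy as np

COLD_START_THRESHOLD = 5  # lower end of the 5-10 interaction range above (assumption)

def recommend(user_id, history, user_vectors, item_vectors,
              onboarding_interests, interest_vectors, popular_items, k=10):
    """Route a user to CF, onboarding-bootstrapped CF, or a popularity fallback."""
    if len(history) >= COLD_START_THRESHOLD and user_id in user_vectors:
        u = user_vectors[user_id]  # warm user: learned CF vector
    elif user_id in onboarding_interests:
        # Bootstrap: average the latent vectors of categories picked at onboarding.
        u = np.mean([interest_vectors[c] for c in onboarding_interests[user_id]], axis=0)
    else:
        return popular_items[:k]   # no signal at all: popularity fallback
    scores = item_vectors @ u      # dot-product relevance over the full catalog
    return np.argsort(-scores)[:k].tolist()
```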

Item cold start: New items have no ratings. CF cannot learn their latent factors. Mitigations: use content features to initialize item vectors, boost exposure through random exploration, or use a separate model for new items until they collect enough interactions.
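One common way to bootstrap item vectors from content is to fit a projection from content features into the CF latent space on warm items, then apply it to new ones. A sketch under that assumption, with hypothetical names:

```python
import numpy as np

def fit_content_projection(warm_features, warm_item_vectors):
    """Least-squares fit of W so that warm_features @ W ~ warm_item_vectors.

    warm_features: (n_warm, n_content_features) content matrix for items
        that already have learned CF vectors.
    warm_item_vectors: (n_warm, latent_dim) learned CF item factors.
    """
    W, *_ = np.linalg.lstsq(warm_features, warm_item_vectors, rcond=None)
    return W

def init_cold_item(features, W):
    """Estimate a latent vector for an item with no interactions yet."""
    return features @ W
```

Once the item accumulates interactions, retraining replaces this estimate with a properly learned vector; the exploration boost gives it the chance to do so.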

Popularity Bias

Popular items appear in more training examples, so the model learns them better. Better predictions lead to more recommendations, which lead to more interactions, which reinforce the bias. Long-tail items never get enough exposure to be learned properly. This creates a rich-get-richer feedback loop.

Mitigations: popularity-weighted negative sampling during training, explicit diversity constraints in serving, and regular A/B tests comparing tail-item exposure across model versions.
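A sketch of popularity-weighted negative sampling: negatives are drawn proportional to popularity raised to a tempering exponent, so head items absorb more negative gradient than a uniform sampler would give them. The β = 0.75 default is the word2vec convention, used here as an assumption:

```python
import numpy as np

def make_negative_sampler(item_counts, beta=0.75, seed=0):
    """Sample negative items proportional to popularity**beta.

    Popular items are drawn as negatives more often, pushing their scores
    down for users who did not interact with them; beta < 1 tempers the skew.
    """
    rng = np.random.default_rng(seed)
    probs = item_counts.astype(float) ** beta
    probs /= probs.sum()
    return lambda n: rng.choice(len(item_counts), size=n, p=probs)

# Example: item 0 is 100x more popular than the rest, so it shows up
# as a negative far more often than under uniform sampling.
counts = np.array([1000, 10, 10, 10])
sampler = make_negative_sampler(counts)
negatives = sampler(5)
```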

⚠️ Interview Deep-Dive: "How do you solve cold start?" is guaranteed to come up. Walk through the full picture: content-based fallback for new items, onboarding for new users, exploration budget to collect data, and hybrid models that combine CF with content features.
💡 Key Takeaways
- User cold start: no history means a random or average vector; first-session engagement is 30-50% lower. Fix: onboarding signals (3-5 interests) and demographic-based initialization
- Item cold start: no interactions means a random vector; the item is never shown, never gets interactions, and stays cold forever. Fix: content features for the initial vector plus 5-10% exploration slots
- Popularity bias: popular items appear in more histories, the model learns to recommend them broadly, they get more interactions and become more recommended. A feedback loop
- Niche items get trapped: 100 interactions means connections to only 100 users, so the model recommends only to very similar users. The item never breaks out even if a broader audience would love it
- Training fix: up-weight rare items (if item A has 1000x more data than item B, give B 1000x the weight per interaction). Serving fix: final_score = relevance - α × log(popularity) (see the sketch after this list)
- Monitor coverage: if under 20-30% of the catalog gets any impressions, the feedback loop is winning. Track the Gini coefficient of the impression distribution (also sketched below)
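Sketches of the serving fix and the monitoring from the takeaways above; α, the popularity floor, and any alert thresholds are tuning choices, not prescribed values:

```python
import numpy as np

def penalized_scores(relevance, popularity, alpha=0.1):
    """final_score = relevance - alpha * log(popularity), as in the takeaway.

    Popularity is floored at 1 to avoid log(0) and to keep near-zero-popularity
    items from getting a spurious boost from a negative log term.
    """
    return relevance - alpha * np.log(np.maximum(popularity, 1))

def catalog_coverage(impressions):
    """Fraction of catalog items with at least one impression."""
    return float(np.mean(np.asarray(impressions) > 0))

def gini(impressions):
    """Gini coefficient of impressions: 0 = perfectly even exposure; values
    near 1 mean a few head items get almost everything (the loop is winning)."""
    x = np.sort(np.asarray(impressions, dtype=float))
    n = len(x)
    return 2 * np.sum(np.arange(1, n + 1) * x) / (n * x.sum()) - (n + 1) / n
```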
📌 Interview Tips
1. For capacity planning, give concrete numbers: 100M users × 64 dims × 4 bytes ≈ 25.6GB of user embeddings, similar for items; total memory with replicas can reach hundreds of GB (see the helper below).
2. When asked about index updates, explain daily batch rebuilds with optional hourly incremental updates for high-velocity items (new releases, trending content).
3. For QPS calculations, mention that with 20+ replicas per shard, systems can handle 50K+ QPS globally while maintaining <10ms p95 ANN latency.
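A back-of-envelope helper for the numbers in tip 1; the replica count and the users-plus-items doubling are illustrative assumptions:

```python
def embedding_memory_gb(num_rows, dims=64, bytes_per_float=4, replicas=1):
    """Rough memory footprint of an embedding table, in GB."""
    return num_rows * dims * bytes_per_float * replicas / 1e9

users_gb = embedding_memory_gb(100_000_000)  # 100M x 64 x 4 bytes = 25.6 GB per replica
# With 4 replicas (assumed) and a similar-sized item table, total lands in the 100s of GB:
total_gb = embedding_memory_gb(100_000_000, replicas=4) * 2  # ~205 GB
```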