Trade-offs and When to Use Two-Tower
Two-Tower Wins When
Catalog exceeds 100K items: At this scale, scoring every item per request becomes impractical. Two-tower with ANN search is the standard way to retrieve from millions of items in milliseconds. Any architecture that must run a network over every user-item pair at request time cannot serve a candidate pool this large in real time.
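To make the retrieval step concrete, here is a minimal sketch of exact top-k retrieval with numpy (embedding dimension, catalog size, and k are assumed for illustration). The linear scan below is precisely what an ANN index such as FAISS, ScaNN, or hnswlib replaces with a sublinear lookup once the catalog reaches millions of items:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64  # embedding dimension, assumed for illustration
item_embs = rng.standard_normal((100_000, DIM)).astype(np.float32)

def retrieve_top_k(user_emb: np.ndarray, k: int = 100) -> np.ndarray:
    """Exact retrieval: score every item, keep the k best.

    This O(n) scan is what an ANN index replaces at scale.
    """
    scores = item_embs @ user_emb             # one dot product per item
    top_k = np.argpartition(scores, -k)[-k:]  # unordered top-k in O(n)
    return top_k[np.argsort(scores[top_k])[::-1]]  # sort only the k winners

user_emb = rng.standard_normal(DIM).astype(np.float32)
candidates = retrieve_top_k(user_emb, k=100)
```

Because the scan only needs item embeddings and a dot product, swapping it for an ANN index changes the lookup, not the model.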
You need content features: Matrix factorization only learns from interactions, so new items with zero history get random embeddings. Two-tower item towers can use content features (title, category, images) to embed new items meaningfully from day one. Cold start is less severe.
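A minimal sketch of an item tower that embeds a brand-new item from content alone (the feature set, sizes, and random weights here are assumptions; in practice the parameters are trained jointly with the user tower):

```python
import numpy as np

rng = np.random.default_rng(0)
N_CATEGORIES, TEXT_DIM, EMB_DIM = 50, 32, 64  # sizes assumed for illustration

# Parameters are random here; a real tower learns them during training.
category_table = rng.standard_normal((N_CATEGORIES, EMB_DIM)) * 0.1
W_text = rng.standard_normal((TEXT_DIM, EMB_DIM)) * 0.1

def embed_item(category_id: int, title_vec: np.ndarray) -> np.ndarray:
    """Item tower: content features in, embedding out.

    An item with zero interaction history still gets a meaningful
    embedding because the tower reads its category and title, not its ID.
    """
    h = category_table[category_id] + title_vec @ W_text
    return h / np.linalg.norm(h)  # unit-norm so dot product = cosine

new_item = embed_item(category_id=3, title_vec=rng.standard_normal(TEXT_DIM))
```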
Latency is critical: If you need results in under 50ms, the two-tower architecture shines. User embedding (5ms) plus ANN search (10ms) beats any architecture that must score user-item pairs jointly.
Two-Tower Loses When
Cross-features matter: This is the fundamental limitation: the user and item towers never see each other's inputs. You cannot learn "this user prefers items priced 20% below their historical average" because that requires knowing both user history and item price simultaneously. If cross-features drive your business value, two-tower retrieval must feed into a ranking model that can capture these interactions.
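The relative-price example can be written as a one-line cross-feature that a downstream ranker could consume (the function name and the specific prices are illustrative, not from any particular system):

```python
def price_cross_feature(item_price: float, user_avg_price: float) -> float:
    """Relative price: requires user history AND item price at once,
    which neither tower can see alone."""
    return (item_price - user_avg_price) / user_avg_price

# An item priced 20% below this user's historical average:
f = price_cross_feature(item_price=40.0, user_avg_price=50.0)  # -0.2
```

Features like this can only be computed once a candidate and a user are paired, which is exactly what the two-tower split forbids.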
Catalog is small: With under 10K items, you can score all items per request using a neural collaborative filtering model. The added complexity of ANN indexes and separate towers provides no benefit. A simple dot-product model scores 10K items in 1ms.
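A quick sketch of why the simple path wins at this scale (catalog size and embedding dimension are assumed): scoring and fully ranking 10K items is a single matrix-vector product, with no index to build or keep in sync.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
items = rng.standard_normal((10_000, 64)).astype(np.float32)  # sizes assumed
user = rng.standard_normal(64).astype(np.float32)

start = time.perf_counter()
scores = items @ user          # score the entire catalog exactly
ranking = np.argsort(-scores)  # full ordering, no ANN index needed
elapsed_ms = (time.perf_counter() - start) * 1000
```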
Interactions are sparse: Two-tower models need substantial training data. With fewer than 100K interactions, simpler models like matrix factorization or nearest-neighbor baselines often outperform. The neural towers lack enough signal to learn meaningful embeddings.
The Hybrid Pattern
Most production systems use two-tower for retrieval (find 1000 candidates from 10M items) then a ranking model for final ordering. The ranking model sees both user features and item features together and can learn cross-features. It only scores 1000 items, so it can be slower and more complex.
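The two-stage pattern can be sketched end to end (catalog size, dimensions, and the toy linear ranker with hand-picked weights are all assumptions; a production ranker would be a learned model over many cross-features):

```python
import numpy as np

rng = np.random.default_rng(0)
N_ITEMS, DIM = 100_000, 64  # sizes assumed for illustration
item_embs = rng.standard_normal((N_ITEMS, DIM)).astype(np.float32)
item_price = rng.uniform(5, 200, N_ITEMS)

def retrieve(user_emb: np.ndarray, k: int = 1000) -> np.ndarray:
    """Stage 1: two-tower scores narrow the catalog to k candidates.

    Exact search here; at 10M items an ANN index does this step.
    """
    scores = item_embs @ user_emb
    return np.argpartition(scores, -k)[-k:]

def rank(candidates, user_emb, user_avg_price):
    """Stage 2: the ranker sees user and item features together, so it
    can use cross-features like relative price. Toy linear score."""
    sim = item_embs[candidates] @ user_emb
    rel_price = (item_price[candidates] - user_avg_price) / user_avg_price
    score = sim - 0.5 * np.abs(rel_price)  # hand-picked weights, not learned
    return candidates[np.argsort(-score)]

user_emb = rng.standard_normal(DIM).astype(np.float32)
final = rank(retrieve(user_emb), user_emb, user_avg_price=50.0)
```

Only 1000 items reach stage 2, so the ranker can afford cross-features and a heavier model without blowing the latency budget.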