How Collaborative Filtering Works
Computing User Similarity
Take two users and look at the items they both rated. If they have, say, 20 items in common, compute how similar their ratings on those items are. Common measures: Pearson correlation (how much the ratings move together), cosine similarity (the angle between the rating vectors), or Jaccard similarity (overlap in the sets of items rated positively).
Pearson correlation is popular because it handles rating bias. If user A rates everything 1 point higher than user B on average, Pearson still detects they have similar tastes because it measures correlation, not absolute agreement. Cosine similarity is faster to compute and works well when ratings are normalized.
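A minimal sketch of both measures, assuming ratings are stored as nested user-to-item dicts; the data layout and the tiny example ratings are illustrative, not from any particular library:

```python
import math

# user -> {item: rating}; toy data for illustration only
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 4, "m2": 2, "m3": 3, "m4": 5},
}

def pearson(u, v):
    """Pearson correlation over the items both users rated."""
    common = sorted(ratings[u].keys() & ratings[v].keys())
    if len(common) < 2:
        return 0.0  # not enough overlap to estimate correlation
    ru = [ratings[u][i] for i in common]
    rv = [ratings[v][i] for i in common]
    mu, mv = sum(ru) / len(ru), sum(rv) / len(rv)
    num = sum((a - mu) * (b - mv) for a, b in zip(ru, rv))
    den = (math.sqrt(sum((a - mu) ** 2 for a in ru))
           * math.sqrt(sum((b - mv) ** 2 for b in rv)))
    return num / den if den else 0.0

def cosine(u, v):
    """Cosine similarity, treating unrated items as zeros."""
    common = ratings[u].keys() & ratings[v].keys()
    num = sum(ratings[u][i] * ratings[v][i] for i in common)
    du = math.sqrt(sum(r * r for r in ratings[u].values()))
    dv = math.sqrt(sum(r * r for r in ratings[v].values()))
    return num / (du * dv) if du and dv else 0.0

print(pearson("alice", "bob"))  # 1.0
print(cosine("alice", "bob"))   # ~0.73
```

Note that bob rates every shared item exactly one point lower than alice, yet Pearson returns 1.0, which is the rating-bias point above in action.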
Making Predictions
To predict user U's rating for item I: (1) Find the K most similar users to U who have rated item I. (2) Take a weighted average of their ratings, weighted by similarity. If the 5 most similar users rated the item 4, 5, 4, 5, 3 and their similarity scores are 0.9, 0.8, 0.85, 0.75, 0.7, the prediction is (4*0.9 + 5*0.8 + 4*0.85 + 5*0.75 + 3*0.7) / (0.9 + 0.8 + 0.85 + 0.75 + 0.7) = 16.85 / 4.0 ≈ 4.21.
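The weighted average itself is a one-liner; this sketch just reproduces the numbers from the worked example (the function name is illustrative):

```python
def predict(neighbor_ratings, similarities):
    """Similarity-weighted average of neighbor ratings.

    Dividing by the sum of similarities means the weights
    do not need to sum to 1.
    """
    num = sum(r * s for r, s in zip(neighbor_ratings, similarities))
    den = sum(similarities)
    return num / den if den else 0.0

# The five neighbors from the worked example above.
print(predict([4, 5, 4, 5, 3], [0.9, 0.8, 0.85, 0.75, 0.7]))  # 4.2125
```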
Choosing K matters. Too small (e.g., K=5) makes predictions noisy because a single unusual neighbor can dominate. Too large (e.g., K=100) dilutes the signal with weakly similar users. Most systems use K=20-50; production systems tune K on held-out validation data, as in the sketch below.
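A hedged sketch of that tuning loop, assuming a `predict_rating(user, item, k)` function like the one above and a validation set of (user, item, true_rating) triples; both names are hypothetical stand-ins, not a specific API:

```python
def rmse_for_k(k, val_set, predict_rating):
    """Root-mean-square error of K-neighbor predictions on held-out data."""
    errs = [(predict_rating(u, i, k) - r) ** 2 for u, i, r in val_set]
    return (sum(errs) / len(errs)) ** 0.5

def tune_k(candidates, val_set, predict_rating):
    """Pick the K with the lowest validation RMSE."""
    return min(candidates, key=lambda k: rmse_for_k(k, val_set, predict_rating))

# Example usage (val_set and predict_rating are assumed to exist):
# best_k = tune_k(range(5, 101, 5), val_set, predict_rating)
```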
Item-Based Alternative
Instead of finding similar users, find similar items. To predict user U's rating for item I: (1) Find items similar to I that U has already rated. (2) Take a weighted average of those ratings, weighted by item similarity. If U rated items J, K, L with ratings 5, 4, 5, and their similarities to I are 0.85, 0.6, 0.9, the prediction is (5*0.85 + 4*0.6 + 5*0.9) / (0.85 + 0.6 + 0.9) = 11.15 / 2.35 ≈ 4.74.
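The same weighted-average arithmetic, item-based this time; the dict layout is illustrative and the values are just the numbers from the example:

```python
# U's existing ratings of items J, K, L ...
user_ratings = {"J": 5, "K": 4, "L": 5}
# ... and each item's precomputed similarity to the target item I.
sims_to_I = {"J": 0.85, "K": 0.6, "L": 0.9}

num = sum(user_ratings[j] * sims_to_I[j] for j in user_ratings)
den = sum(sims_to_I[j] for j in user_ratings)
print(num / den)  # ~4.74
```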
Item similarities are precomputed offline. This makes prediction fast: just look up the precomputed similarities and do the weighted averaging. User-based filtering, by contrast, requires computing similarities at request time or maintaining a constantly updated cache. For catalogs with millions of items, item-based is typically the more practical choice.
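A minimal sketch of that offline step, using cosine similarity between item rating vectors; the dict layout and toy data are illustrative, and a real system would use sparse matrices and keep only the top-N neighbors per item:

```python
import math
from itertools import combinations

# item -> {user: rating}; toy data for illustration only
item_ratings = {
    "m1": {"alice": 5, "bob": 4},
    "m2": {"alice": 3, "bob": 2},
    "m3": {"alice": 4, "carol": 5},
}

def item_cosine(a, b):
    """Cosine similarity between two items' rating vectors."""
    common = item_ratings[a].keys() & item_ratings[b].keys()
    num = sum(item_ratings[a][u] * item_ratings[b][u] for u in common)
    da = math.sqrt(sum(r * r for r in item_ratings[a].values()))
    db = math.sqrt(sum(r * r for r in item_ratings[b].values()))
    return num / (da * db) if da and db else 0.0

# Precompute once offline; request-time prediction is then just
# table lookups plus the weighted average shown earlier.
sim_table = {(a, b): item_cosine(a, b)
             for a, b in combinations(item_ratings, 2)}
print(sim_table)
```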