Embedding-Based Similarity Features: EmbClickSim and EmbSkipSim
EmbClickSim: Similarity to Clicked Items
For each candidate item, compute the similarity between its embedding vector and the vectors of items the user clicked. Similarity is measured by cosine: how closely two vectors point in the same direction (1.0 = identical direction, 0 = orthogonal, i.e. unrelated, -1.0 = opposite direction). Formula: EmbClickSim = max(cosine similarity to each clicked item). A high EmbClickSim means the candidate resembles something the user liked. Example: if the user clicked hiking boots, trail shoes and hiking poles score high.
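A minimal sketch of the formula above, in plain Python. The function names `cosine` and `emb_click_sim`, the list-of-floats vector representation, and the choice to return 0.0 for zero-length vectors or an empty click history are illustrative assumptions, not part of the original description.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat a zero vector as "unrelated"
    return dot / (norm_a * norm_b)

def emb_click_sim(candidate, clicked):
    """Max cosine similarity between the candidate and any clicked item."""
    if not clicked:
        return 0.0  # assumption: no click history means no signal
    return max(cosine(candidate, c) for c in clicked)
```

Because the feature takes the max, a single closely matching click is enough to give the candidate a high score, even if the user's other clicks are unrelated.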
EmbSkipSim: Similarity to Skipped Items
Skipped items (shown but not clicked) suggest negative preference. EmbSkipSim measures the candidate's similarity to these skipped items. A high EmbSkipSim is a negative signal: the candidate resembles things the user passed over. If the user saw sandals and didn't click, similar sandals should score lower. This helps avoid showing more of what the user has already ignored.
Combining the Signals
The ranker combines both: boost = w1 × EmbClickSim - w2 × EmbSkipSim. Typical weights: w1 = 0.6 to 0.8 (clicks are a strong positive signal), w2 = 0.2 to 0.4 (skips are a weaker negative signal, since users skip for many reasons). Items similar to clicks but dissimilar from skips get the strongest boost.
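The combination can be sketched as a one-line function. The name `embedding_boost` is hypothetical, and the defaults w1 = 0.7 and w2 = 0.3 are simply midpoints of the ranges quoted above, chosen for illustration; in practice the weights would be tuned.

```python
def embedding_boost(emb_click_sim, emb_skip_sim, w1=0.7, w2=0.3):
    """Linear combination of the two features: reward click similarity,
    penalize skip similarity. Default weights are illustrative midpoints."""
    return w1 * emb_click_sim - w2 * emb_skip_sim

# A candidate close to clicked items and far from skipped ones...
strong = embedding_boost(0.9, 0.1)
# ...outranks one that resembles both clicks and skips.
mixed = embedding_boost(0.9, 0.8)
```

With w2 smaller than w1, a candidate that resembles both a click and a skip still gets a positive boost, just a reduced one. This matches the idea that skips are the weaker signal.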
Implementation
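Optimization: instead of comparing the candidate against each click, compare it against a session embedding (the average of the click vectors). That is one comparison instead of N, at the cost of replacing the max over clicks with similarity to their average, which is an approximation. For skips, use only the last 10 rather than all of them. Pre-compute item vectors offline; the real-time path only does vector lookups and similarity math.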
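A rough sketch of this optimized path, under several assumptions not stated in the text: vectors are plain lists of floats, `skip_vectors` is ordered oldest to newest, EmbSkipSim is taken as the max over the retained skips (mirroring EmbClickSim), and the names `session_embedding` and `fast_features` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def session_embedding(click_vectors):
    """Element-wise mean of the click vectors, or None if there are no clicks."""
    if not click_vectors:
        return None
    n = len(click_vectors)
    return [sum(dim) / n for dim in zip(*click_vectors)]

def fast_features(candidate, click_vectors, skip_vectors, max_skips=10):
    """Return (click_sim, skip_sim) using the cheaper approximations."""
    # One comparison against the session average instead of N comparisons.
    session = session_embedding(click_vectors)
    click_sim = cosine(candidate, session) if session is not None else 0.0
    # Keep only the most recent skips (assumes oldest-to-newest ordering).
    recent_skips = skip_vectors[-max_skips:]
    skip_sim = max((cosine(candidate, s) for s in recent_skips), default=0.0)
    return click_sim, skip_sim
```

In a real system the session embedding would typically be computed once per request and reused across all candidates. Here it is recomputed inside `fast_features` only to keep the sketch short.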