Pairwise Ranking: Learning Relative Order From Item Comparisons
How Pairwise Training Works
Training data consists of item pairs with a known ordering: for a given query, item A is more relevant than item B. The model scores each item independently. If score(A) < score(B) even though A should rank higher, the loss is large, and the gradient update pushes score(A) up relative to score(B). After training on many such pairs, sorting items by score tends to reproduce the desired ordering.
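The loop described above can be sketched in a few lines. This is a minimal illustration, not a production trainer: it assumes a linear scoring model, a logistic pairwise loss, and plain gradient descent, and the names score and train_pairwise, the toy feature vectors, and the learning rate are all hypothetical.

```python
import math

def score(weights, features):
    # Linear scoring model (an assumption made for this sketch).
    return sum(w * x for w, x in zip(weights, features))

def train_pairwise(pairs, n_features, lr=0.1, epochs=50):
    """pairs: list of (better_features, worse_features) tuples."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = score(weights, better) - score(weights, worse)
            # Logistic pairwise loss: log(1 + exp(-diff)).
            # Its derivative with respect to diff is -1 / (1 + exp(diff)),
            # which is largest in magnitude when the pair is mis-ordered.
            grad_scale = -1.0 / (1.0 + math.exp(diff))
            for i in range(n_features):
                weights[i] -= lr * grad_scale * (better[i] - worse[i])
    return weights

# Toy data: in every pair, the first item should rank above the second.
pairs = [
    ([1.0, 0.2], [0.3, 0.9]),
    ([0.8, 0.1], [0.2, 0.4]),
    ([0.9, 0.5], [0.4, 0.6]),
]
weights = train_pairwise(pairs, n_features=2)

# At inference time, ranking is just a sort by score.
items = [[0.3, 0.9], [1.0, 0.2], [0.2, 0.4]]
ranked = sorted(items, key=lambda x: score(weights, x), reverse=True)
print(ranked)
```

Note that the model never sees a pair at inference time: it scores items one by one, and the pairwise structure exists only in the loss.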
Common Loss Functions
RankNet: Converts the score difference into a probability that A beats B via a sigmoid (larger gap = higher confidence), then applies cross-entropy loss. Confidently wrong pairs receive the largest penalty.
LambdaRank: Weights each pair by how much swapping the two items would change a ranking metric such as NDCG. Swapping positions 1 and 2 hurts more than swapping 50 and 51, so pairs involving top positions get higher training weight.
Margin loss: Requires score(A) - score(B) > margin (e.g., 0.5) when A is better, and adds a penalty that grows with the shortfall when the margin is not met.
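To make the three losses concrete, here is a rough sketch of each as a standalone function. The function names are hypothetical, and the LambdaRank part is simplified: it computes the pair weight as the absolute change in unnormalized DCG when two items swap positions, whereas actual implementations typically use the change in NDCG and apply it to the pair's gradient rather than to a loss value.

```python
import math

def ranknet_loss(s_a, s_b):
    """Cross-entropy on P(A beats B) = sigmoid(s_a - s_b), with target 1."""
    diff = s_a - s_b
    # Equals log(1 + exp(-diff)), written to stay stable for large |diff|.
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))

def margin_loss(s_a, s_b, margin=0.5):
    """Hinge-style penalty: zero once A leads B by at least the margin."""
    return max(0.0, margin - (s_a - s_b))

def dcg_gain(label, position):
    """Contribution of one item to DCG; position is 1-indexed."""
    return (2 ** label - 1) / math.log2(position + 1)

def swap_weight(label_a, pos_a, label_b, pos_b):
    """Absolute change in DCG if the items at pos_a and pos_b trade places."""
    before = dcg_gain(label_a, pos_a) + dcg_gain(label_b, pos_b)
    after = dcg_gain(label_a, pos_b) + dcg_gain(label_b, pos_a)
    return abs(after - before)

# A correctly ordered pair costs less than a mis-ordered one.
print(ranknet_loss(2.0, 0.0), ranknet_loss(0.0, 2.0))
# The margin loss vanishes once the gap exceeds the margin.
print(margin_loss(1.0, 0.0), margin_loss(0.2, 0.0))
# A swap at the top of the list moves DCG far more than one deep in the list.
print(swap_weight(1, 1, 0, 2), swap_weight(1, 50, 0, 51))
```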
Pair Sampling Strategies
Not all pairs teach equally. Easy pairs (highly relevant vs. clearly irrelevant) add little once the model orders them correctly. Hard pairs (similar relevance, confusing features) drive most of the learning. Common strategies: sample pairs the model currently orders incorrectly; weight pairs by position importance (swaps within the top 10 matter more); exclude pairs where both items have the same label, since they carry no ordering signal.
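The three strategies can be combined in a single pair-building step. The sketch below is one possible arrangement, not a standard recipe: build_pairs, the (item_id, label, model_score) tuple layout, and the top_k and top_weight parameters are all hypothetical, and it prioritizes mis-ordered pairs by sorting them first rather than by random sampling.

```python
from itertools import combinations

def build_pairs(items, top_k=10, top_weight=2.0):
    """items: list of (item_id, label, model_score) for a single query.
    Returns (better_id, worse_id, weight) tuples, hardest pairs first."""
    # Current 1-indexed positions under the model's own scores.
    order = sorted(items, key=lambda it: it[2], reverse=True)
    position = {it[0]: rank for rank, it in enumerate(order, start=1)}

    pairs = []
    for x, y in combinations(items, 2):
        if x[1] == y[1]:
            continue  # same label: no ordering to learn from this pair
        better, worse = (x, y) if x[1] > y[1] else (y, x)
        # The model is wrong on this pair if the better item does not score higher.
        misordered = better[2] <= worse[2]
        # Up-weight pairs that touch the top of the current ranking.
        in_top = min(position[better[0]], position[worse[0]]) <= top_k
        weight = top_weight if in_top else 1.0
        pairs.append((better[0], worse[0], weight, misordered))

    # Mis-ordered pairs first, then higher-weight pairs.
    pairs.sort(key=lambda p: (not p[3], -p[2]))
    return [(b, w, wt) for b, w, wt, _ in pairs]

# Toy query: labels are relevance grades, scores are current model outputs.
items = [("a", 2, 0.1), ("b", 0, 0.9), ("c", 1, 0.5), ("d", 1, 0.4)]
training_pairs = build_pairs(items, top_k=2)
for pair in training_pairs:
    print(pair)
```

Truncating the returned list keeps only the hardest, highest-impact pairs, which is one simple way to cap training cost per query.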