How Position Bias Distorts Training Data

EXAMINATION VERSUS RELEVANCE
A click requires two things: the user must see the item (examination) and the item must appeal to them (relevance). Under this model, click probability = examination probability × relevance. Position affects examination probability but not relevance. Position 1 might have a 95% examination probability; position 10 might have 20%. If items at both positions get a 10% click-through rate (CTR), the item at position 10 is actually much more relevant: it converted 50% of the users who saw it (10% / 20%), versus about 10.5% at position 1 (10% / 95%).
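
The arithmetic above can be checked with a short sketch. The function name `estimated_relevance` and the `EXAMINATION` table are illustrative assumptions; the probabilities are the example figures from the text, not measured values.

```python
def estimated_relevance(ctr: float, examination_prob: float) -> float:
    """Relevance implied by the model CTR = examination * relevance."""
    if not 0.0 < examination_prob <= 1.0:
        raise ValueError("examination_prob must be in (0, 1]")
    if not 0.0 <= ctr <= examination_prob:
        raise ValueError("ctr must be between 0 and examination_prob")
    return ctr / examination_prob


# Example examination probabilities from the text (illustrative only).
EXAMINATION = {1: 0.95, 10: 0.20}

for position, exam in EXAMINATION.items():
    relevance = estimated_relevance(0.10, exam)
    print(f"position {position}: implied relevance = {relevance:.3f}")
```

The same 10% CTR maps to very different implied relevance once it is divided by the examination probability of the slot.
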
MEASURING POSITION EFFECT
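To measure position bias, run randomization experiments: show the same item in different positions to different users and measure the click rate at each position. Because the item is fixed and the position is assigned at random, any difference in click rate comes from position alone. You might find a curve like this, expressed relative to position 1: position 2 gets 70% of the position 1 click rate, position 3 gets 50%, position 5 gets 25%, and position 10 gets 10%. This curve is your position bias model. The exact shape varies by product (search results, feeds, grids), but click rates consistently fall as items move further down the ranking.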
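
A minimal sketch of turning randomized-experiment logs into such a curve follows. It assumes impressions arrive as `(position, clicked)` pairs collected only from randomized traffic; the function name `position_bias_curve` and the synthetic counts are hypothetical.

```python
from collections import defaultdict


def position_bias_curve(impressions):
    """Return {position: CTR relative to position 1}.

    `impressions` is an iterable of (position, clicked) pairs that must
    come from traffic where positions were assigned at random.
    """
    shown = defaultdict(int)
    clicks = defaultdict(int)
    for position, clicked in impressions:
        shown[position] += 1
        clicks[position] += int(clicked)

    ctr = {p: clicks[p] / shown[p] for p in shown}
    base = ctr.get(1)
    if not base:
        raise ValueError("need position-1 impressions with at least one click")
    return {p: ctr[p] / base for p in sorted(ctr)}


# Synthetic data: 100 impressions per position with 20, 14, and 10 clicks.
sample = (
    [(1, True)] * 20 + [(1, False)] * 80
    + [(2, True)] * 14 + [(2, False)] * 86
    + [(3, True)] * 10 + [(3, False)] * 90
)
curve = position_bias_curve(sample)
for position, ratio in curve.items():
    print(f"position {position}: {ratio:.2f} of position-1 CTR")
```

Normalizing by position 1 keeps the curve comparable across items with different absolute click rates.
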
SELECTION BIAS COMPOUNDS THE PROBLEM
Selection bias means users with certain preferences are more likely to see certain items. If sports fans mostly see sports content at the top (because past models learned to show it to them), their clicks teach the model that sports content is universally popular. In fact it only looks popular because sports fans were overrepresented in the training data. Together, selection bias and position bias produce severely distorted models.
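
The sports example can be made concrete with a small calculation. Every number below is hypothetical, chosen only to show how a skewed audience mix in the logs inflates the apparent popularity of an item.

```python
def blended_ctr(segment_share, segment_ctr):
    """CTR of an audience mix: sum of share * per-segment CTR."""
    return sum(segment_share[s] * segment_ctr[s] for s in segment_share)


# Hypothetical per-segment click rates on a sports item.
TRUE_CTR = {"sports_fans": 0.30, "everyone_else": 0.05}

# Hypothetical audience mixes: the whole user base versus the users
# who were actually shown the item in the logged data.
POPULATION_SHARE = {"sports_fans": 0.20, "everyone_else": 0.80}
LOGGED_SHARE = {"sports_fans": 0.90, "everyone_else": 0.10}

observed = blended_ctr(LOGGED_SHARE, TRUE_CTR)
population = blended_ctr(POPULATION_SHARE, TRUE_CTR)

print(f"CTR seen in training logs: {observed:.3f}")
print(f"CTR across all users:      {population:.3f}")
```

A model trained on the logged mix sees a CTR far above what the item would earn if shown to the full user base.
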
⚠️ Warning: You cannot separate position from relevance using observational data alone. You must run randomized experiments to measure the position effect.
DATA COLLECTION STRATEGY
Log both the position shown and the probability of showing in that position (propensity). Without propensity, you cannot correct for bias later. Standard format: user, item, position, propensity score, action (click or not), timestamp.
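
One possible shape for such a log record is sketched below. The class name `ImpressionLog`, the field names, and the JSON encoding are assumptions for illustration, not a prescribed schema. The `ipw_weight` property shows one common use of the logged propensity (inverse propensity weighting), included here only as an example of a later correction step.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ImpressionLog:
    user_id: str
    item_id: str
    position: int      # 1-based rank at which the item was shown
    propensity: float  # probability the item was shown at this position
    clicked: bool
    timestamp: float   # Unix time in seconds

    def __post_init__(self):
        if self.position < 1:
            raise ValueError("position must be >= 1")
        if not 0.0 < self.propensity <= 1.0:
            raise ValueError("propensity must be in (0, 1]")

    def to_json(self) -> str:
        """Serialize the record as one JSON line for the impression log."""
        return json.dumps(asdict(self))

    @property
    def ipw_weight(self) -> float:
        """Inverse propensity weight: rarely shown placements count more."""
        return 1.0 / self.propensity


record = ImpressionLog(
    user_id="u1", item_id="i9", position=3,
    propensity=0.25, clicked=True, timestamp=1700000000.0,
)
print(record.to_json())
```

Validating the propensity at write time matters because a missing or zero value makes the impression unusable for any later reweighting.
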

💡 Key Takeaways

✓ Click = examination × relevance; position affects examination (95% at pos 1, 20% at pos 10), not relevance

✓ Same 10% CTR means 50% relevance at position 10 vs 10.5% at position 1 - huge difference

✓ Position curves: pos 2 is 70% of pos 1, pos 3 is 50%, pos 5 is 25%, pos 10 is 10%

✓ Selection bias compounds: sports fans see sports at top, model learns sports is universally popular

✓ Must log position and propensity for every impression to enable bias correction

📌 Interview Tips

1. Walk through the math: 10% CTR at a position with 20% examination = 50% true relevance

2. Describe the randomization experiment: same item, random positions, measure the CTR curve

3. Explain the logging requirements: user, item, position, propensity, click, timestamp
