Position Bias vs Selection Bias: Understanding the Difference
Position bias and selection bias are related but distinct problems in recommendation systems. Position bias specifically refers to the confounding effect of display position on engagement metrics: the same item gets different CTRs purely based on where it appears. Selection bias is the broader problem where your training data comes from a biased sample of all possible outcomes, preventing you from learning the true underlying distribution.
Selection bias manifests clearly in fraud detection systems like those at Stripe. When high-risk transactions are blocked, you never observe their ground-truth labels (whether they were actually fraudulent). Your model only trains on the "hard negatives" that slipped through and got labeled, creating a biased view that systematically underestimates certain fraud patterns. To combat this, production systems reserve a small counterfactual bucket (typically 0.1% to 1% of traffic) in which blocks are overridden, revealing true-positive and false-positive rates.
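A minimal sketch of such an override bucket, using deterministic hash-based bucketing; the function names, the 0.5% rate, and the score threshold are illustrative assumptions, not any particular production system's API:

```python
import hashlib

OVERRIDE_RATE = 0.005  # hypothetical rate, within the 0.1%-1% range cited above

def in_override_bucket(transaction_id: str) -> bool:
    """Deterministically map a small slice of traffic into the override bucket."""
    digest = int(hashlib.sha256(transaction_id.encode()).hexdigest(), 16)
    return (digest % 10_000) / 10_000 < OVERRIDE_RATE

def decide(transaction_id: str, fraud_score: float, block_threshold: float = 0.9):
    """Return (action, is_counterfactual). Overridden blocks are allowed through,
    so their eventual labels reveal the policy's true- and false-positive rates."""
    would_block = fraud_score >= block_threshold
    if would_block and in_override_bucket(transaction_id):
        return "allow", True  # log for counterfactual measurement
    return ("block" if would_block else "allow"), False
```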
In recommendations, both problems interact. Position bias causes selection bias in your training data: you primarily observe engagement on items that were ranked high historically. Post-click modeling helps with position bias specifically by conditioning on the user's deliberate selection (they clicked), which reduces but doesn't eliminate position confounding. However, training only on clicked items introduces selection bias, because you're ignoring the much larger set of non-clicked candidates.
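To make that trade-off concrete, here is a sketch of how a post-click training set is typically assembled (the field names are hypothetical):

```python
def post_click_dataset(impressions):
    """Post-click modeling sketch: condition on the click, predict a downstream
    outcome such as purchase or long watch. Position confounding shrinks because
    every example shares the user's deliberate selection, but the filter silently
    drops every non-clicked candidate: the selection bias described above."""
    return [
        (x["features"], x["converted"])  # label: conversion given click
        for x in impressions
        if x["clicked"]                  # the conditioning that biases the sample
    ]
```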
The key distinction: position bias is about confounding from display location; selection bias is about missing or censored observations. The solutions overlap but differ. For position bias you need visibility modeling and position-aware calibration. For selection bias you need exploration, propensity weighting, and careful negative sampling to observe counterfactuals your policy wouldn't naturally produce.
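On the propensity-weighting side, one standard recipe weights each clicked impression by the inverse of its estimated examination probability. A minimal sketch; the propensity table is made up, and real systems estimate it via position randomization or a click model:

```python
# Hypothetical P(examined | position), normally estimated from randomized logs.
POSITION_PROPENSITY = {1: 1.00, 2: 0.62, 3: 0.45, 4: 0.33, 5: 0.25}

def ipw_weight(clicked: bool, position: int, clip: float = 10.0) -> float:
    """Weight clicked impressions by 1/propensity so the training loss
    approximates a position-debiased objective; clipping bounds the variance
    that tiny propensities would otherwise introduce."""
    if not clicked:
        return 1.0
    propensity = POSITION_PROPENSITY.get(position, min(POSITION_PROPENSITY.values()))
    return min(1.0 / propensity, clip)
```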
💡 Key Takeaways
• Position bias is confounding from display location affecting engagement. Selection bias is training only on a non-representative sample of all possible outcomes.
• Stripe fraud systems demonstrate selection bias clearly: blocked transactions never reveal ground truth, requiring 0.1% to 1% override buckets for counterfactual measurement.
• Post-click labels (purchases, watch time) reduce position bias by conditioning on deliberate user choice, but introduce selection bias by ignoring non-clicked candidates entirely.
• Both problems interact in recommendations: position bias creates selection bias because your logs primarily contain engagement from historically top-ranked items.
• Solutions overlap but differ: position bias needs visibility modeling and per-position calibration (see the sketch after this list); selection bias needs exploration budgets and propensity weighting to observe counterfactuals.
• In advertising, selection bias appears because conversions are observed only for clicked ads; you never learn the counterfactual outcomes for users who weren't shown a given ad at all.
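The per-position calibration mentioned above can start as simply as measuring observed CTR at each position relative to position 1 and normalizing logged engagement by that multiplier. A toy sketch, assuming clean (position, clicked) impression logs:

```python
from collections import defaultdict

def fit_position_multipliers(logs):
    """logs: iterable of (position, clicked) pairs. Returns observed CTR per
    position relative to position 1, usable to normalize logged engagement
    before training or to compare scores on a position-adjusted scale."""
    clicks, views = defaultdict(int), defaultdict(int)
    for position, clicked in logs:
        views[position] += 1
        clicks[position] += int(clicked)
    if not views:
        return {}
    ctr = {p: clicks[p] / views[p] for p in views}
    base = ctr[1] if 1 in ctr else max(ctr.values())
    return {p: rate / base for p, rate in ctr.items()}
```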
📌 Examples
Fraud detection: A model trained only on the 99.9% of transactions that weren't blocked learns biased patterns. Allowing a 0.1% override bucket reveals that 15% of blocked transactions would have been legitimate, enabling better calibration.
YouTube recommendations: Training only on watched videos (post-click) ignores the 95% of videos that were impressed but not clicked. Without negative sampling or position correction, the model can't distinguish "not relevant" from "relevant but shown at position 20".
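A sketch of the negative-sampling fix for that failure mode, assuming impression logs with hypothetical clicked/features fields:

```python
import random

def build_training_set(impressions, neg_per_pos=4, seed=0):
    """Keep every click as a positive and sample impressed-but-not-clicked
    items as explicit negatives, so the model learns from candidates that
    were shown and ignored rather than only from historical winners."""
    rng = random.Random(seed)
    positives = [x for x in impressions if x["clicked"]]
    negatives = [x for x in impressions if not x["clicked"]]
    k = min(len(negatives), neg_per_pos * max(len(positives), 1))
    return positives + rng.sample(negatives, k)
```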
A/B testing with guardrails: If you automatically stop experiments that hurt metrics by more than 2%, you never observe the full distribution of treatment effects, biasing meta-analysis toward neutral or positive results.