
What Are the Critical Failure Modes in Bias Aware Ranking?

Failure Mode: Impression Logging Errors

Server-side impression logging treats every item returned by the API as seen by the user. In reality, users on infinite-scroll feeds rarely scroll past the first 5-10 items. If you log all 50 returned items as impressions, 40 or more of them become false negatives: the model learns that these unseen items are irrelevant, even though users never had a chance to consider them. The fix is client-side viewability tracking that logs an impression only when at least 50% of the item's pixels are visible for at least 1 second.
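A minimal sketch of that filter, assuming the client emits a hypothetical viewability event per item. The `ViewEvent` type, its field names, and the helper functions are invented for illustration; only the 50% and 1 second thresholds come from the text above.

```python
from dataclasses import dataclass

# Thresholds from the text: at least 50% of pixels visible for at least 1 second.
MIN_VISIBLE_FRACTION = 0.5
MIN_VISIBLE_MS = 1000


@dataclass(frozen=True)
class ViewEvent:
    """Hypothetical client-side viewability event (field names are assumptions)."""
    item_id: str
    visible_fraction: float  # smallest share of pixels on screen during the window
    visible_ms: int          # length of that continuous window in milliseconds


def viewable_impressions(returned_items, events):
    """Return the items from the API response that count as true impressions."""
    seen = {
        e.item_id
        for e in events
        if e.visible_fraction >= MIN_VISIBLE_FRACTION
        and e.visible_ms >= MIN_VISIBLE_MS
    }
    # Keep ranking order. Items without a qualifying event are dropped,
    # not logged as negatives.
    return [item for item in returned_items if item in seen]


def build_training_rows(returned_items, events, clicked):
    """Label only viewable items: 1 if clicked, 0 if seen but not clicked."""
    return [
        (item, 1 if item in clicked else 0)
        for item in viewable_impressions(returned_items, events)
    ]
```

Items the server returned but the client never reported as viewable produce no training row at all, which is the point: they are neither positives nor negatives.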

Failure Mode: Propensity Model Staleness

Propensity estimates are computed from historical data, but user behavior changes over time. New UI layouts change how far users scroll. Mobile and desktop users have different examination patterns. Seasonal changes affect engagement. If you trained propensities on data from 3 months ago, they may no longer reflect current user behavior. A propensity curve showing position 8 at 15% examination might now be 25% after a UI redesign. Using stale propensities means your IPS weights (1/propensity) are wrong: in this example the old weight of about 6.7 is roughly 67% larger than the correct weight of 4, which reintroduces the bias you tried to remove.
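The arithmetic behind the position 8 example can be checked with a short sketch. The function names and the 20% tolerance are assumptions made for illustration; the only fact relied on is that an IPS weight is the inverse of the examination propensity.

```python
def ips_weight(propensity: float) -> float:
    """Inverse propensity weight for a click at a given position."""
    if not 0.0 < propensity <= 1.0:
        raise ValueError("propensity must be in (0, 1]")
    return 1.0 / propensity


def weight_error(stale: float, current: float) -> float:
    """Relative error of the stale weight against the current one.

    Positive means the stale weight is too large, negative too small.
    """
    return ips_weight(stale) / ips_weight(current) - 1.0


def stale_positions(stale_curve, fresh_curve, tolerance=0.2):
    """Positions whose stale weight is off by more than `tolerance` (assumed 20%)."""
    return [
        pos
        for pos, p in stale_curve.items()
        if pos in fresh_curve and abs(weight_error(p, fresh_curve[pos])) > tolerance
    ]
```

With the figures from the text, `weight_error(0.15, 0.25)` comes out at about 0.67, so the stale weight is roughly 67% too large. Running `stale_positions` against a freshly estimated curve is one possible way to decide when a retrain is due.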

Failure Mode: Population Shift

Propensities are often estimated across all users, but different user segments scroll differently. Power users examine 20 positions. Casual users examine 3. If your traffic mix shifts toward casual users, average examination drops at lower positions. A single averaged propensity curve is then wrong for both groups: at lower positions it overstates examination for casual users, so their clicks get weights that are too low (under-correction), and understates it for power users, so their clicks get weights that are too high (over-correction). Segment-specific propensity estimation helps, but it adds complexity and requires enough data per segment.
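One way to handle the data requirement is to use a segment's own curve only when it has enough samples behind it and fall back to the global curve otherwise. The sketch below assumes the curves have already been estimated elsewhere; the segment names, the sample threshold, and the function names are hypothetical.

```python
from typing import Dict

Curve = Dict[int, float]  # position -> examination probability

MIN_SEGMENT_SAMPLES = 10_000  # assumed threshold, tune for your traffic


def pick_propensity(
    position: int,
    segment: str,
    segment_curves: Dict[str, Curve],
    segment_counts: Dict[str, int],
    global_curve: Curve,
    min_samples: int = MIN_SEGMENT_SAMPLES,
) -> float:
    """Use the segment curve when it is well supported, else the global one."""
    curve = segment_curves.get(segment)
    has_data = segment_counts.get(segment, 0) >= min_samples
    if curve is not None and has_data and position in curve:
        return curve[position]
    return global_curve[position]


def segment_ips_weight(position, segment, segment_curves, segment_counts,
                       global_curve, min_samples=MIN_SEGMENT_SAMPLES) -> float:
    """IPS weight (1 / propensity) using the segment-aware lookup above."""
    return 1.0 / pick_propensity(
        position, segment, segment_curves, segment_counts, global_curve, min_samples
    )
```

For instance, with illustrative position 8 propensities of 0.40 for power users, 0.05 for casual users, and 0.15 globally, a casual user's click is weighted at about 20 instead of about 6.7, and a power user's at 2.5.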

Warning: Bias correction that worked yesterday may fail today. Monitor propensity freshness, user segment shifts, and logging accuracy continuously.
💡 Key Takeaways
Server-side logging creates false negatives by treating unseen items as negative examples. Use client-side viewability tracking.
Stale propensity models fail after UI changes, device mix shifts, or seasonal behavior changes. Retrain propensities monthly.
User segment shifts break average propensities. Power users and casual users have different examination patterns.
All three failure modes can stay invisible in offline metrics while degrading production performance.
📌 Interview Tips
1. When discussing impression logging, emphasize the standard of 50% of pixels visible for 1 second (IAB viewability). This separates true impressions from server-side returns.
2. Explain propensity staleness with a concrete example: a UI redesign changes position 8 from 15% to 25% examination, so the old propensity is 40% too low and the old IPS weight (1/0.15 ≈ 6.7 versus 1/0.25 = 4) is about 67% too high.
3. Ask about user segments: power users examine 20 positions, casual users examine 3. Average propensities fail for both.