
MRR and Precision@K: When You Care About the First Correct Result

Core Concept
MRR (Mean Reciprocal Rank) measures where the first relevant result appears. Precision@K measures what fraction of top-K results are relevant. Both use binary relevance.

MRR: When Users Want One Answer

Reciprocal Rank (RR) is 1 divided by the position of the first relevant result. First relevant at position 1: RR = 1.0. Position 3: RR ≈ 0.33. Position 10: RR = 0.1. No relevant result in the top-K: RR = 0. MRR averages RR across queries; an MRR of 0.5 roughly corresponds to the first relevant result appearing at position 2 on average.
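A minimal sketch of the computation in Python (function names are my own, not from any particular library):

```python
def reciprocal_rank(relevances):
    """RR for one query: 1 / position of the first relevant result.

    relevances: binary list where relevances[i] = 1 means the result
    at rank i+1 is relevant. Returns 0.0 if nothing relevant appears.
    """
    for i, rel in enumerate(relevances):
        if rel:
            return 1.0 / (i + 1)
    return 0.0

def mean_reciprocal_rank(queries):
    """MRR: average RR across a list of per-query relevance lists."""
    return sum(reciprocal_rank(q) for q in queries) / len(queries)

# First relevant at positions 1, 3, and 10, matching the text:
print(mean_reciprocal_rank([[1, 0, 0], [0, 0, 1], [0] * 9 + [1]]))
# (1.0 + 0.333 + 0.1) / 3 ≈ 0.478
```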

Use MRR for navigational queries ("facebook login") where users want exactly one answer. Position 1 versus 2 matters enormously (RR drops from 1.0 to 0.5); position 5 versus 6 barely matters (0.20 versus roughly 0.17). The 1/position formula captures exactly this steep-then-flat weighting.

Precision@K: What Fraction of Results Are Good

Precision@K = relevant items in the top-K divided by K. If the top 10 contains 6 relevant items, Precision@10 = 0.6. Position within the top-K does not matter: [relevant, relevant, irrelevant] and [irrelevant, relevant, relevant] both score Precision@3 ≈ 0.67.
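A quick sketch, again with an invented function name, demonstrating that only the count matters:

```python
def precision_at_k(relevances, k):
    """Precision@K: fraction of the top-K results that are relevant.

    relevances: binary list ordered by rank, as in reciprocal_rank above.
    """
    return sum(relevances[:k]) / k

# Order within the top-K is ignored; both orderings score the same:
print(precision_at_k([1, 1, 0], 3))  # 0.667 (2/3)
print(precision_at_k([0, 1, 1], 3))  # 0.667 (2/3)
```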

Use Precision@K when users scan multiple results: image search, product listings. High Precision@10 means mostly relevant items without scrolling past garbage.

Choosing Between MRR, Precision, and NDCG

MRR: a single answer matters (navigational search, question answering). Precision@K: multiple results matter equally (product grids). NDCG: multiple results with different quality levels. In practice, teams track several metrics at once: NDCG for overall quality, MRR for navigational queries, Precision@K for coverage. The sketch below shows how the two binary metrics can disagree on the same ranking.
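Reusing the two functions sketched above on an invented ranking:

```python
# One query: the first relevant result is buried at position 4,
# but the rest of the top-10 is packed with relevant items.
ranking = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

print(reciprocal_rank(ranking))     # 0.25 -- poor for a navigational query
print(precision_at_k(ranking, 10))  # 0.7  -- fine for a browsing grid
```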

⚠️ Trade-off: MRR and Precision use binary relevance, ignoring quality gradations. If distinguishing "somewhat" from "highly" relevant matters, use NDCG.
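As a toy illustration (grades and lists invented here), binarizing graded judgments erases exactly the distinction NDCG is built to reward:

```python
def binarize(grades):
    """Collapse graded judgments to binary relevance (grade > 0 -> relevant)."""
    return [1 if g > 0 else 0 for g in grades]

# Graded judgments (0 = irrelevant, 1 = somewhat, 2 = highly relevant):
a = [2, 1, 0]   # highly relevant item ranked first
b = [1, 2, 0]   # somewhat relevant item ranked first

# Both collapse to [1, 1, 0], so binary metrics cannot tell them apart:
print(precision_at_k(binarize(a), 3) == precision_at_k(binarize(b), 3))  # True
```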
💡 Key Takeaways
- MRR = 1/position of first relevant result, averaged across queries. MRR 0.5 means first relevant at position 2 on average.
- Use MRR for navigational queries where users want one answer. Position 1 vs 2 matters greatly.
- Precision@K = relevant in top-K / K. Position within the top-K does not matter, only the count.
- Use Precision@K for multiple equally relevant results: product grids, image galleries.
- Binary limitation: MRR and Precision@K cannot distinguish somewhat relevant from highly relevant.
📌 Interview Tips
1. Walk through MRR: position 1 = 1.0, position 3 ≈ 0.33, position 10 = 0.1. Average across queries.
2. Choose the metric by use case: MRR for single-answer queries, Precision@K for equally relevant results, NDCG for graded relevance.
3. Note the binary limitation: these metrics treat all relevant items the same regardless of quality.