
Choosing Precision@K vs NDCG@K: When to Use Each

Core Concept
Coverage metrics measure how much of your catalog gets recommended. High engagement concentrated on a narrow subset is not success if 80% of inventory never gets exposure. Coverage balances personalization with catalog health.

Types of Coverage

Catalog coverage: Percentage of items that received at least one impression in a time period. If 10M items exist and 2M got impressions this week, coverage = 20%. Target depends on business: e-commerce wants 50-80% (inventory turnover), streaming content might accept 30-50% (long tail matters less).
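The catalog-coverage calculation above is a single ratio; a minimal sketch (function name and inputs are illustrative assumptions, not from the source):

```python
def catalog_coverage(impressed_item_ids, catalog_size):
    """Fraction of the catalog that received at least one impression
    in the measurement window (e.g., one week)."""
    return len(set(impressed_item_ids)) / catalog_size

# The text's example: 10M-item catalog, 2M distinct items impressed -> 20%.
impressions = range(2_000_000)  # stand-in for logged item IDs
print(f"{catalog_coverage(impressions, 10_000_000):.0%}")  # -> 20%
```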

User coverage: For a typical user, what fraction of the relevant catalog do they see over time? Low user coverage creates filter bubbles where users see only a narrow slice. Track the average number of categories exposed per user session.
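The per-session category tracking mentioned above can be computed from exposure logs; a sketch, assuming a hypothetical log format of (user_id, session_id, category) rows:

```python
from collections import defaultdict

def avg_categories_per_session(session_logs):
    """Average number of distinct categories shown per user session.
    session_logs: iterable of (user_id, session_id, category) tuples."""
    seen = defaultdict(set)  # (user, session) -> categories exposed
    for user, session, category in session_logs:
        seen[(user, session)].add(category)
    return sum(len(cats) for cats in seen.values()) / len(seen)

logs = [("u1", "s1", "shoes"), ("u1", "s1", "bags"), ("u2", "s1", "shoes")]
print(avg_categories_per_session(logs))  # (2 + 1) / 2 = 1.5
```

A falling trend in this number is an early signal of filter-bubble concentration, even while click metrics look healthy.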

Coverage vs Relevance Trade-off

Maximizing relevance typically hurts coverage. The safest predictions are popular items everyone likes, creating a rich-get-richer loop. Long-tail items with few interactions have uncertain relevance, so conservative models avoid them.

Set explicit coverage targets as constraints. Example: "Maintain 60% weekly catalog coverage while maximizing NDCG." Optimization must balance both. Without coverage constraints, models converge to showing the same 1000 popular items to everyone.
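One simple way to enforce a coverage guardrail is at re-ranking time: reserve a few slots per list for items outside the popular head. This is a hypothetical sketch (the function, the slot count, and the head/tail split are illustrative assumptions, not the source's method):

```python
def rerank_with_coverage(scored_items, popular_ids, k=10, tail_slots=2):
    """Re-rank by relevance score, but guarantee `tail_slots` positions
    for long-tail items (those not in the popular head).
    scored_items: list of (item_id, relevance_score) pairs."""
    ranked = sorted(scored_items, key=lambda x: -x[1])
    # Best-scoring long-tail items get reserved slots.
    tail = [it for it in ranked if it[0] not in popular_ids][:tail_slots]
    chosen = {it[0] for it in tail}
    # Fill the remaining slots from the overall ranking.
    rest = [it for it in ranked if it[0] not in chosen][: k - len(tail)]
    return sorted(rest + tail, key=lambda x: -x[1])[:k]

items = [("pop1", 0.9), ("pop2", 0.8), ("pop3", 0.7),
         ("t1", 0.5), ("t2", 0.4), ("t3", 0.3)]
rerank_with_coverage(items, {"pop1", "pop2", "pop3"}, k=4, tail_slots=2)
# t1 and t2 make the list even though pop3 scores higher
```

Production systems usually do this with a constrained optimizer or a diversity term in the objective rather than hard slots, but the slot version makes the relevance-coverage tension concrete.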

⚠️ Interview Pattern: When discussing recommendation metrics, mention coverage alongside Precision and NDCG. This shows systems-level thinking: you understand business health (inventory turnover, new item discovery) not just model accuracy. Ask: "What percentage of catalog gets recommended weekly?" Interviewers notice this maturity.
💡 Key Takeaways
Binary labels with small K (5-10 items visible): Precision@K is intuitive and directly measures relevant fraction in viewport.
Graded labels and scrolling UI (10-50 items): NDCG@K captures position effects and engagement intensity across the full list.
Coverage metrics serve as guardrails against concentration: catalog coverage, creator/seller coverage, category distribution.
K value should match viewport: mobile feeds show 5-8 items, desktop grids show 12-20; K beyond visible area has diminishing relevance.
Multi-objective optimization: balance Precision/NDCG (relevance) with Coverage (diversity) using constraints or weighted objectives.
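The two relevance metrics in the takeaways above have standard definitions; a minimal sketch (`rels` is the graded relevance of each item in ranked order, 0/1 for binary labels):

```python
import math

def precision_at_k(rels, k):
    """Fraction of the top-k items that are relevant (relevance > 0)."""
    return sum(1 for r in rels[:k] if r > 0) / k

def dcg_at_k(rels, k):
    """Discounted cumulative gain: relevance discounted by log2 of position."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

def ndcg_at_k(rels, k):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

rels = [3, 0, 2, 1, 0]           # graded relevance in ranked order
print(precision_at_k(rels, 5))   # 0.6: three of five items are relevant
print(ndcg_at_k(rels, 5))        # < 1.0: the relevant items are not ranked ideally
```

Note how Precision@K ignores both ordering and grade, while NDCG@K penalizes placing the grade-2 item below an irrelevant one.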
📌 Interview Tips
1. When asked which metric to use: explain Precision@K for binary outcomes (buy/not buy), NDCG for ranked lists with graded relevance, and Coverage as a guardrail against concentration.
2. For K value selection: match K to the viewport. Mobile feeds show 5-8 items, desktop grids show 12-20; K beyond the visible area has diminishing practical relevance.
3. When discussing trade-offs: explain that optimizing NDCG aggressively may hurt Coverage; multi-objective optimization or guardrail constraints are common solutions.