Online vs Offline Hard Negative Mining Architecture
Production systems combine online and offline mining to balance compute efficiency with negative quality. Online mining selects negatives within the current mini-batch during the training step. With data-parallel training across 8 GPUs and a global batch size of 2048 pairs, each anchor can treat all 2047 other examples in the batch as potential negatives via cross-replica gathering. The pairwise similarity matrix is computed once in O(batch_size²) time, then masked to select valid negatives. Batch-hard mining picks the closest negative per anchor, while semi-hard mining restricts the choice to negatives that are farther from the anchor than its positive, avoiding extremely hard (often noisy) cases. This approach is compute-friendly because it reuses already-computed embeddings without extra forward passes.
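A minimal NumPy sketch of in-batch mining under these definitions (embeddings assumed L2-normalized so cosine distance is 1 − dot product; the function name and fallback behavior are illustrative, not from any specific library):

```python
import numpy as np

def mine_in_batch(embeddings, labels, mode="hard"):
    """Select one negative index per anchor from the current batch.

    embeddings: (B, D) L2-normalized vectors; labels: (B,) ints.
    mode="hard" picks the closest negative; mode="semi-hard" picks the
    closest negative still farther than the anchor's nearest positive,
    falling back to batch-hard when no such negative exists.
    """
    B = embeddings.shape[0]
    # One O(B^2) similarity matrix, reused for every anchor.
    sim = embeddings @ embeddings.T
    dist = 1.0 - sim                       # cosine distance
    same = labels[:, None] == labels[None, :]

    neg_idx = np.empty(B, dtype=int)
    for a in range(B):
        neg_mask = ~same[a]
        pos_mask = same[a].copy()
        pos_mask[a] = False                # exclude the anchor itself
        d_pos = dist[a][pos_mask].min() if pos_mask.any() else 0.0

        cand = neg_mask
        if mode == "semi-hard":
            semi = neg_mask & (dist[a] > d_pos)
            if semi.any():                 # fall back to batch-hard otherwise
                cand = semi
        neg_idx[a] = int(np.argmin(np.where(cand, dist[a], np.inf)))
    return neg_idx
```

In a real data-parallel setup, `embeddings` would be the cross-replica-gathered global batch rather than a single GPU's shard.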
Offline mining uses a prior model snapshot or retrieval system to precompute a pool of hard-negative candidates. For a catalog of 100 million items, an offline job encodes the corpus with the previous checkpoint, builds or refreshes an Approximate Nearest Neighbor (ANN) index, then retrieves the top-200 nearest items with different labels for each anchor. These candidates are stored with a time-to-live (TTL) of 24 to 72 hours. During training, each batch samples 2 to 4 mined negatives per anchor from this pool. Offline mining finds better negatives across the full dataset, not just within a batch, capturing confusable items that rarely co-occur in the same mini-batch.
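The offline step can be sketched as follows. This uses a brute-force similarity scan as a stand-in for the ANN query (a production job would hit an HNSW or similar index built from the previous checkpoint's embeddings); the function names and the sampling helper are illustrative:

```python
import numpy as np

def mine_offline_candidates(corpus_emb, corpus_labels,
                            anchor_emb, anchor_labels, top_k=200):
    """Precompute a hard-negative pool per anchor: nearest corpus items
    with a *different* label, by descending cosine similarity."""
    sims = anchor_emb @ corpus_emb.T              # (A, N), unit vectors assumed
    pools = []
    for a in range(anchor_emb.shape[0]):
        # Mask out same-label items before ranking.
        scores = np.where(corpus_labels != anchor_labels[a], sims[a], -np.inf)
        order = np.argsort(-scores)
        order = order[np.isfinite(scores[order])]  # drop masked items entirely
        pools.append(order[:top_k])
    return pools

# At train time, draw 2-4 mined negatives per anchor from the stored pool.
def sample_from_pool(pool, n=2, rng=None):
    rng = rng or np.random.default_rng()
    return rng.choice(pool, size=min(n, len(pool)), replace=False)
```

The pools would then be written to storage with a TTL and refreshed when the job re-runs on a newer checkpoint.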
The trade-off is infrastructure complexity versus negative quality. Online mining requires no extra storage or indexing but is limited to batch diversity: if batches are small (under 256) or poorly shuffled, anchors see few meaningful negatives. Offline mining requires periodic re-encoding, index rebuilding, and storage for candidate pools, adding latency and operational overhead. However, it surfaces negatives from tail classes or rare confusions that online mining misses. Staleness is a key failure mode: as the model drifts during training, yesterday's hard negatives become less relevant. Setting a TTL of 24 to 72 hours and refreshing pools regularly mitigates this.
Many systems layer both strategies. Spotify-style dual encoders use in-batch negatives for base coverage plus offline-mined candidates from skip logs for specific confusions. Pinterest product search combines random negatives (30%), in-batch negatives (50%), and offline-mined candidates (20%) in a curriculum that increases hardness over training. Memory queues like MoCo extend online mining by maintaining 32,000 to 128,000 embeddings from recent batches, giving each anchor access to 60,000+ negatives without the cost of offline indexing.
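The memory-queue idea reduces to a fixed-size FIFO ring buffer of recent-batch embeddings. A minimal sketch (queue size shrunk for illustration; MoCo-scale systems would use 32k-128k slots, and the class name is hypothetical):

```python
import numpy as np

class NegativeQueue:
    """FIFO ring buffer of embeddings from recent batches.

    Each new batch is enqueued after its training step; the oldest
    embeddings are overwritten once the buffer is full, so every anchor
    sees up to `queue_size` extra negatives beyond its own batch.
    """
    def __init__(self, dim, queue_size=8):
        self.buf = np.zeros((queue_size, dim))
        self.size = queue_size
        self.ptr = 0        # next slot to overwrite
        self.filled = 0     # how many slots hold real embeddings

    def enqueue(self, batch_emb):
        for e in batch_emb:
            self.buf[self.ptr] = e
            self.ptr = (self.ptr + 1) % self.size
            self.filled = min(self.filled + 1, self.size)

    def negatives(self):
        """All currently stored embeddings, usable as extra negatives."""
        return self.buf[: self.filled].copy()
```

In MoCo proper the enqueued embeddings come from a slowly updated momentum encoder so that queue entries stay roughly consistent with the current model; that detail is omitted here.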
💡 Key Takeaways
•Online mining with a global batch of 2048 provides 2047 in-batch negatives per anchor with no extra forward passes, but is limited by batch diversity and shuffling quality
•Offline mining retrieves top-200 candidates from a 100-million-item corpus using the previous checkpoint, finding rare confusions that never co-occur in the same batch
•A staleness window of 24 to 72 hours balances freshness against compute cost. Longer windows risk training against outdated model errors as parameters drift
•Memory queues like MoCo maintain 32,000 to 128,000 embeddings from recent batches, extending effective negatives to 60,000+ without offline indexing overhead
•Production systems mix sources: Pinterest uses 30% random, 50% in-batch, and 20% offline-mined negatives in a curriculum that increases the offline proportion over training
•Infrastructure trade-off: online mining adds no storage or latency but misses tail cases; offline mining requires daily re-encoding jobs and ANN index refreshes costing hours of compute
📌 Examples
Spotify dual-encoder training: 8 GPUs, batch 256 per GPU, global batch 2048. A cross-replica gather gives 2047 in-batch negatives; additionally, 2 offline-mined negatives from skip logs are sampled per anchor, for a total of 2049 negatives per step.
Pinterest product search offline miner: a daily job encodes 100 million products using yesterday's checkpoint, builds an HNSW index with 32 connections per layer, retrieves the top-200 visually similar products with different category labels per anchor, and stores the candidates in a distributed cache with a 48-hour TTL.
Face recognition system: online semi-hard mining within a batch of 1024 face images. No offline mining, because the full dataset is 500 million images, making daily re-encoding prohibitively expensive. Instead, stratified sampling ensures each batch has good intra-batch diversity.
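The Pinterest-style curriculum can be made concrete as a schedule over negative-source weights. The 30/50/20 starting ratios come from the example above; the linear schedule shifting weight from random to offline-mined negatives is an illustrative assumption, not a documented Pinterest recipe:

```python
def negative_mix(step, total_steps):
    """Hypothetical curriculum over negative sources.

    Starts at 30% random / 50% in-batch / 20% offline-mined and
    linearly moves the random share onto offline-mined negatives as
    training progresses, increasing overall hardness.
    """
    t = step / total_steps
    random_w = 0.30 * (1.0 - t)           # decays to 0
    offline_w = 0.20 + 0.30 * t           # grows to 0.5
    in_batch_w = 1.0 - random_w - offline_w
    return {"random": random_w, "in_batch": in_batch_w, "offline": offline_w}
```

At each step these weights would determine how many of an anchor's negatives are drawn from each source (e.g. via multinomial sampling over the three pools).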