What is Hard Negative Mining?
WHY HARD NEGATIVES MATTER
Random negative sampling is easy but uninformative. If the model learns to distinguish "Nike running shoes" from "medieval castle photos," it has not learned anything useful. These easy negatives are so different that the model solves the task without learning nuance.
Hard negatives force the model to learn subtle distinctions. "Nike running shoes" vs "Adidas running shoes" teaches brand differences. "Nike running shoes" vs "Nike basketball shoes" teaches category differences. These hard examples are where the learning happens.
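The intuition above can be made concrete with a margin-based triplet loss: an easy negative typically scores so far below the positive that the loss is zero and contributes no gradient, while a hard negative sits inside the margin and actually drives learning. This is a minimal sketch on cosine similarities; the vectors and the 0.2 margin are illustrative, not from the text.

```python
import numpy as np

def triplet_margin_loss(query, positive, negative, margin=0.2):
    """Penalize when the negative scores within `margin` of the positive
    (cosine similarity). Zero loss means the triplet teaches nothing."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(0.0, margin - cos(query, positive) + cos(query, negative))

rng = np.random.default_rng(0)
query = rng.normal(size=8)
positive = query + 0.1 * rng.normal(size=8)   # close to the query
easy_neg = rng.normal(size=8)                 # unrelated direction
hard_neg = query + 0.3 * rng.normal(size=8)   # confusably close

# An easy negative usually lands outside the margin (loss 0, no gradient);
# a hard negative lands inside it and produces a nonzero loss.
print(triplet_margin_loss(query, positive, easy_neg))
print(triplet_margin_loss(query, positive, hard_neg))
```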
THE MINING PROCESS
For each positive pair (query, relevant item), find items that are similar but should not be retrieved. Selection strategies include:
In-batch negatives: Use the other positives in the batch as negatives for each query. Simple and cheap, but limited: a randomly assembled batch may not contain truly hard examples.
ANN mining: Query the embedding index for nearest neighbors that are known negatives (from labels). Finds semantically similar items that should be distinguished.
Top-K mining: Take the model's top-K predictions and label the incorrect ones as hard negatives for the next round of training.
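The ANN-mining strategy can be sketched as follows, using brute-force cosine search in place of a real approximate index (the function name and the set-of-positives label format are illustrative assumptions):

```python
import numpy as np

def mine_hard_negatives(query_vecs, item_vecs, positive_ids, k=5):
    """For each query, return the k most similar items that are NOT
    labeled positive for it -- semantically close items the model
    should learn to push away."""
    # Normalize rows so dot products are cosine similarities.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    x = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)
    sims = q @ x.T                        # (num_queries, num_items)
    hard = []
    for qi, pos in enumerate(positive_ids):
        order = np.argsort(-sims[qi])     # most similar first
        hard.append([i for i in order if i not in pos][:k])
    return hard
```

In production the `sims = q @ x.T` step would be replaced by a query against an ANN index (e.g. FAISS or ScaNN) over the same embeddings, since exact all-pairs similarity does not scale to large catalogs.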
IMPACT ON MODEL QUALITY
Hard negative mining typically improves recall@K by 5-15% compared to random negatives. The improvement is larger when the embedding space has many near-duplicates or confusable items.
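For reference, recall@K is the fraction of a query's relevant items that appear in the model's top-K results. A minimal implementation of the metric (names are illustrative):

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant items found in the top-k of a ranked list."""
    hits = sum(1 for item in ranked_ids[:k] if item in relevant_ids)
    return hits / len(relevant_ids)

# Two relevant items, one retrieved in the top 2.
print(recall_at_k([3, 1, 2], {1, 5}, k=2))
```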