
Failure Modes and Edge Cases in Production Object Detection

Small Object Detection Failures

Objects smaller than 32x32 pixels are notoriously difficult to detect. On low-resolution feature maps, a small object occupies just a few pixels, providing insufficient signal for reliable detection. Accuracy on small objects can be 20-30% lower than on large objects.

Symptoms: High recall on large objects, dramatically lower recall on small objects. Bounding boxes on small detections are imprecise.

Mitigation: Use higher resolution input images. Add detection heads at earlier (higher resolution) feature map stages. Apply multi-scale testing at inference time.
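The multi-scale testing idea can be sketched in a few lines: run the detector at several input scales, map the boxes back to the original resolution, and merge duplicates with non-maximum suppression. This is a minimal NumPy sketch; the `detector` callable is a hypothetical stand-in for whatever model you are serving, assumed to return boxes in the scaled image's coordinates.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; boxes are [x1, y1, x2, y2]."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        if order.size == 1:
            break
        rest = order[1:]
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]
    return keep

def multi_scale_detect(image, detector, scales=(1.0, 1.5, 2.0)):
    """Run the detector at several input scales, map boxes back to the
    original resolution, and merge overlaps with NMS. Upscaled inputs
    give small objects more pixels on the feature map."""
    all_boxes, all_scores = [], []
    for s in scales:
        boxes, scores = detector(image, scale=s)  # boxes in scaled coords
        all_boxes.append(boxes / s)               # back to original coords
        all_scores.append(scores)
    boxes = np.concatenate(all_boxes)
    scores = np.concatenate(all_scores)
    keep = nms(boxes, scores)
    return boxes[keep], scores[keep]
```

The cost is one forward pass per scale, so this is usually reserved for offline evaluation or latency-insensitive pipelines.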

Occlusion and Truncation

Partially visible objects confuse detectors. A person half-hidden behind a car may not be detected at all. Truncated objects at image borders often receive incorrect bounding boxes or are missed entirely.

Symptoms: Low recall in crowded scenes. False negatives concentrated near scene edges and behind obstacles.

Mitigation: Include heavily occluded examples in training data with appropriate labels. Use soft labels for ambiguous cases. Apply specialized loss functions that handle partial visibility.
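Soft labels for ambiguous boxes can be implemented with a binary cross-entropy that accepts fractional targets. A minimal NumPy sketch, assuming a sigmoid classification head; the 0.7 target for a half-occluded object is an illustrative value, not a standard constant:

```python
import numpy as np

def soft_label_bce(logits, targets, eps=1e-7):
    """Binary cross-entropy with fractional targets. The loss minimum sits
    at p == target, so the model is not pushed to full confidence on
    ambiguous (e.g. heavily occluded) boxes."""
    p = 1.0 / (1.0 + np.exp(-logits))   # sigmoid
    p = np.clip(p, eps, 1.0 - eps)
    return -(targets * np.log(p) + (1.0 - targets) * np.log(1.0 - p))

# Fully visible object keeps a hard 1.0 label; a half-occluded one
# gets a soft 0.7 (illustrative choice).
losses = soft_label_bce(np.array([2.0, 2.0]), np.array([1.0, 0.7]))
```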

Class Imbalance

Background regions vastly outnumber object regions. In a typical image, 99%+ of anchor boxes are background. Without correction, the model learns to predict everything as background.

Mitigation: Focal loss down-weights easy negatives (confident background predictions) and focuses learning on hard examples. Hard negative mining explicitly samples difficult background regions. Both techniques are standard in modern detectors.
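Focal loss (Lin et al., 2017) is compact enough to show directly. It scales standard cross-entropy by (1 - p_t)^gamma, so confident, easy background anchors contribute almost nothing while hard examples dominate the gradient. A NumPy sketch with the paper's default alpha=0.25, gamma=2:

```python
import numpy as np

def focal_loss(probs, targets, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss: cross-entropy scaled by (1 - p_t)^gamma.
    probs are predicted foreground probabilities; targets are 0/1."""
    probs = np.clip(probs, eps, 1.0 - eps)
    p_t = np.where(targets == 1, probs, 1.0 - probs)        # prob of true class
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)    # class weighting
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# Easy background anchor: foreground prob 0.01 -> near-zero loss.
easy = focal_loss(np.array([0.01]), np.array([0]))[0]
# Hard background anchor: foreground prob 0.6 -> large loss.
hard = focal_loss(np.array([0.6]), np.array([0]))[0]
```

With gamma=0 and alpha=0.5 this reduces to (half of) ordinary cross-entropy; increasing gamma sharpens the down-weighting of easy examples.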

Domain Shift

Models trained on curated datasets struggle with real-world messiness. Rain, fog, motion blur, unusual camera angles, and lighting conditions absent from training data cause silent accuracy drops of 10-30%.
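One common mitigation is to inject production-like corruption into training via augmentation. A minimal sketch of two such transforms on a grayscale float image in [0, 1]; the kernel size and jitter ranges are illustrative assumptions, not tuned values:

```python
import numpy as np

def motion_blur(image, kernel_size=7):
    """Horizontal motion blur via a 1-D box filter, a cheap stand-in
    for camera shake. image is a 2-D float array in [0, 1]."""
    k = np.ones(kernel_size) / kernel_size
    pad = kernel_size // 2
    padded = np.pad(image, ((0, 0), (pad, pad)), mode="reflect")
    return np.stack([np.convolve(row, k, mode="valid") for row in padded])

def photometric_jitter(image, rng):
    """Random brightness/contrast shift to mimic lighting variation."""
    brightness = rng.uniform(-0.2, 0.2)
    contrast = rng.uniform(0.8, 1.2)
    return np.clip(image * contrast + brightness, 0.0, 1.0)
```

Augmentation only covers shifts you can anticipate and simulate; monitoring production confidence distributions is still needed to catch the ones you cannot.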

💡 Key Takeaways
- Small object accuracy can be 20-30% lower than large objects due to insufficient feature resolution
- Occlusion and truncation cause missed detections - include partially visible examples in training
- 99%+ of anchor boxes are background - focal loss and hard negative mining address this imbalance
- Domain shift from training data to production causes 10-30% accuracy drops silently
📌 Interview Tips
1. When discussing failure modes, mention small objects first - this is where most detectors struggle
2. Explain focal loss as attention redistribution - it makes the model focus on hard examples