Fraud Detection & Anomaly Detection: Handling Imbalanced Data (SMOTE, Class Weighting, Focal Loss)

Failure Modes and Edge Cases in Imbalanced Data Handling

SMOTE Boundary Violations

SMOTE interpolates blindly in feature space. If minority samples lie near the decision boundary, synthetic samples may actually fall in majority-class regions. The model then learns incorrect class assignments, hurting rather than helping performance. Borderline-SMOTE mitigates this by oversampling only "danger" minority samples, those whose k nearest neighbors are a mix of minority and majority points, while skipping samples whose neighbors are almost all majority (likely noise) and samples deep inside the minority region (already safe).

Warning: Always validate SMOTE on a holdout set with natural distribution. If validation metrics drop after applying SMOTE, the synthetic samples are likely introducing noise.
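To see why boundary violations happen, here is a minimal sketch of the core SMOTE step in plain NumPy (`smote_interpolate` is a hypothetical helper, not the imbalanced-learn API). Note that nothing in the interpolation consults majority-class geometry: if a minority sample's nearest minority neighbor sits on the far side of the decision boundary, the synthetic point can land in majority territory.

```python
import numpy as np

rng = np.random.default_rng(0)

def smote_interpolate(X_minority, k=5, n_synthetic=100):
    """Minimal SMOTE sketch: draw a synthetic point on the line segment
    between a random minority sample and one of its k nearest minority
    neighbors. Hypothetical helper for illustration only."""
    n = len(X_minority)
    # pairwise distances within the minority class only
    d = np.linalg.norm(X_minority[:, None] - X_minority[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)           # a sample is not its own neighbor
    neighbors = np.argsort(d, axis=1)[:, :k]
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.integers(n)
        j = neighbors[i, rng.integers(k)]
        gap = rng.random()                # interpolation factor in [0, 1)
        synthetic.append(X_minority[i] + gap * (X_minority[j] - X_minority[i]))
    return np.array(synthetic)

X_min = rng.normal(loc=2.0, scale=0.5, size=(20, 2))
X_syn = smote_interpolate(X_min, k=3, n_synthetic=50)
print(X_syn.shape)  # (50, 2)
```

Every synthetic point is a convex combination of two minority samples, which is exactly the problem: convexity is a statement about the minority point cloud, not about where the true boundary lies.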

Extreme Class Weights

With 1:10000 imbalance, naive class weighting assigns 10000x weight to minority errors. This causes gradient explosion—single minority examples dominate entire batches. Gradients become unstable, loss oscillates, model fails to converge. Cap weights at 100-1000x maximum, or use focal loss which naturally limits extreme gradients.
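Both mitigations can be sketched in a few lines of NumPy (the helper names below are assumptions, not a library API). The weight cap truncates inverse-frequency weights; focal loss instead multiplies cross-entropy by (1 - p_t)^gamma, which shrinks the loss (and gradient) on easy, confidently-classified examples so no single example can dominate a batch.

```python
import numpy as np

def capped_class_weights(y, cap=100.0):
    """Inverse-frequency class weights clipped at `cap` (hypothetical helper).
    With 1:10000 imbalance the raw minority weight would be ~10000x;
    the cap keeps single examples from dominating a batch."""
    classes, counts = np.unique(y, return_counts=True)
    raw = counts.max() / counts  # inverse-frequency weighting
    return {int(c): float(min(r, cap)) for c, r in zip(classes, raw)}

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).
    The (1 - p_t)^gamma factor down-weights easy examples, bounding
    gradients without an explicit weight cap."""
    p_t = np.where(y == 1, p, 1 - p)
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(np.clip(p_t, 1e-12, 1.0))

y = np.array([0] * 9999 + [1])
print(capped_class_weights(y))  # {0: 1.0, 1: 100.0}, not {0: 1.0, 1: 9999.0}
```

In practice the capped weights plug into most frameworks directly (e.g. the `class_weight` argument accepted by scikit-learn estimators and Keras `fit`).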

Label Noise Amplification

Minority class labels are often noisier—fraud cases may be mislabeled due to incomplete investigation. Class weighting amplifies this noise. A mislabeled minority example with 1000x weight causes 1000x more damage than a mislabeled majority example. Clean minority class labels carefully before applying weighting techniques.

Detection Tip: Monitor loss curves for individual classes. If minority class loss spikes or oscillates while majority class loss is stable, weights may be too extreme or labels may be noisy.
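A cheap way to implement this diagnostic is to split the loss by true class each epoch. The sketch below (`per_class_log_loss` is a hypothetical helper) computes mean binary cross-entropy per class; logging these two numbers separately is what makes a minority-class spike visible, since the aggregate loss is dominated by the majority class.

```python
import numpy as np

def per_class_log_loss(p, y):
    """Mean binary cross-entropy split by true class label.
    Log both values per epoch; divergence between them is the signal."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return {int(c): float(loss[y == c].mean()) for c in np.unique(y)}

# toy batch: majority class (0) well fit, minority class (1) poorly fit
y = np.array([0] * 8 + [1] * 2)
p = np.array([0.05] * 8 + [0.2, 0.3])
print(per_class_log_loss(p, y))  # minority-class loss far above majority
```

If the class-1 value spikes or oscillates across epochs while the class-0 value stays flat, suspect extreme weights or noisy minority labels before suspecting the architecture.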

Evaluation on Wrong Distribution

Evaluating on rebalanced test sets gives misleading metrics. A model with 90% precision on a 1:1 test set might have under 5% precision on the natural 1:1000 distribution, because false positives from the vast majority class swamp the true positives. (Recall is prevalence-invariant for a fixed model; precision and accuracy are not.) Always evaluate on the distribution the model will see in production. Rebalancing is only for training, never for evaluation.
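The size of this gap can be worked out from a fixed ROC operating point. For true positive rate TPR, false positive rate FPR, and positive-class prevalence pi, precision = TPR·pi / (TPR·pi + FPR·(1 - pi)). The function name below is illustrative:

```python
def precision_at_prevalence(tpr, fpr, pi):
    """Precision implied by a fixed operating point (tpr, fpr) at
    positive-class prevalence pi. TPR (= recall) does not change with
    prevalence; precision does."""
    return (tpr * pi) / (tpr * pi + fpr * (1 - pi))

tpr, fpr = 0.90, 0.05
print(round(precision_at_prevalence(tpr, fpr, 0.5), 3))    # 0.947 on a 1:1 test set
print(round(precision_at_prevalence(tpr, fpr, 0.001), 3))  # 0.018 at natural 1:1000
```

The same classifier, with the same recall, goes from ~95% precision on a balanced test set to under 2% in production, which is why the rebalanced number is meaningless as a deployment estimate.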

💡 Key Takeaways
- SMOTE may create synthetic samples in majority-class regions if minority samples lie near the decision boundary
- Cap class weights at 100-1000x maximum to prevent gradient explosion from extreme imbalance
- Class weighting amplifies label noise; clean minority-class labels carefully before applying weighting
📌 Interview Tips
1. Monitor per-class loss curves: spikes or oscillation in the minority-class loss indicate weights that are too extreme or noisy labels
2. Always evaluate on the natural distribution; rebalanced test sets give misleading metrics