Failure Modes and Edge Cases in Imbalanced Data Handling
SMOTE Boundary Violations
SMOTE interpolates blindly in feature space. If minority samples lie near the decision boundary, synthetic points generated between them can fall inside majority-class territory, so the model learns incorrect class assignments and performance degrades rather than improves. Borderline-SMOTE addresses this by identifying borderline minority samples (those whose nearest neighbors are mostly, but not entirely, majority class) and interpolating them only with other minority samples, instead of oversampling the minority class uniformly.
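The failure mode is easy to reproduce with a minimal numpy sketch of the interpolation step (a hypothetical helper, not the imbalanced-learn implementation): two minority points on opposite sides of a majority cluster produce synthetic points inside it.

```python
import numpy as np

def smote_interpolate(X_min, n_synthetic, k=3, rng=None):
    # Naive SMOTE step: pick a random minority point, pick one of its
    # k nearest minority neighbors, interpolate between the two.
    rng = np.random.default_rng(rng)
    out = []
    for _ in range(n_synthetic):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(d)[1:k + 1]  # exclude the point itself
        j = rng.choice(neighbors)
        lam = rng.random()                  # interpolation factor in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

# Two minority points straddling a majority cluster centered near x = 5:
X_min = np.array([[0.0], [10.0]])
synthetic = smote_interpolate(X_min, n_synthetic=5, k=1, rng=0)
# Every synthetic point lies on the segment between the two minority
# points, so some land inside the majority region around x = 5.
```

Because the interpolation never consults majority-class geometry, nothing prevents synthetic points from landing in majority territory; Borderline-SMOTE's neighbor filtering is one way to constrain where interpolation happens.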
Warning: Always validate SMOTE-trained models on a holdout set drawn from the natural class distribution. If validation metrics drop after applying SMOTE, the synthetic samples are likely introducing noise.
Extreme Class Weights
With 1:10000 imbalance, naive inverse-frequency class weighting assigns 10000x weight to minority errors. Single minority examples then dominate entire batches: gradients spike, the loss oscillates, and the model fails to converge. Cap weights at roughly 100-1000x, or use focal loss, which down-weights easy examples and naturally limits extreme gradients.
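Both remedies fit in a few lines of numpy. A minimal sketch, assuming a binary problem; the cap of 100 and gamma of 2 are illustrative defaults, not prescriptions:

```python
import numpy as np

def capped_class_weights(y, cap=100.0):
    # Balanced inverse-frequency weights, clipped at `cap` so a rare
    # class cannot dominate batch gradients. (cap=100 is an assumed
    # default; tune per problem.)
    classes, counts = np.unique(y, return_counts=True)
    raw = counts.sum() / (len(classes) * counts)
    return dict(zip(classes, np.minimum(raw, cap)))

def focal_loss(p, y, gamma=2.0):
    # Binary focal loss: the (1 - p_t)^gamma factor shrinks the loss
    # (and gradient) of well-classified examples, limiting how much
    # any single example can contribute.
    p_t = np.where(y == 1, p, 1.0 - p)
    return -((1.0 - p_t) ** gamma) * np.log(p_t)

# With a 1:9999 split, the raw minority weight would be 5000x;
# capping clips it to 100x.
y = np.array([0] * 9999 + [1])
weights = capped_class_weights(y)
```

Note that capping trades some minority emphasis for training stability; focal loss avoids the explicit cap by making the damping depend on how confidently each example is already classified.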
Label Noise Amplification
Minority class labels are often noisier—fraud cases may be mislabeled due to incomplete investigation. Class weighting amplifies this noise. A mislabeled minority example with 1000x weight causes 1000x more damage than a mislabeled majority example. Clean minority class labels carefully before applying weighting techniques.
Detection Tip: Monitor loss curves for individual classes. If minority class loss spikes or oscillates while majority class loss is stable, weights may be too extreme or labels may be noisy.
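Per-class loss tracking takes one line once per-example losses are available. A sketch, assuming the training loop can expose an unreduced loss vector:

```python
import numpy as np

def per_class_loss(example_losses, y):
    # Mean loss per class from per-example losses; log this each epoch
    # and watch for minority-class spikes or oscillation while the
    # majority-class curve stays flat.
    return {c: float(example_losses[y == c].mean()) for c in np.unique(y)}

batch_losses = np.array([0.1, 0.2, 5.0])
batch_labels = np.array([0, 0, 1])
stats = per_class_loss(batch_losses, batch_labels)
```

In most frameworks this means computing the loss with reduction disabled (e.g. a per-example loss vector) before averaging, then grouping by label.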
Evaluation on Wrong Distribution
Evaluating on rebalanced test sets gives misleading metrics. Recall is class-conditional and survives resampling, but precision does not: a model with 90% precision on a 1:1 test set can fall below 1% precision on the natural 1:1000 distribution, because false positives scale with the size of the majority class. Always evaluate on the distribution the model will see in production. Rebalancing is only for training, never for evaluation.
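The precision collapse follows directly from the definition. A short sketch, assuming a classifier with fixed recall and false-positive rate evaluated at two different prevalences:

```python
def precision_at_prevalence(recall, fpr, pos_per_neg):
    # Precision = TP / (TP + FP) for a classifier with fixed recall
    # and false-positive rate, with `pos_per_neg` positives per negative.
    tp = recall * pos_per_neg
    fp = fpr * 1.0  # per single negative
    return tp / (tp + fp)

# The same classifier (90% recall, 10% FPR) at two prevalences:
balanced = precision_at_prevalence(0.9, 0.1, pos_per_neg=1.0)    # 0.90
natural = precision_at_prevalence(0.9, 0.1, pos_per_neg=0.001)   # ~0.009
```

The classifier itself has not changed; only the ratio of positives to negatives has, which is exactly why a rebalanced test set cannot predict production precision.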