Supervised Anomaly Detection: Why Accuracy Is Misleading in Imbalanced Classification
The Imbalance Problem
Anomalies are rare by definition. In fraud detection, typically 0.1-1% of transactions are fraudulent. In intrusion detection, 0.01% of network packets are malicious. This extreme class imbalance breaks standard machine learning assumptions.
Training data might contain 1 million normal examples and 1,000 fraud examples. A model that predicts "normal" for everything achieves 99.9% accuracy. That model catches zero fraud. Accuracy becomes meaningless when classes are imbalanced.
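The arithmetic behind that trap can be sketched in a few lines, using the 1,000,000 / 1,000 split above:

```python
# Illustrative counts from the text: 1,000,000 normal, 1,000 fraud.
n_normal = 1_000_000
n_fraud = 1_000

# A trivial model that predicts "normal" for everything is right on every
# normal example and wrong on every fraud example.
correct = n_normal
total = n_normal + n_fraud
accuracy = correct / total

print(f"accuracy     = {accuracy:.4f}")   # ~0.999, despite catching nothing
print(f"fraud caught = 0 of {n_fraud}")
```

The 99.9% figure is entirely an artifact of the class ratio; no learning happened.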
Why Accuracy Misleads
Accuracy = (correct predictions) / (total predictions). With 99.9% normal data, a trivial classifier that always predicts normal gets 99.9% accuracy. It sounds impressive but catches no anomalies. The metric rewards predicting the majority class and ignoring the minority class entirely.
Metrics That Matter
Precision: Of all predicted anomalies, what fraction are true anomalies? Low precision means many false alarms, wasting human review time.
Recall: Of all true anomalies, what fraction did we catch? Low recall means fraud slips through. In financial fraud, missing a single theft worth thousands of dollars might cost more than 100 false alarms.
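Both metrics fall out of the confusion counts directly. A minimal sketch, using hypothetical counts chosen for illustration:

```python
# Hypothetical confusion counts for an anomaly detector (illustrative only).
tp = 80    # true anomalies that were flagged
fp = 320   # normal cases that were flagged (false alarms)
fn = 20    # true anomalies that were missed

precision = tp / (tp + fp)  # of everything flagged, how much is real?
recall = tp / (tp + fn)     # of all real anomalies, how many did we flag?

print(f"precision = {precision:.2f}")  # 0.20: 4 of every 5 alerts waste review time
print(f"recall    = {recall:.2f}")     # 0.80: 1 in 5 anomalies slips through
```

Note the tension: tightening the alert threshold usually raises precision at the cost of recall, and vice versa, which is why the full precision-recall curve matters.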
PR-AUC: Area under the precision-recall curve. Unlike ROC-AUC, which can look deceptively high on imbalanced data, PR-AUC reflects the imbalance: a random classifier gets PR-AUC equal to the positive class fraction (0.001 for a 0.1% fraud rate), not 0.5.