Fairness Metrics: Failure Modes and Edge Cases
Simpson's Paradox in Fairness
A model can appear fair in aggregate yet be unfair in every subgroup. Example: in engineering, men are hired at 60% and women at 40%; in marketing, men at 40% and women at 60%. If 75% of men apply to engineering and 75% of women to marketing, both groups' overall hiring rate works out to 55%. The aggregates are identical while every department shows a 20-point gap. Always compute metrics at multiple granularities.
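The paradox can be reproduced in a few lines. The sketch below uses hypothetical applicant records (the department split and hire rates are illustrative numbers chosen so the paradox appears): aggregate rates by gender come out equal, while every department shows a 20-point gap.

```python
from collections import defaultdict

# Hypothetical applicant records: (gender, department, hired).
# 75% of men apply to engineering, 75% of women to marketing.
records = (
    [("M", "eng", 1)] * 45 + [("M", "eng", 0)] * 30 +   # men in eng: 60% hired
    [("M", "mkt", 1)] * 10 + [("M", "mkt", 0)] * 15 +   # men in mkt: 40% hired
    [("F", "eng", 1)] * 10 + [("F", "eng", 0)] * 15 +   # women in eng: 40% hired
    [("F", "mkt", 1)] * 45 + [("F", "mkt", 0)] * 30     # women in mkt: 60% hired
)

def hire_rates(records, key):
    """Hiring rate per group, where key maps a record to its group label."""
    hired, total = defaultdict(int), defaultdict(int)
    for rec in records:
        g = key(rec)
        hired[g] += rec[2]
        total[g] += 1
    return {g: hired[g] / total[g] for g in total}

aggregate = hire_rates(records, key=lambda r: r[0])           # by gender only
per_dept  = hire_rates(records, key=lambda r: (r[1], r[0]))   # by (dept, gender)
# aggregate: 0.55 for both genders; per_dept: 0.6 vs 0.4 in each department
```

Running the same rate function at both granularities is the point: any audit that stops at `aggregate` would sign off on this data.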
Intersectionality Blindness
Checking gender and race separately misses bias against intersectional groups. A model might be fair for women overall and for Black applicants overall, yet biased against Black women. Intersectional groups are necessarily smaller: with 10,000 samples you might have 5,000 women (sufficient), 1,000 Black applicants (marginal), and 200 Black women (too few for reliable metrics). Prioritize checking vulnerable intersections even when data is limited.
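A quick way to see why 200 samples is insufficient is to compute the 95% margin of error on a rate estimate at each group size. The sketch below uses the standard normal-approximation formula with worst-case p = 0.5; the group sizes are the ones from the example above.

```python
import math

def rate_moe(n, p=0.5, z=1.96):
    """95% margin of error for a rate estimated from n samples (worst case p=0.5)."""
    return z * math.sqrt(p * (1 - p) / n)

# Group sizes from the 10,000-sample example above:
for label, n in [("women", 5000), ("Black applicants", 1000), ("Black women", 200)]:
    print(f"{label:18s} n={n:5d}  margin ±{rate_moe(n):.3f}")
```

With n = 200 the margin is roughly ±7 percentage points, so a measured disparity smaller than that is indistinguishable from noise. That is an argument for wider intervals or targeted data collection on vulnerable intersections, not for skipping the check.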
Proxy Variable Leakage
You removed race, but zip code is 85% predictive of race; you removed gender, but first name is 90% predictive. Models find proxies. Detection: train a classifier to predict the protected attribute from the model's input features. If its AUC exceeds 0.7, significant proxy information is leaking. Mitigation: adversarial debiasing, or feature removal guided by proxy detection.
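The detection step can be sketched with scikit-learn on synthetic data. Everything here is illustrative: the feature names, the correlation strength of the fabricated "zip code" proxy, and the choice of logistic regression as the probe classifier are all assumptions, not a prescribed recipe.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic data: a protected attribute, a correlated "zip code"-style proxy,
# and an unrelated feature. Correlation strength is an illustrative choice.
protected = rng.integers(0, 2, n)
zip_proxy = protected + rng.normal(0.0, 0.6, n)   # leaks the attribute
income    = rng.normal(0.0, 1.0, n)               # independent of it
X = np.column_stack([zip_proxy, income])

# Probe: can the model's features predict the protected attribute?
X_tr, X_te, y_tr, y_te = train_test_split(X, protected, random_state=0)
probe = LogisticRegression().fit(X_tr, y_tr)
auc = roc_auc_score(y_te, probe.predict_proba(X_te)[:, 1])
# Expect AUC well above the 0.7 threshold: the proxy is leaking.
```

Inspecting the probe's coefficients then points at which features carry the leak, which is what "feature removal guided by proxy detection" means in practice.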
Feedback Loop Amplification
If a biased model approves fewer loans for Group A, Group A generates less outcome data, the model becomes more uncertain about the group, and approvals fall further. Bias amplifies over time. Detection: track fairness metrics over time, not just at deployment. Mitigation: exploration mechanisms that guarantee a minimum representation of every group among positive outcomes.
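One minimal form of such an exploration mechanism is an epsilon-greedy decision rule: a small random slice of applicants is approved regardless of score, so every group keeps producing positive-outcome data. This is a sketch; the threshold, exploration rate, and per-group logging scheme are illustrative choices, not tuned values.

```python
import random
from collections import defaultdict

random.seed(0)
approvals = defaultdict(list)  # per-group decision log, for tracking over time

def decide(score, group, threshold=0.5, explore_rate=0.05):
    """Approve on model score, but approve a random explore_rate slice anyway.

    The exploration floor keeps every group generating positive-outcome data,
    so uncertainty about an under-approved group cannot spiral into ever-fewer
    approvals.
    """
    approved = random.random() < explore_rate or score >= threshold
    approvals[group].append(approved)  # log so drift is visible per period
    return approved
```

Re-computing approval rates from `approvals` per time window, rather than once at deployment, is the detection half: the feedback loop shows up as a group's rate drifting downward across windows.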