Fairness Metrics: Failure Modes and Edge Cases
Simpson's Paradox in Fairness
A model can appear fair in aggregate yet be unfair in every subgroup. Example: in engineering, men are hired at 60% and women at 40%; in marketing, men at 40% and women at 60%. If 75% of men apply to engineering and 75% of women to marketing, both groups' overall hiring rate works out to 55%. The aggregates are identical while every department shows a 20-point gap. Always compute metrics at multiple granularities.
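The paradox can be reproduced in a few lines. The sketch below uses hypothetical applicant records (the department split and hire rates are illustrative numbers chosen so the paradox appears): aggregate rates by gender come out equal, while every department shows a 20-point gap.

```python
from collections import defaultdict

# Hypothetical applicant records: (gender, department, hired).
# 75% of men apply to engineering, 75% of women to marketing.
records = (
    [("M", "eng", 1)] * 45 + [("M", "eng", 0)] * 30 +   # men in eng: 60% hired
    [("M", "mkt", 1)] * 10 + [("M", "mkt", 0)] * 15 +   # men in mkt: 40% hired
    [("F", "eng", 1)] * 10 + [("F", "eng", 0)] * 15 +   # women in eng: 40% hired
    [("F", "mkt", 1)] * 45 + [("F", "mkt", 0)] * 30     # women in mkt: 60% hired
)

def hire_rates(records, key):
    """Hiring rate per group, where key maps a record to its group label."""
    hired, total = defaultdict(int), defaultdict(int)
    for rec in records:
        g = key(rec)
        hired[g] += rec[2]
        total[g] += 1
    return {g: hired[g] / total[g] for g in total}

aggregate = hire_rates(records, key=lambda r: r[0])           # by gender only
per_dept  = hire_rates(records, key=lambda r: (r[1], r[0]))   # by (dept, gender)
# aggregate: 0.55 for both genders; per_dept: 0.6 vs 0.4 in each department
```

Running the same rate function at both granularities is the point: any audit that stops at `aggregate` would sign off on this data.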
Intersectionality Blindness
Checking gender and race separately misses bias against intersectional groups. A model might be fair for women overall and for Black applicants overall, yet biased against Black women. Intersectional groups are necessarily smaller: with 10,000 samples you might have 5,000 women (sufficient), 1,000 Black applicants (marginal), and 200 Black women (too few for reliable metrics). Prioritize checking vulnerable intersections even when data is limited.
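A quick way to see why 200 samples is insufficient is to compute the 95% margin of error on a rate estimate at each group size. The sketch below uses the standard normal-approximation formula with worst-case p = 0.5; the group sizes are the ones from the example above.

```python
import math

def rate_moe(n, p=0.5, z=1.96):
    """95% margin of error for a rate estimated from n samples (worst case p=0.5)."""
    return z * math.sqrt(p * (1 - p) / n)

# Group sizes from the 10,000-sample example above:
for label, n in [("women", 5000), ("Black applicants", 1000), ("Black women", 200)]:
    print(f"{label:18s} n={n:5d}  margin ±{rate_moe(n):.3f}")
```

With n = 200 the margin is roughly ±7 percentage points, so a measured disparity smaller than that is indistinguishable from noise. That is an argument for wider intervals or targeted data collection on vulnerable intersections, not for skipping the check.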
Proxy Variable Leakage
You removed race, but zip code is 85% predictive of race; you removed gender, but first name is 90% predictive. Models find proxies. Detection: train a classifier to predict the protected attribute from the model's input features. If its AUC exceeds 0.7, significant proxy information is leaking. Mitigation: adversarial debiasing, or feature removal guided by proxy detection.
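The detection step can be sketched with scikit-learn on synthetic data. Everything here is illustrative: the feature names, the correlation strength of the fabricated "zip code" proxy, and the choice of logistic regression as the probe classifier are all assumptions, not a prescribed recipe.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic data: a protected attribute, a correlated "zip code"-style proxy,
# and an unrelated feature. Correlation strength is an illustrative choice.
protected = rng.integers(0, 2, n)
zip_proxy = protected + rng.normal(0.0, 0.6, n)   # leaks the attribute
income    = rng.normal(0.0, 1.0, n)               # independent of it
X = np.column_stack([zip_proxy, income])

# Probe: can the model's features predict the protected attribute?
X_tr, X_te, y_tr, y_te = train_test_split(X, protected, random_state=0)
probe = LogisticRegression().fit(X_tr, y_tr)
auc = roc_auc_score(y_te, probe.predict_proba(X_te)[:, 1])
# Expect AUC well above the 0.7 threshold: the proxy is leaking.
```

Inspecting the probe's coefficients then points at which features carry the leak, which is what "feature removal guided by proxy detection" means in practice.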
Feedback Loop Amplification
If a biased model approves fewer loans for Group A, Group A generates less outcome data, the model becomes more uncertain about the group, and approvals fall further. Bias amplifies over time. Detection: track fairness metrics over time, not just at deployment. Mitigation: exploration mechanisms that guarantee a minimum representation of every group among positive outcomes.
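One minimal form of such an exploration mechanism is an epsilon-greedy decision rule: a small random slice of applicants is approved regardless of score, so every group keeps producing positive-outcome data. This is a sketch; the threshold, exploration rate, and per-group logging scheme are illustrative choices, not tuned values.

```python
import random
from collections import defaultdict

random.seed(0)
approvals = defaultdict(list)  # per-group decision log, for tracking over time

def decide(score, group, threshold=0.5, explore_rate=0.05):
    """Approve on model score, but approve a random explore_rate slice anyway.

    The exploration floor keeps every group generating positive-outcome data,
    so uncertainty about an under-approved group cannot spiral into ever-fewer
    approvals.
    """
    approved = random.random() < explore_rate or score >= threshold
    approvals[group].append(approved)  # log so drift is visible per period
    return approved
```

Re-computing approval rates from `approvals` per time window, rather than once at deployment, is the detection half: the feedback loop shows up as a group's rate drifting downward across windows.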