Bias Mitigation: Pre-, In-, and Post-Processing Techniques
Pre-Processing Techniques
Fix bias before training.

Resampling: Oversample underrepresented groups or undersample overrepresented ones. If Group B makes up 10% of the training data, duplicate Group B examples until the groups are balanced. Risk: overfitting to the duplicates.

Reweighting: Assign higher weights to underrepresented samples during training. Continuing the example, each Group B sample would count about 9x as much as a Group A sample (the 90/10 imbalance ratio), without creating duplicate rows. Less prone to overfitting than resampling.

Representation editing: Transform features to remove protected-attribute information, e.g. project embeddings to be orthogonal to the protected-attribute direction. Risk: losing predictive signal that correlates with, but is not caused by, protected attributes.
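The reweighting idea above can be sketched in a few lines. This is a minimal illustration, not a library API: the function name `inverse_frequency_weights` is invented here, and the resulting weights would typically be passed to whatever `sample_weight`-style argument the training code accepts.

```python
from collections import Counter

def inverse_frequency_weights(groups):
    """Assign each sample a weight inversely proportional to its group's
    frequency, so every group contributes the same total weight.
    `groups` is one group label per training sample."""
    counts = Counter(groups)
    n = len(groups)
    k = len(counts)
    # Each group's weights sum to n / k, so total weight stays n.
    return [n / (k * counts[g]) for g in groups]

# Example: Group B is 10% of the data, so each B sample
# carries 9x the weight of an A sample (the 90/10 ratio).
groups = ["A"] * 9 + ["B"] * 1
weights = inverse_frequency_weights(groups)
```

Keeping the total weight equal to the sample count means the effective learning rate of weighted training is unchanged; only the per-group balance shifts.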
In-Processing Techniques
Add fairness constraints during training.

Adversarial debiasing: Train the model alongside an adversary that tries to predict the protected attribute from the model's output; the main model is penalized whenever the adversary succeeds. This forces the model's predictions to be uninformative about group membership.

Constrained optimization: Add a fairness constraint directly to the objective, e.g. minimize prediction error subject to a demographic parity ratio above 0.8. Lagrangian relaxation converts the hard constraint into a soft penalty added to the loss.
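The Lagrangian-relaxation idea can be sketched as a penalized loss. This is a simplified illustration under assumed conventions (the function name `penalized_loss`, a demographic parity ratio computed from mean predicted scores per group, and a fixed multiplier `lam` rather than one updated by a dual ascent step):

```python
import math

def penalized_loss(p, y, groups, lam=1.0, target=0.8):
    """Cross-entropy plus a soft penalty that fires when the demographic
    parity ratio of mean predicted scores drops below `target`.
    p: predicted probabilities; y: 0/1 labels; groups: group label per sample."""
    eps = 1e-12
    ce = -sum(yi * math.log(pi + eps) + (1 - yi) * math.log(1 - pi + eps)
              for pi, yi in zip(p, y)) / len(y)
    by_group = {}
    for pi, g in zip(p, groups):
        by_group.setdefault(g, []).append(pi)
    rates = [sum(v) / len(v) for v in by_group.values()]
    ratio = min(rates) / max(rates)
    # Hard constraint "ratio >= target" relaxed to a hinge penalty:
    # zero once the constraint holds, linear in the violation otherwise.
    penalty = max(0.0, target - ratio)
    return ce + lam * penalty

y = [1, 1, 1, 1]
g = ["A", "A", "B", "B"]
fair = [0.6, 0.6, 0.6, 0.6]      # equal mean scores, ratio 1.0, no penalty
unfair = [0.6, 0.6, 0.1, 0.1]    # ratio 1/6 < 0.8, penalty active
```

In a full training loop this loss would be minimized over model parameters, and `lam` could itself be increased whenever the constraint is violated, which is the dual-update half of the Lagrangian scheme.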
Post-Processing Techniques
Adjust predictions after the model is trained.

Threshold optimization: Use a different decision threshold per group. If Group A's threshold is 0.5, Group B's might be 0.4 to equalize selection rates. Simple and interpretable, but it treats symptoms, not causes.

Calibration: Fit a separate probability mapping per group. If the model says 70% for Group A but the observed rate is 50%, recalibrate Group A's predictions downward. This does not fix the underlying bias, but it produces calibrated, comparable probabilities for each group.
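Per-group threshold selection can be sketched as follows. This is a minimal illustration (the function name `group_thresholds` is invented, ties are ignored, and the target is an equal selection rate rather than, say, equalized odds):

```python
def group_thresholds(scores, groups, target_rate):
    """For each group, pick the decision threshold that selects the
    top `target_rate` fraction of that group's scores, so all groups
    end up with (approximately) the same selection rate."""
    thresholds = {}
    for g in set(groups):
        s = sorted((sc for sc, gg in zip(scores, groups) if gg == g),
                   reverse=True)
        k = round(target_rate * len(s))
        # Threshold at the k-th highest score selects the top k (no ties assumed).
        thresholds[g] = s[k - 1] if k > 0 else float("inf")
    return thresholds

# Group B scores run lower overall, so it gets a lower threshold.
scores = [0.9, 0.7, 0.5, 0.3, 0.6, 0.4, 0.2, 0.1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
t = group_thresholds(scores, groups, target_rate=0.5)
```

With `score >= t[group]` as the decision rule, both groups are selected at 50% here even though their score distributions differ, which is exactly the "different thresholds to equalize rates" trade-off described above.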