
Failure Modes: Negative Transfer and Data Drift

Negative transfer is the central risk. When tasks are misaligned or loss balancing fails, shared features become a compromise that hurts every task instead of helping. The telltale symptom is a minority task whose validation metrics stay flat or decline while the dominant task improves. For example, a safety task with a 0.1% positive rate will be overwhelmed by a CTR task with a 10% positive rate unless you upsample, reweight, or apply focal loss; the model learns to optimize CTR and ignores safety. In production, this manifests as policy violations or user complaints that metrics did not predict.

Label misalignment and feature leakage create silent bugs. Conversion labels arrive hours or days after the impression, and if training windows are aligned incorrectly, the model may consume features computed after the conversion decision time. This inflates offline AUC by 5 to 10% but produces useless predictions in serving. Strict feature time-travel checks are required: every feature must carry a timestamp, and training must enforce that the feature timestamp precedes the label timestamp (see the first sketch below). Replay windows for delayed labels must be configured carefully. For example, train daily with a 7-day replay so conversion labels from days 1 through 6 update parameters, while features always come from impression time.

Calibration drift across heads breaks downstream business logic. A ranker combines predicted CTR and predicted CVR to compute expected value; if multi-task training shifts the scale of one head, the combined score distribution changes, and bid strategies or budget pacing break. Monitor per-head calibration weekly, and apply isotonic regression or Platt scaling per head on holdout data if calibration degrades (see the second sketch below). Some teams keep separate calibration layers per head that are retrained more frequently than the main model.

Data imbalance makes rare-event tasks fragile. A fraud detection head with one positive per 10,000 samples will drown in a sea of click labels. Mitigate with focal loss that down-weights easy negatives, upsample the minority class by 10 to 100 times, or train that head on a separate sampling schedule (see the third sketch below). Evaluation windows must be long enough to collect hundreds of positives for rare tasks before making rollout decisions. For example, a 1% experiment slice might need a week to collect enough fraud labels, while click labels are sufficient within hours.
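A time-travel check can be as simple as asserting, row by row, that no feature timestamp exceeds the decision time. A minimal sketch in pandas, assuming a hypothetical training frame where each feature column has a companion `<feature>_ts` timestamp column and `impression_ts` marks decision time (the schema is illustrative, not from the source):

```python
import pandas as pd

def assert_no_time_travel(examples: pd.DataFrame) -> None:
    """Fail fast if any feature was computed after the decision (impression) time.

    Assumes a hypothetical schema: one `<feature>_ts` timestamp column per
    feature, plus an `impression_ts` column for decision time.
    """
    feature_ts_cols = [
        c for c in examples.columns if c.endswith("_ts") and c != "impression_ts"
    ]
    for col in feature_ts_cols:
        leaked = examples[examples[col] > examples["impression_ts"]]
        if not leaked.empty:
            raise ValueError(
                f"{len(leaked)} rows leak future data via {col!r}: "
                "feature timestamp exceeds impression time"
            )
```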
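Per-head recalibration can live in a thin layer outside the main model. A minimal sketch using scikit-learn's IsotonicRegression, assuming hypothetical dicts of raw holdout scores and labels keyed by head name:

```python
from sklearn.isotonic import IsotonicRegression

class PerHeadCalibrator:
    """One isotonic calibrator per head, refit on fresh holdout data
    more often than the main model is retrained."""

    def __init__(self, head_names):
        self.calibrators = {
            h: IsotonicRegression(out_of_bounds="clip") for h in head_names
        }

    def fit(self, raw_scores, labels):
        # raw_scores, labels: dicts mapping head name -> 1-D arrays
        # collected from holdout traffic.
        for head, iso in self.calibrators.items():
            iso.fit(raw_scores[head], labels[head])

    def transform(self, head, scores):
        # Map raw head scores to calibrated probabilities.
        return self.calibrators[head].predict(scores)
```

Because the calibrators are decoupled from the network, they can be refit weekly on fresh holdout traffic without touching the model weights.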
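Both rare-event mitigations are short in PyTorch. A sketch of a standard binary focal loss plus a sampler that oversamples positives; the gamma and upsampling factor match the values mentioned in the text, while alpha is an illustrative assumption:

```python
import torch
import torch.nn.functional as F
from torch.utils.data import WeightedRandomSampler

def binary_focal_loss(logits: torch.Tensor, targets: torch.Tensor,
                      gamma: float = 2.0, alpha: float = 0.25) -> torch.Tensor:
    """Focal loss: the (1 - p_t)^gamma term down-weights easy negatives so
    rare positives still contribute meaningful gradient. Targets are
    floats in {0, 1}."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-bce)  # model's probability for the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1.0 - p_t) ** gamma * bce).mean()

def minority_upsampler(labels: torch.Tensor,
                       factor: float = 100.0) -> WeightedRandomSampler:
    """Sampler that draws positives ~`factor` times more often
    (the 10-100x range from the text)."""
    weights = torch.ones(len(labels))
    weights[labels == 1] = factor
    return WeightedRandomSampler(weights, num_samples=len(labels),
                                 replacement=True)
```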
💡 Key Takeaways
Negative transfer happens when tasks fight, producing shared features that are suboptimal for all tasks; minority tasks degrade while the dominant task improves
Label misalignment causes feature leakage: using post-decision features inflates offline AUC by 5 to 10% but fails in production serving
Calibration drift per head breaks downstream logic: predicted probabilities must remain calibrated or combined scores and bidding break
Rare-event tasks (fraud 0.01%, safety violations 0.1%) need focal loss, 10 to 100x upsampling, or separate sampling schedules
Rollout decisions for rare tasks require long evaluation windows: days to weeks to collect enough positive labels for statistical significance
📌 Examples
Meta safety classifier: Trained with 100x upsampling for policy violations (0.1% base rate) and focal loss (gamma 2.0); still required a 2-week holdout for stable AUC measurement
Uber fraud detection: Feature leakage from driver cancel reason (available only after the trip) inflated offline AUC from 0.78 to 0.92; caught during a serving A/B test with flat precision
Google ads calibration: Weekly isotonic regression per head on 1% holdout traffic; CVR head calibration drifted 5% after a training data shift, and recalibration restored bid accuracy