
Failure Modes: Label Leakage, Skew, and Adversarial Evasion

Label Leakage in Temporal Features

Temporal features can accidentally encode future information. If you compute "transactions in next 24 hours" when labeling historical data, the model learns to detect fraud by seeing the future—which is unavailable at serving time. Always compute features using only data available at prediction time: past events, not future ones.

Warning: Label leakage inflates offline metrics dramatically. A model might achieve 99% AUC in testing but perform no better than random in production. Audit feature computation timestamps rigorously.
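The leak described above can be sketched with a toy velocity feature. The contrast is between a leakage-safe trailing window (events strictly before prediction time) and a leaky forward window that peeks at the future; the event times and function names here are illustrative:

```python
from datetime import datetime, timedelta

# Hypothetical event times for one account, sorted ascending.
events = [datetime(2024, 1, 1, h) for h in (8, 9, 22)]

def past_24h_count(events, as_of):
    """Leakage-safe: count events strictly *before* the prediction time."""
    return sum(as_of - timedelta(hours=24) <= t < as_of for t in events)

def next_24h_count(events, as_of):
    """LEAKY: counts future events, which are unavailable at serving time."""
    return sum(as_of <= t < as_of + timedelta(hours=24) for t in events)

as_of = datetime(2024, 1, 1, 10)
print(past_24h_count(events, as_of))  # 2 (the 08:00 and 09:00 events)
print(next_24h_count(events, as_of))  # 1 (the 22:00 event: a future leak)
```

Only the first function is safe to use for both labeled historical data and live scoring; the second gives the model information it will never have in production.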

Training-Serving Skew

Features computed differently in training versus serving cause silent accuracy degradation. Common sources: different aggregation windows (training uses exact 24 hours, serving uses approximate), different time zone handling, different null value treatment. The model trains on one feature distribution but serves on another.
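One common mitigation is to route training and serving through a single canonical feature function, so windows, time zones, and null policies cannot drift apart. A minimal sketch, with an illustrative null policy (zero-fill) and function name:

```python
def txn_amount_feature(amount):
    """Canonical amount handling, shared by training and serving.

    One agreed null/negative policy lives here; if training zero-filled
    nulls while serving used -1.0, the model would see values at serving
    time it never encountered in training.
    """
    if amount is None:
        return 0.0
    return max(float(amount), 0.0)

# Both pipelines call the same function, so the distributions match.
train_value = txn_amount_feature(None)
serve_value = txn_amount_feature(None)
print(train_value == serve_value)  # True
```

The design point is less the specific policy than that it is defined exactly once.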

Point-in-Time Correctness

When training on historical data, compute features as they would have been at that moment. Do not use the current 30-day average for a transaction from 6 months ago—use the 30-day average as of that date. Feature stores with temporal versioning enable point-in-time queries. Without this, models learn patterns that never existed in real-time.
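A point-in-time query can be sketched without a feature store by filtering history to the window ending at the transaction's own timestamp. The amounts and dates below are hypothetical:

```python
from datetime import datetime, timedelta

# Hypothetical amount history for one account: (timestamp, amount).
history = [
    (datetime(2024, 1, 5), 100.0),
    (datetime(2024, 2, 10), 300.0),
    (datetime(2024, 6, 1), 900.0),
]

def avg_30d_as_of(history, as_of):
    """30-day average *as of* `as_of`, not as of today.

    A February transaction's feature must use only amounts visible at
    that moment; the June amount lies in its future and is excluded.
    """
    start = as_of - timedelta(days=30)
    amounts = [amt for ts, amt in history if start <= ts < as_of]
    return sum(amounts) / len(amounts) if amounts else 0.0

# Feature for a transaction on 2024-02-15: only the Feb 10 amount is in
# the trailing 30 days.
print(avg_30d_as_of(history, datetime(2024, 2, 15)))  # 300.0
```

Recomputing the same feature "as of today" would instead average in the 900.0 amount, producing a value that never existed when the labeled transaction occurred.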

Validation Pattern: Log serving-time feature values. Periodically compare logged values to batch-recomputed values for the same transactions. Divergence indicates skew.

Adversarial Evasion

Fraudsters learn velocity thresholds and stay below them. If the model flags accounts with more than 10 transactions per hour, fraudsters limit to 9. Defense: use ratios relative to baseline rather than absolute thresholds, rotate feature definitions periodically, and combine multiple velocity signals so evading one does not evade all.
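The ratio defense can be sketched by scoring the current rate against the account's own historical baseline rather than a fixed cutoff. The 5x multiplier and floor value here are illustrative choices, not recommendations:

```python
def velocity_ratio(txns_last_hour, baseline_hourly_rate):
    """Current rate relative to the account's own historical baseline."""
    return txns_last_hour / max(baseline_hourly_rate, 0.1)  # avoid div by 0

def is_suspicious(txns_last_hour, baseline_hourly_rate, multiplier=5.0):
    """Flag when the account runs far above its own normal, whatever
    the absolute count is."""
    return velocity_ratio(txns_last_hour, baseline_hourly_rate) >= multiplier

# 9 txns/hour evades an absolute ">10" rule, but is 9x this account's
# usual ~1/hour, so the ratio still fires.
print(is_suspicious(9, 1.0))  # True
# A heavy but legitimate user near its own baseline is not flagged.
print(is_suspicious(9, 8.0))  # False
```

In practice this would be one of several combined velocity signals, so evading the ratio alone does not evade the model.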

💡 Key Takeaways
- Label leakage inflates offline metrics: a model can achieve 99% AUC in testing but near-random performance in production
- Point-in-time correctness: compute features as they would have been at that moment, never current values for historical data
- Fraudsters learn thresholds: use ratios relative to baseline rather than absolute thresholds, and rotate feature definitions
📌 Interview Tips
1. Log serving-time feature values and compare them to batch-recomputed values periodically to detect skew
2. Common skew sources: different aggregation windows, time zone handling, and null value treatment between training and serving