Failure Modes: Concept Drift, Adversarial Attacks, and Cold Start
Concept Drift
Fraudsters adapt. When models learn to catch a pattern, fraudsters change tactics. Last month's attack vector becomes ineffective; new vectors emerge. A model trained on historical data degrades as the fraud landscape evolves.
Concept drift in fraud differs from natural drift. It is adversarial: fraudsters actively probe model boundaries and exploit weaknesses. Feature importance shifts as fraudsters discover which signals get them blocked and which do not. A feature that was highly predictive becomes useless once fraudsters learn to avoid it.
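One common way to quantify this kind of drift is the Population Stability Index (PSI) between a feature's training-time distribution and a recent window. A minimal sketch, assuming quantile buckets and the conventional 0.25 "major shift" rule of thumb; the synthetic data is illustrative:

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline and a recent sample."""
    # Bucket edges come from quantiles of the baseline (training) sample,
    # widened so every observation in either sample falls into some bucket.
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0] = min(edges[0], actual.min()) - 1e-9
    edges[-1] = max(edges[-1], actual.max()) + 1e-9
    e_frac = np.histogram(expected, edges)[0] / len(expected)
    a_frac = np.histogram(actual, edges)[0] / len(actual)
    eps = 1e-6  # avoids log(0) and division by zero in empty buckets
    return float(np.sum((a_frac - e_frac) * np.log((a_frac + eps) / (e_frac + eps))))

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 10_000)      # feature at training time
drifted = rng.normal(0.8, 1.3, 10_000)    # behavior after fraudsters adapt
```

Running `psi` per feature per day gives a cheap early-warning signal long before label-based metrics (chargebacks arrive weeks late) can confirm the drift.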
Adversarial Attacks
Fraudsters conduct probe attacks: small test transactions to map model behavior. They vary one feature at a time to identify thresholds. "How much can I spend before getting blocked? Which device fingerprints pass? What velocity triggers review?"
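Varying one feature against a black-box decision is just a bisection search. A sketch of the attacker's view, with a hypothetical spend cutoff and deterministic `decide` function standing in for the fraud model:

```python
def probe_threshold(decide, lo=0.0, hi=5000.0, tol=1.0):
    """Binary-search the spend amount where a black-box `decide` flips
    from allow to block, holding all other features fixed. Each call
    corresponds to one small test transaction by the attacker."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if decide(mid):  # True => transaction blocked
            hi = mid
        else:
            lo = mid
    return hi

# Against a deterministic, fixed cutoff, a handful of probes recover it.
hidden_cutoff = 1337.0
est = probe_threshold(lambda amount: amount >= hidden_cutoff)
```

Roughly a dozen probes pin the cutoff to within a dollar, which is why the defenses below aim to make exactly this search unreliable.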
Defense requires obfuscation and non-determinism. Add random delays before blocking high-risk transactions so timing does not reveal threshold hits. Use randomized thresholds within a range. Limit the information leaked through block messages.
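The randomized-threshold and jittered-delay ideas can be sketched as follows; the score band, jitter widths, and function names are illustrative assumptions, not a specific system's API:

```python
import random

RISK_BLOCK_MIN, RISK_BLOCK_MAX = 0.78, 0.88  # hypothetical score band

def decide(risk_score: float, rng: random.Random) -> str:
    """Block above a threshold drawn fresh per decision, so repeated
    probes at the same score see inconsistent outcomes and cannot
    pinpoint a fixed cutoff."""
    threshold = rng.uniform(RISK_BLOCK_MIN, RISK_BLOCK_MAX)
    return "block" if risk_score >= threshold else "allow"

def block_delay_seconds(rng: random.Random) -> float:
    """Random delay before surfacing a block, so response timing does
    not reveal whether a threshold was hit."""
    return rng.uniform(0.2, 1.5)

rng = random.Random(42)
# A score inside the band gets a mix of outcomes across repeated probes.
outcomes = {decide(0.83, rng) for _ in range(200)}
```

Scores well outside the band still behave deterministically, so legitimate customers are unaffected; only attackers mapping the boundary pay the cost.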
Cold Start
New accounts, new merchants, and new devices have no history. The model relies on aggregate features (velocity, spend patterns) that do not yet exist for them. First-party fraud often uses newly created accounts specifically to avoid historical signals.
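One common mitigation is to back off to cohort-level priors when entity-level aggregates are missing, and to expose the imputation itself as a feature. A minimal sketch; the field names and prior values are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical cohort priors: mean 7-day transaction velocity for accounts
# with history, keyed by signup channel. In practice these would be
# computed from the training population.
COHORT_PRIOR_VELOCITY = {"web": 2.1, "mobile": 3.4, "referral": 1.2}
GLOBAL_PRIOR_VELOCITY = 2.5

@dataclass
class Account:
    channel: str
    txn_count_7d: Optional[int]  # None => no history yet (cold start)

def velocity_feature(acct: Account) -> Tuple[float, bool]:
    """Return (feature value, is_imputed). New entities fall back to the
    cohort prior, so the model sees a plausible value plus an explicit
    'no history' flag rather than a silent zero."""
    if acct.txn_count_7d is not None:
        return float(acct.txn_count_7d), False
    prior = COHORT_PRIOR_VELOCITY.get(acct.channel, GLOBAL_PRIOR_VELOCITY)
    return prior, True
```

The `is_imputed` flag matters: it lets the model learn that "no history" is itself predictive, which is exactly the signal first-party fraudsters are trying to exploit.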
Monitoring for Failure
Track feature importance over time. If a top feature drops in importance rapidly, fraudsters may have adapted. Monitor false negative rate by segment: a sudden spike in chargebacks from a specific merchant category or device type indicates a new attack pattern the model misses.
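Segment-level monitoring can be as simple as comparing each segment's recent false negative rate against a trailing baseline. A sketch, assuming chargebacks arrive labeled per approved transaction; the 3x alert multiplier is an illustrative choice:

```python
from collections import defaultdict

def fn_rate_by_segment(txns):
    """txns: iterable of (segment, was_approved, was_chargeback).
    FN rate = chargebacks among approved transactions, per segment."""
    approved = defaultdict(int)
    missed = defaultdict(int)
    for segment, was_approved, was_chargeback in txns:
        if was_approved:
            approved[segment] += 1
            if was_chargeback:
                missed[segment] += 1
    return {s: missed[s] / approved[s] for s in approved}

def spiking_segments(baseline, current, multiplier=3.0):
    """Flag segments whose current FN rate exceeds `multiplier` x its
    trailing baseline rate -- the signature of a new attack pattern."""
    return sorted(
        s for s, rate in current.items()
        if rate > multiplier * baseline.get(s, 0.0)
    )
```

Segments absent from the baseline default to a rate of zero, so any chargebacks in a brand-new segment fire immediately, which is usually the desired behavior for novel merchant categories or device types.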