How Graph Neural Networks Learn Fraud Patterns
Graph Neural Networks (GNNs) perform message passing over the fraud graph to learn embeddings that capture relational risk. Each node starts with features such as historical spend, chargeback rate, geolocation variance, device entropy, and account age. Each edge has attributes including transaction amount, timestamp, channel type, and velocity metrics. In each GNN layer, a node aggregates information from its immediate neighbors, typically a sample of 10 to 25 per hop, then updates its embedding through learned transformations; stacking K layers, usually K equals 2 or 3, gives each node a K-hop view of the graph.
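As a rough illustration of one such layer, the sketch below (plain PyTorch; class and dimension names such as FraudSAGELayer and the 32-dimensional input are illustrative assumptions, not a specific library's API) mean-aggregates a fixed sample of neighbors and applies a learned update. Stacking two of these layers yields the two-hop receptive field described above.

```python
import torch
import torch.nn as nn

class FraudSAGELayer(nn.Module):
    """One GraphSAGE-style message passing layer: aggregate sampled neighbors, then update."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.self_lin = nn.Linear(in_dim, out_dim)   # transforms the node's own embedding
        self.neigh_lin = nn.Linear(in_dim, out_dim)  # transforms the aggregated neighbor message

    def forward(self, node_h: torch.Tensor, neigh_h: torch.Tensor) -> torch.Tensor:
        # node_h:  (num_nodes, in_dim)               current node embeddings
        # neigh_h: (num_nodes, num_sampled, in_dim)  embeddings of the 10-25 sampled neighbors
        msg = neigh_h.mean(dim=1)                    # mean aggregation over sampled neighbors
        return torch.relu(self.self_lin(node_h) + self.neigh_lin(msg))

# Two stacked layers give each node a 2-hop view of the graph (K equals 2).
layer1 = FraudSAGELayer(in_dim=32, out_dim=128)
layer2 = FraudSAGELayer(in_dim=128, out_dim=128)

h = torch.randn(1000, 32)          # node features: spend, chargeback rate, device entropy, ...
nbrs = torch.randn(1000, 15, 32)   # 15 sampled 1-hop neighbors per node
h1 = layer1(h, nbrs)               # (1000, 128) embeddings after the first hop
```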
Heterogeneous GNNs are critical because different node types and edge types carry different semantics. The transformation applied when aggregating from a device to a user differs from the one used when aggregating from a user to a merchant. The model learns separate weight matrices per node type and edge type, then fuses them. For example, shared device edges might get high attention weights when devices are new and velocity is high, while shared address edges matter more for established accounts. Temporal encoding adds recency signals, giving more weight to recent interactions. Edge weighting lets the model emphasize high-risk relationships dynamically.
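A hedged sketch of that heterogeneous step follows: one learned weight matrix per edge type, fused by summation, with a small attention scorer over edge attributes such as neighbor age and velocity. The relation names, the two-attribute edge layout, and the dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HeteroFraudLayer(nn.Module):
    """Aggregates per edge type with separate weights, weighting individual edges by attention."""

    def __init__(self, dim: int, edge_types=("user_device", "user_merchant", "user_address")):
        super().__init__()
        self.rel_lins = nn.ModuleDict({et: nn.Linear(dim, dim) for et in edge_types})  # one matrix per edge type
        self.self_lin = nn.Linear(dim, dim)
        # Attention scorer over two edge attributes (e.g. neighbor age in days, recent velocity).
        self.attn = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))

    def forward(self, node_h, neigh_by_type, edge_attr_by_type):
        out = self.self_lin(node_h)
        for et, neigh_h in neigh_by_type.items():
            # neigh_h: (num_nodes, num_sampled, dim); edge attrs: (num_nodes, num_sampled, 2)
            a = torch.softmax(self.attn(edge_attr_by_type[et]), dim=1)  # new, high-velocity edges can earn high weights
            msg = (a * neigh_h).sum(dim=1)                              # attention-weighted aggregation
            out = out + self.rel_lins[et](msg)                          # relation-specific transform, fused by sum
        return torch.relu(out)
```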
The output is a fraud score for a node (is this user risky), an edge (is this transaction fraudulent), or a small subgraph (is this cluster a fraud ring). Typical embedding sizes are 64 to 256 dimensions, balancing expressiveness and inference latency. Larger embeddings capture more nuance but increase memory footprint and computation.
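The sketch below shows what those scoring heads can look like; the 128-dimensional embeddings and two-layer MLP heads are assumed sizes within the ranges above, not fixed requirements.

```python
import torch
import torch.nn as nn

emb_dim = 128  # within the 64-256 range discussed above
node_scorer = nn.Sequential(nn.Linear(emb_dim, 64), nn.ReLU(), nn.Linear(64, 1))      # node: is this user risky?
edge_scorer = nn.Sequential(nn.Linear(2 * emb_dim, 64), nn.ReLU(), nn.Linear(64, 1))  # edge: is this transaction fraudulent?

user_emb, merchant_emb = torch.randn(1, emb_dim), torch.randn(1, emb_dim)
txn_score = torch.sigmoid(edge_scorer(torch.cat([user_emb, merchant_emb], dim=-1)))   # fraud probability for the edge

# A candidate fraud ring (small subgraph) can be scored by pooling member embeddings,
# e.g. mean pooling, and passing the pooled vector through a similar MLP head.
```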
Production systems handle severe class imbalance with focal loss that down-weights easy negatives, cost-sensitive learning that penalizes false negatives more heavily, and hard negative mining that selects challenging legitimate examples near the decision boundary. Precision at very low fraud rates matters most. If fraud is 0.5 percent of transactions, a model must achieve 90 percent recall at a 1 percent false positive rate to be actionable, meaning it blocks 90 percent of fraud while only inconveniencing 1 percent of good users.
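A minimal focal loss sketch in PyTorch follows, combining the (1 - p_t)^gamma down-weighting of easy examples with an alpha term for cost-sensitive class weighting; the alpha and gamma values are common defaults, not values tuned for any particular fraud rate.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               alpha: float = 0.75, gamma: float = 2.0) -> torch.Tensor:
    """logits, targets: (batch,) tensors; targets are 1.0 for fraud, 0.0 for legitimate."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = targets * p + (1 - targets) * (1 - p)               # probability assigned to the true class
    alpha_t = targets * alpha + (1 - targets) * (1 - alpha)   # cost-sensitive class weighting
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()        # (1 - p_t)^gamma shrinks easy negatives
```

Recall at a fixed 1 percent false positive rate can then be read off the model's ROC curve (for example via sklearn.metrics.roc_curve) rather than from accuracy, which is uninformative at 0.5 percent fraud prevalence.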
Temporal dynamics are essential. Many systems enforce strict time windows and apply decay functions to avoid leakage and handle drift. A half-life of 7 to 30 days on neighbor contributions ensures the model adapts as attack patterns shift. Without temporal controls, aggregating future information during training inflates metrics, and production performance then collapses because those future signals are unavailable at inference time.
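A minimal sketch of that decay, assuming a configurable half-life in days (14 here, purely illustrative):

```python
def decay_weight(age_days: float, half_life_days: float = 14.0) -> float:
    """Weight applied to a neighbor interaction that happened age_days ago."""
    return 0.5 ** (age_days / half_life_days)  # 1.0 today, 0.5 after one half-life, 0.25 after two

print(decay_weight(0.0))   # 1.0
print(decay_weight(28.0))  # 0.25 with the 14-day half-life
```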
💡 Key Takeaways
•Each message passing layer aggregates a node's immediate neighbors (typically 10 to 25 sampled per hop); stacking K equals 2 or 3 layers gives a K-hop receptive field, with embeddings updated through learned transformations
•Heterogeneous GNNs apply different weight matrices per node type and edge type, then fuse results, because user to device relationships differ semantically from user to merchant relationships
•Embedding dimensions of 64 to 256 balance expressiveness against inference latency and memory footprint in production serving
•Severe class imbalance (fraud is 0.5 percent of volume) requires focal loss, cost-sensitive learning, and hard negative mining to achieve 90 percent recall at a 1 percent false positive rate
•Temporal decay with half-lives of 7 to 30 days prevents leakage and handles drift as attack patterns evolve, avoiding inflated training metrics that fail in production
📌 Examples
Attention mechanism example: A new device (2 days old) connecting to a user gets an attention weight of 0.8, while an established device (200 days old) gets 0.3, letting the model emphasize risky new relationships.
Hard negative mining: The model initially misclassifies high-value legitimate transactions from new accounts. Training focuses on these examples near the decision boundary to reduce false positives on good customers.
Temporal window enforcement: Training only aggregates edges from the 7 days before the transaction timestamp, matching production constraints and preventing future information leakage that would overestimate recall by 15 to 20 percent (a minimal window filter is sketched below).
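A minimal sketch of such a window filter, with an illustrative edge record layout and the 7-day window from the example above:

```python
from datetime import datetime, timedelta

def edges_visible_at(edges: list[dict], txn_time: datetime, window_days: int = 7) -> list[dict]:
    """Keep only edges from the window before the transaction, never after it,
    so training sees exactly what would be available at inference time."""
    start = txn_time - timedelta(days=window_days)
    return [e for e in edges if start <= e["timestamp"] < txn_time]
```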