
Implementation Details: Sampling, Caching, and Ensemble Fusion

Neighbor Sampling Strategies

Random uniform sampling is simplest but may miss important connections. Importance sampling weights neighbors by recency, transaction volume, or suspicion score, prioritizing the most informative nodes. Layer-wise sampling uses different strategies per hop: strict sampling in the first hop for speed, relaxed sampling in the second hop for coverage. The sampling budget (total neighbors fetched) directly trades latency against accuracy.
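As a minimal sketch of the importance-sampling idea, the snippet below draws up to `k` neighbors weighted by a blend of recency, volume, and suspicion. The field names (`recency`, `volume`, `suspicion`), the weight values, and the assumption that each signal is pre-normalized to [0, 1] are all illustrative, not from the original text:

```python
import random

def importance_sample(neighbors, k, recency_w=0.5, volume_w=0.3, suspicion_w=0.2):
    """Sample up to k neighbors, weighted by recency, volume, and suspicion.

    `neighbors` is a list of dicts with hypothetical keys 'recency',
    'volume', and 'suspicion', each assumed pre-normalized to [0, 1].
    """
    if len(neighbors) <= k:
        return list(neighbors)
    pool = [
        (n, recency_w * n["recency"] + volume_w * n["volume"] + suspicion_w * n["suspicion"])
        for n in neighbors
    ]
    # Weighted sampling without replacement via repeated weighted draws.
    chosen = []
    for _ in range(k):
        total = sum(w for _, w in pool)
        r = random.random() * total
        acc = 0.0
        for i, (n, w) in enumerate(pool):
            acc += w
            if acc >= r:
                chosen.append(n)
                pool.pop(i)
                break
    return chosen
```

Uniform sampling is the special case where all three weights are equal and every neighbor carries the same signal values.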

Implementation Pattern: Cache sampled neighborhoods for recently active nodes. High-activity users get sampled repeatedly; caching their neighborhoods (with 5-minute TTL) reduces graph database load by 60-80% during traffic spikes.
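The caching pattern above can be sketched as a small TTL cache in front of the graph store. The 5-minute TTL comes from the text; the `sampler` callable standing in for a real graph-database query is an assumption:

```python
import time

class NeighborhoodCache:
    """TTL cache for sampled neighborhoods of recently active nodes."""

    def __init__(self, sampler, ttl_seconds=300):
        self.sampler = sampler        # hypothetical graph-DB sampling call
        self.ttl = ttl_seconds        # 5-minute TTL per the text
        self._store = {}              # node_id -> (fetched_at, neighborhood)

    def get(self, node_id):
        now = time.time()
        entry = self._store.get(node_id)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]           # cache hit: no graph-DB load
        neighborhood = self.sampler(node_id)  # miss or stale: refetch
        self._store[node_id] = (now, neighborhood)
        return neighborhood
```

High-activity users hit the cached entry on repeated scoring requests, which is where the 60-80% load reduction during spikes comes from.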

Embedding Caches

Store pre-computed node embeddings in a low-latency cache (Redis, Memcached). At inference time, fetch the cached embedding if it is fresh enough; otherwise compute it on demand and update the cache. Two-tier caching: hot embeddings in memory (sub-millisecond), warm embeddings in Redis (2-5ms), cold nodes computed live (20-50ms).
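A sketch of the two-tier lookup, assuming an in-process dict as the hot tier and any object with `get`/`set` methods as the warm tier (a dict-backed stub stands in for Redis here; `compute_fn` is a placeholder for live GNN inference):

```python
class DictStore:
    """Dict-backed stub standing in for a Redis-like warm store."""
    def __init__(self):
        self._d = {}
    def get(self, key):
        return self._d.get(key)
    def set(self, key, value):
        self._d[key] = value

class TieredEmbeddingCache:
    """Hot in-memory tier, then warm store, then on-demand compute."""

    def __init__(self, warm_store, compute_fn, hot_capacity=10_000):
        self.hot = {}
        self.warm = warm_store
        self.compute_fn = compute_fn      # hypothetical live GNN inference
        self.hot_capacity = hot_capacity

    def get(self, node_id):
        if node_id in self.hot:           # sub-millisecond path
            return self.hot[node_id]
        emb = self.warm.get(node_id)      # ~2-5 ms path (Redis-like)
        if emb is None:
            emb = self.compute_fn(node_id)  # 20-50 ms cold path
            self.warm.set(node_id, emb)
        if len(self.hot) < self.hot_capacity:
            self.hot[node_id] = emb       # promote to hot tier
        return emb
```

A production version would add TTLs and an eviction policy (e.g. LRU) on the hot tier; this sketch only shows the tiered fallback order.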

Ensemble Fusion

Production systems rarely use GNNs alone. Combine GNN scores with point-wise model scores (XGBoost on transaction features), rule-based systems (velocity checks, blacklists), and reputation scores. Ensemble fusion options: weighted average, stacking (meta-model on component scores), or cascaded filtering (rules first, GNN for ambiguous cases).
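The simplest of the fusion options, a weighted average over component scores, can be sketched as follows. The component names and weight values are illustrative:

```python
def weighted_fusion(scores, weights):
    """Weighted average of component model scores, each in [0, 1].

    `scores` maps component name -> score; `weights` maps component
    name -> relative weight. Both mappings are illustrative.
    """
    total_w = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_w

# Example: GNN, point-wise XGBoost, and rule-engine scores fused together.
fused = weighted_fusion(
    {"gnn": 0.82, "xgboost": 0.60, "rules": 1.00},
    {"gnn": 0.5, "xgboost": 0.3, "rules": 0.2},
)
```

Stacking replaces the fixed weights with a meta-model trained on the component scores; cascaded filtering skips components entirely for clear cases.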

Production Insight: Cascaded architecture saves compute: cheap rules filter 80% of clear cases, GNN inference runs only on the remaining 20% ambiguous transactions. This reduces GNN serving costs by 5x while maintaining detection quality.
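The cascade can be sketched as a cheap rule score gating the expensive GNN call. The threshold values and the `rule_check`/`gnn_score` callables are assumptions for illustration:

```python
def cascaded_score(txn, rule_check, gnn_score, clear_low=0.05, clear_high=0.95):
    """Rules first; GNN inference only for ambiguous transactions.

    `rule_check` is a cheap scorer (velocity checks, blacklists) returning
    a value in [0, 1]; the clear-allow/clear-deny thresholds are illustrative.
    """
    s = rule_check(txn)
    if s <= clear_low or s >= clear_high:
        return s               # clear case (~80% of traffic): no GNN call
    return gnn_score(txn)      # ambiguous case (~20%): pay for GNN inference
```

If roughly 80% of transactions fall outside the ambiguous band, GNN serving volume drops to about one fifth, matching the 5x cost reduction cited above.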

Batch vs Real-time

Some fraud detection is not time-critical (account reviews, periodic sweeps). Run batch GNN inference overnight on the full graph without sampling constraints. Use batch results to update node risk scores that real-time systems consume. This hybrid approach balances thoroughness with latency requirements.
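A minimal sketch of the batch side of this hybrid: score every node on the full graph overnight and persist the results to a risk store that real-time scoring reads. `full_graph_score` (unconstrained GNN inference) and the dict-like `risk_store` are illustrative stand-ins:

```python
def nightly_batch_update(all_node_ids, full_graph_score, risk_store):
    """Overnight sweep: score each node on the full graph (no sampling
    budget) and persist the result for real-time consumers.

    `full_graph_score` stands in for unconstrained batch GNN inference;
    `risk_store` is any dict-like store the online path reads from.
    """
    for node_id in all_node_ids:
        risk_store[node_id] = full_graph_score(node_id)
```

The real-time path then treats the stored risk score as just another cheap feature, getting full-graph context without paying full-graph latency.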

💡 Key Takeaways
- Cache sampled neighborhoods for active nodes (5-minute TTL) to reduce graph database load by 60-80% during traffic spikes
- Two-tier embedding cache: hot in memory (sub-ms), warm in Redis (2-5ms), cold computed live (20-50ms)
- Cascaded architecture filters 80% with cheap rules, runs GNN only on 20% ambiguous cases, a 5x cost reduction
📌 Interview Tips
1. Explain importance sampling: weight neighbors by recency or suspicion score rather than random uniform selection
2. Mention batch overnight inference on the full graph without sampling, updating risk scores that real-time systems consume