
Failure Modes and Edge Cases in Differential Privacy

DP Failure Modes: Differential privacy implementations fail through budget exhaustion, sensitivity miscalculation, composition attacks, and auxiliary information attacks. Each failure can completely negate privacy guarantees while appearing to work correctly.

Budget Exhaustion

Every DP query consumes privacy budget. Run enough queries and the cumulative epsilon becomes so large that privacy is meaningless. Example: 1000 queries each with epsilon=0.01 sum to epsilon=10, which provides almost no privacy protection. Worse: an attacker who can submit queries directly can deliberately exhaust the budget. Mitigation: strict query limits, per-user rate limiting, budget reset policies, and monitoring for unusual query patterns.
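Under basic sequential composition, per-query epsilons simply add, so an enforced cap needs an accountant that refuses queries once the budget is spent. A minimal sketch of such a budget accountant (the class and method names here are hypothetical, not from any particular library):

```python
class BudgetAccountant:
    """Track cumulative epsilon under basic sequential composition."""

    def __init__(self, total_epsilon: float):
        self.total_epsilon = total_epsilon
        self.spent = 0.0

    def charge(self, epsilon: float) -> bool:
        """Record the spend and return True if the query fits in the budget.

        The tiny tolerance absorbs floating-point drift from repeated adds.
        """
        if self.spent + epsilon > self.total_epsilon + 1e-12:
            return False  # refuse: answering would exhaust the budget
        self.spent += epsilon
        return True


accountant = BudgetAccountant(total_epsilon=1.0)
# An attacker firing 1000 queries at epsilon=0.01 only gets 100 answered
# before the epsilon=1.0 budget runs out; the rest are refused.
answered = sum(accountant.charge(0.01) for _ in range(1000))
print(answered)  # 100
```

In production this accounting layer sits in front of the query interface, paired with per-user rate limits so a single client cannot drain the shared budget.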

Sensitivity Miscalculation

Noise must be calibrated to query sensitivity: how much one record can change the output. If sensitivity is underestimated, the noise is too small and the real epsilon is larger than the one claimed. Common mistakes: assuming bounded input ranges that are never actually enforced, ignoring that the same record can appear in multiple groups (join queries), or using average-case sensitivity where worst-case is required. Example: a salary sum with an assumed maximum of 200,000 USD. If someone earns 1,000,000 USD, the true sensitivity is 5x higher than assumed, so the mechanism delivers 5x less privacy than it reports.
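The standard fix is to clip inputs to the assumed bound rather than merely assuming it, so the stated sensitivity holds for every possible input. A sketch using the Laplace mechanism (the function name `dp_sum` is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sum(values, epsilon, clip_max):
    """Laplace mechanism for a sum of non-negative values.

    Clipping ENFORCES the bound, so the sensitivity used to scale the
    noise (clip_max) is genuinely the worst-case contribution of one record.
    """
    clipped = np.clip(values, 0, clip_max)
    noise = rng.laplace(loc=0.0, scale=clip_max / epsilon)
    return float(clipped.sum() + noise)

salaries = [90_000, 120_000, 1_000_000]  # one earner far above the assumed max
# Without clipping, scale = 200_000 / epsilon is 5x too little noise for this
# data. With clipping, the 1M salary contributes at most 200,000, at the cost
# of some bias in the estimate.
estimate = dp_sum(salaries, epsilon=1.0, clip_max=200_000)
```

Clipping trades a known, bounded bias (outliers are truncated) for a sensitivity guarantee that cannot be violated by unexpected inputs.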

Auxiliary Information Attacks

DP bounds what can be learned from the DP release itself; it says nothing about what an attacker already knows. If an attacker knows 99 of the 100 records in a group, a DP count of that group reveals the 100th record with high confidence despite the noise: the formal guarantee still holds, but all of the permitted leakage now concentrates on a single person. Combining DP releases with external data sources can therefore defeat privacy in practice even when the implementation is correct. Defense: minimize available auxiliary information through data minimization and access controls.
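This scenario is easy to simulate: the attacker subtracts what they already know from a noisy count and thresholds the remainder. A sketch (constants and function name are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)

KNOWN = 99  # attacker already knows 99 of the 100 group members are positives

def attack_success_rate(epsilon, trials=20_000):
    """Simulate inferring the unknown 100th record from a DP count.

    The count uses the Laplace mechanism with sensitivity 1; the attacker
    subtracts their side knowledge and rounds the residual to 0 or 1.
    """
    correct = 0
    for _ in range(trials):
        target_bit = int(rng.integers(0, 2))  # the one unknown record
        noisy_count = KNOWN + target_bit + rng.laplace(scale=1.0 / epsilon)
        guess = 1 if noisy_count - KNOWN > 0.5 else 0
        correct += (guess == target_bit)
    return correct / trials

# At epsilon=1 the attacker is right roughly 70% of the time (vs a 50%
# baseline); at epsilon=5, well above 90%. The noise only blurs the answer
# as much as epsilon allows.
print(attack_success_rate(1.0), attack_success_rate(5.0))
```

Note that neither run violates the DP guarantee; the point is that when auxiliary knowledge pins down every other record, even the permitted leakage is decisive.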

Testing Strategy: Validate DP implementations with adversarial testing: attempt membership inference, try to reconstruct individual records, test edge cases with extreme values. If attacks succeed on test data, production privacy is compromised.
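One concrete form of adversarial testing is an empirical check of the DP inequality itself: for neighboring databases, the probability of any output event may differ by at most a factor of e^epsilon. A Monte Carlo sketch (function names hypothetical; this is a probabilistic check with sampling slack, not a proof):

```python
import numpy as np

rng = np.random.default_rng(3)

def dp_sum(values, epsilon, sensitivity):
    """Laplace-mechanism sum; correct only if `sensitivity` truly bounds
    one record's contribution."""
    return sum(values) + rng.laplace(scale=sensitivity / epsilon)

def passes_dp_check(mech, db, neighbor, claimed_eps, threshold, trials=50_000):
    """Empirically test P[M(db) > t] <= e^eps * P[M(neighbor) > t], both ways.

    A clear violation means the claimed epsilon is wrong (e.g. the
    sensitivity was underestimated).
    """
    p = np.mean([mech(db) > threshold for _ in range(trials)])
    q = np.mean([mech(neighbor) > threshold for _ in range(trials)])
    bound = np.exp(claimed_eps) * 1.1  # 10% slack for sampling error
    return p <= bound * q and q <= bound * p

db, neighbor = [1] * 10, [1] * 11  # neighbors: differ by one record

# Correctly calibrated: values are bounded by 1, so sensitivity 1 is right.
ok = passes_dp_check(lambda d: dp_sum(d, 1.0, sensitivity=1.0),
                     db, neighbor, claimed_eps=1.0, threshold=10.5)

# Buggy: sensitivity underestimated 10x, so far too little noise is added.
bad = passes_dp_check(lambda d: dp_sum(d, 1.0, sensitivity=0.1),
                      db, neighbor, claimed_eps=1.0, threshold=10.5)
print(ok, bad)  # expected: True False
```

A check like this only probes one event at one threshold, so it can catch gross violations (the common failure modes above) but passing it does not certify the implementation.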

💡 Key Takeaways
- Budget exhaustion: 1000 queries at epsilon=0.01 sum to epsilon=10 (effectively no protection)
- Sensitivity miscalculation: under-calibrated noise provides less privacy than claimed
- Auxiliary information can circumvent DP guarantees in practice despite a correct implementation
📌 Interview Tips
1. Salary sum with an assumed 200K max: a 1M earner gets 5x less protection than claimed
2. An attacker knowing 99 of 100 records can infer the 100th despite DP noise