
Production System Architecture for Differential Privacy

DP System Architecture: Production differential privacy requires infrastructure for privacy budget tracking, noise calibration, query validation, and audit logging. A single misconfigured query can exhaust the entire privacy budget or leak raw data.

Privacy Budget Manager

The privacy budget manager is the central component that tracks cumulative epsilon spent across all queries. Each query specifies its epsilon cost; the manager deducts that cost from the remaining budget and rejects any query that would exceed the limit. Implementation: maintain per-dataset budget counters, log every query with its timestamp and epsilon cost, and alert when consumption approaches a threshold (e.g., 80% consumed). How the budget is defined is a policy choice: an annual reset, per-user budgets, or a lifetime budget. The manager enforces whatever policy is configured.
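A minimal sketch of the deduct-and-reject logic described above, assuming basic sequential composition (epsilon costs simply add up) and a single in-memory counter per dataset. The names `PrivacyBudgetManager`, `charge`, and `BudgetExceededError` are hypothetical, chosen for illustration; a real deployment would persist the counter and log durably and might use a tighter composition accountant.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


class BudgetExceededError(Exception):
    """Raised when a query would push spending past the configured budget."""


@dataclass
class PrivacyBudgetManager:
    """Tracks epsilon spent on one dataset (hypothetical, in-memory only)."""

    total_epsilon: float          # budget allowed under the configured policy
    alert_fraction: float = 0.8   # alert threshold, the 80% example from the text
    spent: float = 0.0
    log: list = field(default_factory=list)

    def charge(self, query_id: str, epsilon: float) -> bool:
        """Deduct epsilon for a query.

        Returns True when cumulative spending has reached the alert threshold.
        Assumes sequential composition: total cost is the sum of per-query costs.
        """
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total_epsilon:
            # Reject before touching data; nothing is deducted or logged as spent.
            raise BudgetExceededError(
                f"query {query_id!r} needs {epsilon}, "
                f"only {self.total_epsilon - self.spent} remains"
            )
        self.spent += epsilon
        # Log every accepted query with timestamp and epsilon cost.
        self.log.append((datetime.now(timezone.utc), query_id, epsilon))
        return self.spent >= self.alert_fraction * self.total_epsilon
```

A caller would create one manager per dataset, for example `PrivacyBudgetManager(total_epsilon=1.0)`, call `charge` before running each query, and raise an operational alert whenever it returns `True`. Budget resets (annual, per-user) would be layered on top as policy and are not shown here.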

Query Validation Layer

Before executing any query, validate that it satisfies DP constraints. Check that sensitivity is bounded (the query cannot return unbounded values), that the noise mechanism matches the query type (typically Laplace for counts and other queries with bounded L1 sensitivity, Gaussian for vector-valued queries with bounded L2 sensitivity), and that epsilon is within the allowed range for this query type. Reject malformed queries before they touch data. This layer prevents accidental privacy violations caused by analyst errors. Common pattern: provide a limited query API with pre-approved query templates rather than arbitrary SQL access.
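The checks above can be sketched as a lookup against pre-approved templates. Everything here is an assumption made for illustration: the `QueryTemplate` fields, the two example templates and their limits, and the `validate_query` function are hypothetical and not taken from any particular DP library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryTemplate:
    """A pre-approved query shape (hypothetical schema)."""

    name: str
    sensitivity: float   # max change in the result from one person's data
    mechanism: str       # "laplace" or "gaussian"
    max_epsilon: float   # largest epsilon allowed for this query type


# Illustrative templates only; names and limits are made up for this sketch.
TEMPLATES = {
    "count_users": QueryTemplate("count_users", 1.0, "laplace", 0.5),
    "sum_clipped_vectors": QueryTemplate("sum_clipped_vectors", 2.0, "gaussian", 1.0),
}


def validate_query(template_name: str, mechanism: str, epsilon: float) -> QueryTemplate:
    """Return the matching template, or raise ValueError before any data access."""
    template = TEMPLATES.get(template_name)
    if template is None:
        raise ValueError(f"unknown query template: {template_name!r}")
    # Sensitivity must be a finite positive bound.
    if not math.isfinite(template.sensitivity) or template.sensitivity <= 0:
        raise ValueError("template has unbounded or invalid sensitivity")
    # The requested mechanism must match the one approved for this query type.
    if mechanism != template.mechanism:
        raise ValueError(
            f"{template_name!r} requires {template.mechanism!r}, got {mechanism!r}"
        )
    # Epsilon must fall inside the allowed range for this query type.
    if not (0 < epsilon <= template.max_epsilon):
        raise ValueError(
            f"epsilon {epsilon} outside (0, {template.max_epsilon}] for {template_name!r}"
        )
    return template
```

Because analysts can only name a template rather than submit SQL, the sensitivity bound is fixed by whoever approved the template instead of being inferred from an arbitrary query.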

Noise Generation Service

A centralized service generates all noise from a cryptographically secure random source. Requirements: use a cryptographically secure RNG (not a general-purpose statistical PRNG, whose output can be predicted once its internal state is known), generate noise with the correct distribution (Laplace, Gaussian), and scale the noise to the query's sensitivity and epsilon. Never reuse noise across queries; each query needs fresh randomness. Audit trail: log the noise parameters (not the sampled values) for each query so that privacy accounting can be verified later.
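A minimal sketch of these requirements using only the Python standard library, where `secrets.SystemRandom` draws from the operating system's CSPRNG. The function names and the audit-record fields are hypothetical. The Laplace scale is sensitivity / epsilon, and the Gaussian standard deviation uses the classical calibration sigma = sensitivity * sqrt(2 ln(1.25 / delta)) / epsilon, which assumes 0 < epsilon < 1. This sketch does not defend against floating-point attacks on the sampled values, which hardened DP libraries handle separately.

```python
import math
import secrets

# SystemRandom uses os.urandom, so samples cannot be reproduced from a seed.
_rng = secrets.SystemRandom()


def laplace_noise(sensitivity: float, epsilon: float) -> float:
    """Draw one fresh Laplace(0, sensitivity / epsilon) sample."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return _rng.expovariate(1.0 / scale) - _rng.expovariate(1.0 / scale)


def gaussian_noise(l2_sensitivity: float, epsilon: float, delta: float) -> float:
    """Draw one fresh Gaussian sample using the classical (epsilon, delta) calibration."""
    if l2_sensitivity <= 0 or not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("need sensitivity > 0, 0 < epsilon < 1, 0 < delta < 1")
    sigma = l2_sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon
    return _rng.gauss(0.0, sigma)


def noisy_count(true_count: int, epsilon: float, audit_log: list) -> float:
    """Release a count (sensitivity 1) with Laplace noise and audit the parameters."""
    noise = laplace_noise(sensitivity=1.0, epsilon=epsilon)
    # Log parameters only; the sampled noise value is never recorded.
    audit_log.append({"mechanism": "laplace", "sensitivity": 1.0, "epsilon": epsilon})
    return true_count + noise
```

Each call samples new randomness, so two queries never share noise. Logging only the mechanism, sensitivity, and epsilon lets an auditor recompute the privacy cost without being able to subtract the noise from a released answer.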

Defense in Depth: Multiple layers prevent privacy breaches: query validation catches malformed requests, budget manager blocks excessive queries, audit logs enable post-hoc investigation. No single component failure should leak raw data.

💡 Key Takeaways
Budget manager tracks cumulative epsilon and rejects queries exceeding limits
Query validation layer rejects malformed queries before touching data
Use cryptographic RNG for noise, never reuse noise across queries
📌 Interview Tips
1. Alert when 80% of the privacy budget is consumed
2. Pre-approved query templates are safer than arbitrary SQL access