Allocating Privacy Budgets and Choosing Epsilon in Production
Privacy Budget Allocation: Choosing epsilon values and distributing budget across queries is a policy decision balancing privacy protection against analytical utility. There is no universally correct epsilon—the choice depends on data sensitivity, regulatory requirements, and acceptable accuracy loss.
Epsilon Guidelines
Practical epsilon ranges and their interpretations:
epsilon less than 0.1: Very strong privacy. Outputs are nearly independent of any individual record. High noise, significant accuracy loss. Reserve for highly sensitive data (medical, financial).
epsilon 0.1 to 1: Strong privacy. Any single record can change the probability of a given output by a factor of at most e^epsilon, roughly a 10% to 170% increase across this range. Standard for most privacy-sensitive applications.
epsilon 1 to 10: Moderate privacy. Provides plausible deniability, but a determined attacker may still succeed. Acceptable for less sensitive data or when utility requirements dominate.
epsilon greater than 10: Weak privacy. The guarantee is too loose to offer meaningful protection; such values mainly serve to demonstrate nominal DP compliance.
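The percentage figures for each range follow from the definition of epsilon-differential privacy, which caps the ratio of output probabilities on neighboring datasets at e^epsilon. A minimal sketch of that arithmetic (the function names are illustrative and do not come from any library):

```python
import math

def max_probability_ratio(epsilon: float) -> float:
    """Upper bound on P[output | record present] / P[output | record absent]."""
    return math.exp(epsilon)

def max_percent_increase(epsilon: float) -> float:
    """Largest relative increase in any output's probability, in percent."""
    return (math.exp(epsilon) - 1.0) * 100.0

# Print the bound at the boundaries of the ranges discussed above.
for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps}: ratio <= {max_probability_ratio(eps):.2f}, "
          f"increase <= {max_percent_increase(eps):.1f}%")
```

At epsilon = 0.1 the bound is about a 10.5% increase and at epsilon = 1 about 172%. At epsilon = 10 the ratio exceeds 22,000, which is why values above 10 offer little formal protection.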
Budget Distribution Strategies
Per-query allocation: Assign a fixed epsilon to each query type. Simple but inflexible, since every query gets the same budget regardless of importance.
Prioritized allocation: Critical queries get larger budgets; exploratory queries get smaller ones. Requires classifying queries upfront.
Adaptive allocation: Start with a small epsilon and increase it for queries that add significant value. Complex to implement but maximizes utility.
Per-user budgets: Each analyst has their own budget, which prevents one analyst from exhausting a shared budget.
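As a rough illustration, prioritized allocation and per-user budgets can be combined in a small accountant. This is a sketch under stated assumptions, not a production design: it assumes basic sequential composition (epsilons simply add), and the names `PerUserAccountant`, `BudgetExceeded`, and the tier labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict

class BudgetExceeded(Exception):
    """Raised when a query would push an analyst past their epsilon budget."""

class PerUserAccountant:
    """Tracks epsilon spent per analyst, assuming basic sequential composition."""

    def __init__(self, per_user_budget: float, tier_epsilon: Dict[str, float]):
        self.per_user_budget = per_user_budget
        self.tier_epsilon = tier_epsilon   # hypothetical tiers, e.g. critical / exploratory
        self.spent = defaultdict(float)    # analyst id -> epsilon consumed so far

    def remaining(self, user: str) -> float:
        return self.per_user_budget - self.spent[user]

    def charge(self, user: str, tier: str) -> float:
        """Reserve the epsilon for one query of the given tier and return it."""
        epsilon = self.tier_epsilon[tier]
        # Small tolerance so accumulated float error does not reject a valid query.
        if self.spent[user] + epsilon > self.per_user_budget + 1e-12:
            raise BudgetExceeded(f"{user} has {self.remaining(user):.3f} epsilon left")
        self.spent[user] += epsilon
        return epsilon

# Usage: critical queries cost more budget than exploratory ones.
acct = PerUserAccountant(2.0, {"critical": 1.0, "exploratory": 0.1})
acct.charge("alice", "critical")
acct.charge("alice", "exploratory")
print(round(acct.remaining("alice"), 3))
```

Per-user budgets bound what each analyst can learn alone. Under composition, the dataset's total privacy loss is still the sum of what all analysts spend, which matters if analysts can pool results.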
Renewal Policies
Privacy budget is consumed as queries are answered, so a policy must state whether and when it is replenished. Options:
Lifetime budget: Once exhausted, no more queries are allowed on that dataset. Strongest protection but limits long-term analysis.
Annual renewal: Budget resets each year. Assumes an adversary cannot combine query results across years, which may not hold.
Rolling window: Budget covers the last N days of queries. Balances ongoing analysis with finite exposure.
Choose based on threat model and business needs.
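The rolling-window option can be sketched as a ledger that forgets spends older than N days. This is a minimal illustration: the class name `RollingWindowBudget` is hypothetical, days are represented as plain integers, and calls are assumed to arrive in non-decreasing day order.

```python
from collections import deque

class RollingWindowBudget:
    """Caps total epsilon spent within the most recent `window_days` days."""

    def __init__(self, limit: float, window_days: int):
        self.limit = limit
        self.window_days = window_days
        self.ledger = deque()   # (day, epsilon) pairs, oldest first

    def spent(self, today: int) -> float:
        # Drop entries that have aged out of the window.
        while self.ledger and self.ledger[0][0] <= today - self.window_days:
            self.ledger.popleft()
        return sum(eps for _, eps in self.ledger)

    def try_spend(self, epsilon: float, today: int) -> bool:
        """Record the spend and return True if it fits in the window, else False."""
        if self.spent(today) + epsilon > self.limit + 1e-12:
            return False
        self.ledger.append((today, epsilon))
        return True
```

A window like this limits the rate of spending rather than the cumulative privacy loss on records that stay in the dataset, so it shares the cross-period caveat noted for annual renewal.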
Starting Point: For most production systems, start with epsilon = 1 per query and a total budget of epsilon = 10 per year, which allows about ten such queries under basic sequential composition. Monitor utility, then adjust based on actual accuracy requirements and tolerance for privacy incident risk.
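To make the starting point concrete, the sketch below estimates how many queries fit in the yearly budget and how much error each one carries. It assumes basic sequential composition and the Laplace mechanism on a counting query with sensitivity 1. The source does not name a mechanism, so treat that choice and the function names as assumptions.

```python
import math

def queries_per_period(total_epsilon: float, per_query_epsilon: float) -> int:
    """Number of queries that fit under basic sequential composition."""
    # Tiny offset guards against float error such as 0.3 / 0.1 != 3 exactly.
    return math.floor(total_epsilon / per_query_epsilon + 1e-9)

def laplace_error(epsilon: float, sensitivity: float = 1.0, confidence: float = 0.95):
    """Expected absolute error and a high-probability bound for Laplace noise."""
    scale = sensitivity / epsilon                       # Laplace scale b
    expected_abs = scale                                # E|X| = b for Laplace(0, b)
    bound = scale * math.log(1.0 / (1.0 - confidence))  # P(|X| > bound) = 1 - confidence
    return expected_abs, bound

n_queries = queries_per_period(total_epsilon=10.0, per_query_epsilon=1.0)
mean_err, err_95 = laplace_error(epsilon=1.0)
print(n_queries, round(mean_err, 2), round(err_95, 2))
```

Under these assumptions a count released at epsilon = 1 is off by 1 on average and by less than about 3 in 95% of releases. That is negligible for counts in the thousands but significant for small cells, which is the kind of utility signal worth monitoring before adjusting epsilon.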