Priority-Based Load Shedding: What to Drop First
Request Priority Classification
Not all requests have equal value. A payment completion is worth more than a recommendation refresh. Classify requests into priority tiers: Critical (authentication, payments, core transactions) should be shed only as a last resort. Important (user data reads, search) can be delayed but should complete. Optional (analytics, logging, background jobs) can be shed freely. Systems typically use three to five priority levels, each with its own shedding threshold.
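A minimal sketch of such a classifier in Python. The Priority enum and the endpoint-to-priority map here are illustrative assumptions, not a standard; real systems typically derive the mapping from routing configuration or service metadata:

```python
from enum import IntEnum

class Priority(IntEnum):
    CRITICAL = 0   # authentication, payments, core transactions
    IMPORTANT = 1  # user data reads, search
    OPTIONAL = 2   # analytics, logging, background jobs

# Hypothetical endpoint map for illustration only.
ENDPOINT_PRIORITY = {
    "/payments/complete": Priority.CRITICAL,
    "/search": Priority.IMPORTANT,
    "/analytics/events": Priority.OPTIONAL,
}

def classify(endpoint: str) -> Priority:
    # Default unknown endpoints to IMPORTANT rather than OPTIONAL,
    # so newly added routes are not silently shed.
    return ENDPOINT_PRIORITY.get(endpoint, Priority.IMPORTANT)
```

Using an IntEnum makes priorities comparable, which simplifies threshold checks downstream (a larger value means more sheddable).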
Implementation Strategies
Priority headers propagate through service calls: the API gateway assigns a priority based on endpoint and user tier (for example, X-Request-Priority: critical), and downstream services apply shedding rules against their local load. At 70% CPU, shed optional requests; at 85%, shed important requests; only at 95% should critical requests even be considered for shedding.
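The tiered thresholds above can be sketched as a single admission check. The Priority enum and the exact threshold constants mirror the text but are assumptions about one possible implementation:

```python
from enum import IntEnum

class Priority(IntEnum):
    CRITICAL = 0
    IMPORTANT = 1
    OPTIONAL = 2

def should_shed(priority: Priority, cpu_percent: float) -> bool:
    """Tiered shedding: higher CPU load sheds progressively
    higher-value traffic, critical last."""
    if cpu_percent >= 95:
        return True                          # even critical is at risk
    if cpu_percent >= 85:
        return priority >= Priority.IMPORTANT
    if cpu_percent >= 70:
        return priority >= Priority.OPTIONAL
    return False                             # under 70%: admit everything
```

A downstream service would call should_shed with the priority parsed from the X-Request-Priority header before doing any real work.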
User Tier Considerations
Paid users expect higher reliability. During overload, free-tier requests are shed once the system reaches 60% of capacity, while premium users retain access until 85%. A system at 10,000 RPS might allocate 7,000 RPS to premium users and 3,000 RPS to free users.
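A minimal sketch of tier-aware shedding, assuming the 60%/85% thresholds from the text; the tier names and threshold table are illustrative:

```python
# Hypothetical per-tier shedding thresholds (percent of capacity).
TIER_SHED_THRESHOLD = {
    "free": 60.0,     # free tier shed first
    "premium": 85.0,  # premium retains access longer
}

def tier_should_shed(user_tier: str, load_percent: float) -> bool:
    # Unknown tiers fall back to the free-tier threshold (conservative).
    threshold = TIER_SHED_THRESHOLD.get(user_tier, TIER_SHED_THRESHOLD["free"])
    return load_percent >= threshold
```

In practice this check would combine with the request-priority check: a free-tier analytics call is shed long before a premium payment.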
Probabilistic Shedding
Rather than hard cutoffs, probabilistic shedding rejects a percentage of eligible requests, ramping up linearly above a threshold. Formula: shedding_rate = (current_load - threshold) / (max_capacity - threshold). With a threshold of 80%, that gives 25% shedding at 85% load, 50% at 90%, and (95-80)/(100-80) = 75% at 95%.
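The formula above translates directly into code. This is a sketch; the rng parameter is an illustrative hook so the decision can be made deterministic in tests:

```python
import random

def shedding_rate(current_load: float,
                  threshold: float = 80.0,
                  max_capacity: float = 100.0) -> float:
    """Linear ramp: 0% at the threshold, 100% at max capacity."""
    if current_load <= threshold:
        return 0.0
    rate = (current_load - threshold) / (max_capacity - threshold)
    return min(1.0, rate)

def should_reject(current_load: float, rng=random.random) -> bool:
    # Each eligible request is rejected independently with
    # probability shedding_rate, so load drops smoothly rather
    # than falling off a cliff at a hard cutoff.
    return rng() < shedding_rate(current_load)
```

Only requests already deemed sheddable by the priority rules should pass through should_reject; critical traffic bypasses it entirely.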
Fairness in Shedding
Random shedding can penalize the same unlucky users repeatedly. A per-client token bucket ensures fair distribution: each client gets its own token allowance, requests consume tokens, and an empty bucket means rejection. This prevents one client from being repeatedly shed while another always succeeds.
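A self-contained sketch of a per-client token bucket; the capacity and refill rate are illustrative assumptions, and real deployments would also evict idle clients to bound memory:

```python
import time

class PerClientTokenBucket:
    """One bucket per client: every client refills at the same rate,
    so no single client is starved while another always succeeds."""

    def __init__(self, capacity=10.0, refill_per_sec=5.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = {}       # client_id -> remaining tokens
        self.last_refill = {}  # client_id -> timestamp of last refill

    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        if client_id not in self.tokens:
            # New clients start with a full bucket.
            self.tokens[client_id] = self.capacity
            self.last_refill[client_id] = now
        # Refill proportionally to elapsed time, capped at capacity.
        elapsed = now - self.last_refill[client_id]
        self.last_refill[client_id] = now
        self.tokens[client_id] = min(
            self.capacity,
            self.tokens[client_id] + elapsed * self.refill_per_sec,
        )
        if self.tokens[client_id] >= 1.0:
            self.tokens[client_id] -= 1.0
            return True
        return False  # empty bucket: shed this request
```

The optional now argument is a test hook; in production the bucket reads the monotonic clock itself.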