
Priority-Based Load Shedding: What to Drop First

Request Priority Classification

Not all requests have equal value. A payment completion is worth more than a recommendation refresh. Classify requests into priority tiers: Critical (authentication, payments, core transactions) should be shed only as a last resort. Important (user data reads, search) can be delayed but should complete. Optional (analytics, logging, background jobs) can be shed freely. Systems typically use 3-5 priority levels, each with its own shedding threshold.
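The three tiers above can be sketched as a small lookup from route prefix to priority. This is only an illustration: the route prefixes, the `classify` helper, and the choice to treat unknown routes as Important are assumptions, not something the text prescribes.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Lower value = more important; higher values are shed first."""
    CRITICAL = 0
    IMPORTANT = 1
    OPTIONAL = 2


# Hypothetical route prefixes, chosen only to illustrate the three tiers.
ROUTE_PRIORITIES = {
    "/auth": Priority.CRITICAL,
    "/payments": Priority.CRITICAL,
    "/users": Priority.IMPORTANT,
    "/search": Priority.IMPORTANT,
    "/analytics": Priority.OPTIONAL,
    "/recommendations": Priority.OPTIONAL,
}


def classify(path: str, default: Priority = Priority.IMPORTANT) -> Priority:
    """Return the priority of the first route prefix that matches the path."""
    for prefix, priority in ROUTE_PRIORITIES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return priority
    # Assumption: unknown routes fall into the middle tier.
    return default
```

For example, `classify("/payments/123/complete")` yields `Priority.CRITICAL`, while `classify("/recommendations/refresh")` yields `Priority.OPTIONAL`.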

Implementation Strategies

The API gateway assigns each request a priority based on endpoint and user tier and attaches it as a header, for example X-Request-Priority: critical. The header propagates through downstream service calls, and each service applies its own shedding rules: at 70% CPU, shed optional. At 85% CPU, shed important. Only at 95% CPU consider shedding critical.
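A downstream service could apply these thresholds with a check like the one below. It is a minimal sketch: the `should_shed` function, the lowercase priority values, and the fallback to "important" for a missing or unknown header are assumptions made for illustration, and the lookup expects the header name exactly as written.

```python
from typing import Mapping

PRIORITY_HEADER = "X-Request-Priority"

# CPU utilization (0.0-1.0) at which each priority starts being shed,
# using the thresholds quoted in the text.
SHED_THRESHOLDS = {"optional": 0.70, "important": 0.85, "critical": 0.95}


def should_shed(headers: Mapping[str, str], cpu_utilization: float) -> bool:
    """Return True if a request with these headers should be rejected."""
    # Assumption: a missing or unrecognized priority is treated as "important".
    priority = headers.get(PRIORITY_HEADER, "important").lower()
    threshold = SHED_THRESHOLDS.get(priority, SHED_THRESHOLDS["important"])
    return cpu_utilization >= threshold
```

At 72% CPU this rejects only optional requests; a critical request is still accepted at 90% and rejected only from 95% upward.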

💡 Key Insight: Priority shedding transforms binary accept/reject into a gradient. During moderate overload, only optional requests fail. Users experience degraded but functional service rather than complete outage.

User Tier Considerations

Paid users expect higher reliability. During overload, free-tier requests are shed once the system reaches 60% capacity, while premium users retain access until 85%. Capacity can also be reserved per tier: a system sized for 10,000 RPS might allocate 7,000 RPS to premium users and 3,000 RPS to free users.
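The two ideas in this section, per-tier shedding thresholds and per-tier capacity allocation, can be expressed as two independent checks. The sketch below uses the numbers from the text; the function names, the tier labels, and the fallback of unknown tiers to "free" are assumptions.

```python
# Utilization (0.0-1.0) at which each tier starts being shed.
TIER_SHED_THRESHOLDS = {"free": 0.60, "premium": 0.85}

# Requests per second reserved for each tier out of 10,000 RPS total.
TIER_RPS_QUOTA = {"premium": 7_000, "free": 3_000}


def tier_is_shed(tier: str, utilization: float) -> bool:
    """Return True once utilization reaches the tier's shedding threshold."""
    # Assumption: unknown tiers are handled like the free tier.
    threshold = TIER_SHED_THRESHOLDS.get(tier, TIER_SHED_THRESHOLDS["free"])
    return utilization >= threshold


def within_quota(tier: str, current_tier_rps: int) -> bool:
    """Return True if the tier can take one more request this second."""
    quota = TIER_RPS_QUOTA.get(tier, TIER_RPS_QUOTA["free"])
    return current_tier_rps < quota
```

At 65% utilization, `tier_is_shed("free", 0.65)` is true while `tier_is_shed("premium", 0.65)` is false, so premium traffic continues to be served.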

Probabilistic Shedding

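Rather than hard cutoffs, probabilistic shedding rejects a percentage of eligible requests that grows with load. A stepped schedule might reject 10% of optional requests at 80% capacity and 50% at 90%. A linear ramp is an alternative: shedding_rate = (current_load - threshold) / (max_capacity - threshold), clamped to the range 0 to 1. With a threshold of 80% and a max capacity of 100%, a load of 95% sheds (95-80)/(100-80) = 75% of eligible requests; the ramp starts at 0% when load equals the threshold.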
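The formula translates directly into code. In this sketch the clamp to the range 0 to 1, the default threshold of 80 and capacity of 100, and the `shed_probabilistically` helper with its injectable `rng` are assumptions added for illustration.

```python
import random
from typing import Callable


def shedding_rate(current_load: float, threshold: float, max_capacity: float) -> float:
    """Fraction of eligible requests to reject, as a linear ramp."""
    if max_capacity <= threshold:
        raise ValueError("max_capacity must be greater than threshold")
    rate = (current_load - threshold) / (max_capacity - threshold)
    # Clamp: shed nothing below the threshold, everything above max capacity.
    return min(1.0, max(0.0, rate))


def shed_probabilistically(
    current_load: float,
    threshold: float = 80.0,
    max_capacity: float = 100.0,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Reject this request with probability equal to the shedding rate."""
    return rng() < shedding_rate(current_load, threshold, max_capacity)


print(shedding_rate(95, 80, 100))  # 0.75, matching the worked example
print(shedding_rate(90, 80, 100))  # 0.5
```

Passing `rng` explicitly makes the decision deterministic in tests; in production the default `random.random` spreads rejections across requests.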

Fairness in Shedding

Random shedding can repeatedly penalize the same unlucky users. A per-client token bucket distributes capacity more fairly: each client gets its own tokens, each request consumes a token, and an empty bucket means rejection. This prevents one client from being repeatedly shed while another always succeeds.
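A per-client token bucket can be sketched as follows. The class names, the default capacity of 5 tokens, the refill rate of 1 token per second, and the injectable `clock` are assumptions for illustration; the text does not specify bucket sizes or refill behavior.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """A single bucket that refills continuously up to a fixed capacity."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity  # start full
        self.updated = now

    def try_consume(self, now: float, cost: float = 1.0) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # empty bucket: reject the request


class PerClientLimiter:
    """Keeps one TokenBucket per client ID."""

    def __init__(
        self,
        capacity: float = 5.0,
        refill_per_sec: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_per_sec, now)
            self.buckets[client_id] = bucket
        return bucket.try_consume(now)
```

Because each client draws from its own bucket, a burst from one client exhausts only that client's tokens. This sketch never evicts idle buckets, so a long-running service would also need some expiry policy.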

💡 Key Takeaways
Classify requests into 3-5 priority tiers: critical (payments), important (reads), optional (analytics)
Graduate shedding thresholds: shed optional at 70%, important at 85%, critical at 95% CPU
Probabilistic shedding creates smoother degradation than hard cutoffs
📌 Interview Tips
1. In interviews, explain how you would prioritize: checkout flow is critical, product recommendations are optional
2. Mention user tier differentiation: paid users get higher shedding thresholds than free users
3. Show the formula: shedding_rate = (current_load - threshold) / (max_capacity - threshold)