Adaptive Load Shedding: Self-Tuning Systems

The Problem with Static Thresholds
Static thresholds such as "shed at 80% CPU" assume that system characteristics stay constant. But capacity varies with workload mix, time of day, and code changes. A threshold tuned for read-heavy traffic fails under write-heavy traffic, so static thresholds need constant manual adjustment.
Little's Law Based Adaptation
Little's Law states: L = λ × W, where L is the number of requests in the system, λ is the arrival rate, and W is the average time each request spends in the system. Rearranged: λ = L / W, that is, capacity = concurrent_requests / latency. A 100ms target latency with 500 concurrent requests gives 500 / 0.1s = 5000 RPS capacity. When measured latency exceeds the target, shedding activates.
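The arithmetic above can be sketched in a few lines of Python. The function names and the millisecond units are illustrative assumptions, not part of any particular library; the sketch only shows how the capacity estimate falls as measured latency rises at a fixed concurrency.

```python
def littles_law_capacity(concurrent_requests: int, latency_ms: float) -> float:
    """Throughput in requests/second from Little's Law: lambda = L / W."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    # Convert milliseconds to seconds by multiplying the numerator by 1000.
    return concurrent_requests * 1000 / latency_ms


def should_shed(measured_latency_ms: float, target_latency_ms: float) -> bool:
    """Shedding activates once measured latency exceeds the target."""
    return measured_latency_ms > target_latency_ms


# The example from the text: 500 concurrent requests at the 100ms target.
capacity_at_target = littles_law_capacity(500, 100)   # 5000.0 RPS

# Same concurrency, but latency has degraded to 250ms.
capacity_degraded = littles_law_capacity(500, 250)    # 2000.0 RPS
shed_now = should_shed(250, 100)                      # True
```

With concurrency held at 500, a latency rise from 100ms to 250ms cuts the estimated capacity from 5000 to 2000 RPS, which is the signal to start shedding.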
💡 Key Insight: Adaptive shedding uses observed behavior rather than predicted capacity. The system discovers its limits through continuous measurement, automatically adjusting for workload changes.
CoDel: Controlled Delay Algorithm
CoDel tracks queue sojourn time (how long each request waits in the queue) against a small target such as 5ms. If sojourn time stays above the target for a full interval (typically 100ms), CoDel starts dropping. While the delay stays high, drops come closer together: each next drop is scheduled interval/sqrt(count) later, so the gaps shrink from 100ms to about 70ms, then about 58ms, and so on. This sheds increasingly aggressively during sustained overload while ignoring brief bursts.
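A simplified sketch of this control loop, adapted for a request queue, might look like the following. It is not the full CoDel algorithm from the networking literature: the class name, the `should_drop` method, and the choice to pass timestamps in explicitly are all assumptions made to keep the example small and deterministic.

```python
import math


class CoDelShedder:
    """Simplified CoDel-style dropper (illustrative, not the full algorithm)."""

    def __init__(self, target_ms: float = 5.0, interval_ms: float = 100.0):
        self.target_ms = target_ms
        self.interval_ms = interval_ms
        self.first_above_ms = None  # deadline by which the delay must recover
        self.dropping = False
        self.drop_count = 0
        self.next_drop_ms = 0.0

    def should_drop(self, sojourn_ms: float, now_ms: float) -> bool:
        """Call once per dequeued request; returns True if it should be shed."""
        if sojourn_ms < self.target_ms:
            # Delay is back under target: forget the episode entirely.
            self.first_above_ms = None
            self.dropping = False
            return False
        if self.first_above_ms is None:
            # First request above target: give the queue one interval to recover.
            self.first_above_ms = now_ms + self.interval_ms
            return False
        if not self.dropping:
            if now_ms < self.first_above_ms:
                return False  # still within the grace interval (a brief burst)
            # Delay stayed above target for a full interval: start dropping.
            self.dropping = True
            self.drop_count = 1
            self.next_drop_ms = now_ms + self.interval_ms / math.sqrt(self.drop_count)
            return True
        if now_ms >= self.next_drop_ms:
            # Sustained overload: schedule the next drop sooner each time.
            self.drop_count += 1
            self.next_drop_ms += self.interval_ms / math.sqrt(self.drop_count)
            return True
        return False


# Sustained overload: one request every 10ms, each having waited 20ms.
shedder = CoDelShedder()
drop_times = [t for t in range(0, 401, 10) if shedder.should_drop(20.0, t)]

# Brief burst: delay is high for only 50ms, then recovers, so nothing is dropped.
burst = CoDelShedder()
burst_drops = [t for t in range(0, 60, 10) if burst.should_drop(20.0, t)]
```

In the sustained case nothing is dropped during the first 100ms, after which the spacing between drops tightens. In the burst case the delay recovers before the interval elapses, so `burst_drops` stays empty.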
AIMD: Additive Increase Multiplicative Decrease
Borrowed from TCP congestion control. Start by accepting all requests. On success, increase capacity additively: capacity += 10 RPS. On an overload signal, cut it in half: capacity *= 0.5. The asymmetry ensures a fast retreat from overload while exploring capacity increases cautiously.
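The update rule can be sketched as a small limiter class. The class name, the starting limit, and the floor and ceiling values are assumptions for illustration; in practice the increase is usually applied once per measurement window rather than once per request.

```python
class AIMDLimiter:
    """Additive-increase / multiplicative-decrease rate limit (illustrative)."""

    def __init__(
        self,
        initial_rps: float = 1000.0,   # assumed starting limit
        increase_rps: float = 10.0,    # additive step from the text
        decrease_factor: float = 0.5,  # multiplicative cut from the text
        min_rps: float = 1.0,          # assumed floor so the limit never hits zero
        max_rps: float = 100_000.0,    # assumed ceiling
    ):
        self.limit_rps = initial_rps
        self.increase_rps = increase_rps
        self.decrease_factor = decrease_factor
        self.min_rps = min_rps
        self.max_rps = max_rps

    def on_success(self) -> None:
        """A healthy window: probe upward by a fixed step."""
        self.limit_rps = min(self.max_rps, self.limit_rps + self.increase_rps)

    def on_overload(self) -> None:
        """An overload signal: cut the limit sharply."""
        self.limit_rps = max(self.min_rps, self.limit_rps * self.decrease_factor)


limiter = AIMDLimiter(initial_rps=1000.0)
limiter.on_overload()            # 1000 -> 500
after_cut = limiter.limit_rps
for _ in range(3):
    limiter.on_success()         # 500 -> 510 -> 520 -> 530
after_recovery = limiter.limit_rps
```

A single overload signal removes 500 RPS of admitted load, while winning it back at 10 RPS per step takes 50 healthy windows. That asymmetry is the point of the pattern.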
Gradient Based Adaptation
Compute the rate of change of key metrics. Latency rising at 5ms/second predicts overload before thresholds trigger. Formula: shedding_rate = max(0, gradient × sensitivity). This sheds proactively, before overload fully manifests.
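As a minimal sketch of the formula, the code below estimates the gradient from two latency samples and converts it to a shedding probability. The function names, the two-point slope, the sensitivity value, and the extra clamp to 1.0 (so the result is usable as a probability) are all assumptions that go beyond the formula in the text.

```python
from typing import List, Tuple


def latency_gradient(samples: List[Tuple[float, float]]) -> float:
    """Slope in ms/second between the first and last (time_s, latency_ms) samples."""
    (t0, l0), (t1, l1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time window")
    return (l1 - l0) / (t1 - t0)


def shedding_rate(gradient_ms_per_s: float, sensitivity: float) -> float:
    """max(0, gradient * sensitivity), additionally clamped to 1.0 (an assumption)."""
    return min(1.0, max(0.0, gradient_ms_per_s * sensitivity))


# Latency climbs from 100ms to 110ms over 2 seconds: a gradient of 5 ms/s.
rising = latency_gradient([(0.0, 100.0), (2.0, 110.0)])
rate_rising = shedding_rate(rising, sensitivity=0.02)    # roughly 10% of requests

# Latency is falling, so the gradient is negative and nothing is shed.
falling = latency_gradient([(0.0, 110.0), (2.0, 104.0)])
rate_falling = shedding_rate(falling, sensitivity=0.02)
```

A two-point slope is noisy on real traffic; smoothing the samples first (for example with a moving average) is a common refinement, though the text does not prescribe one.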

💡 Key Takeaways

✓ Static thresholds fail when workload characteristics change; adaptive systems self-tune

✓ Little's Law (L = λ × W) enables computing real-time capacity from measured latency

✓ AIMD pattern: slow increase (additive), fast decrease (multiplicative) on overload

📌 Interview Tips

1. Explain how Little's Law applies: capacity = concurrent_requests / latency, and show the math

2. CoDel is interview gold: it shows knowledge of network congestion control applied to services

3. Mention gradient-based adaptation as proactive shedding before thresholds trigger
