
Adaptive Load Shedding: Self-Tuning Systems

The Problem with Static Thresholds

Static thresholds like "shed at 80% CPU" assume constant system characteristics. But capacity varies with workload mix, time of day, and code changes. A threshold tuned for read-heavy traffic fails under write-heavy traffic, so static thresholds require constant manual adjustment.

Little's Law-Based Adaptation

Little's Law states: L = λ × W (L = requests in the system, λ = arrival rate, W = time per request). Rearranged: capacity (λ) = concurrent_requests / latency. A 100ms latency target with 500 concurrent requests gives a capacity of 5000 RPS. When measured latency exceeds the target, shedding activates.
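As a minimal sketch of this idea (class and constant names are illustrative, not from any library), a shedder can track in-flight requests and a smoothed latency estimate, derive capacity from Little's Law, and shed once measured latency overshoots the target:

```python
class LittlesLawShedder:
    """Sketch: derive capacity from Little's Law (lambda = L / W)."""

    def __init__(self, target_latency_s=0.100):
        self.target_latency_s = target_latency_s
        self.in_flight = 0                     # L: requests currently in the system
        self.avg_latency_s = target_latency_s  # W: smoothed observed latency

    def record(self, latency_s):
        # Exponentially weighted moving average smooths noisy samples.
        self.avg_latency_s = 0.9 * self.avg_latency_s + 0.1 * latency_s

    def capacity_rps(self):
        # Little's Law rearranged: throughput = concurrency / latency.
        return self.in_flight / max(self.avg_latency_s, 1e-6)

    def should_shed(self):
        # Shed once measured latency exceeds the target.
        return self.avg_latency_s > self.target_latency_s
```

With 500 in-flight requests and a measured latency of 100ms, `capacity_rps()` returns the 5000 RPS from the worked example above.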

💡 Key Insight: Adaptive shedding uses observed behavior rather than predicted capacity. The system discovers its limits through continuous measurement, automatically adjusting for workload changes.

CoDel: Controlled Delay Algorithm

CoDel tracks queue sojourn time (how long each request waits in the queue). The target might be 5ms. If sojourn time stays above the target for a full interval (e.g. 100ms), CoDel starts dropping. Drops then come faster and faster: the first after 100ms, the next after roughly 70ms, following interval/sqrt(count). This yields aggressive shedding during sustained overload while ignoring brief bursts.
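A simplified sketch of the CoDel control law (the full algorithm in RFC 8289 has more state; constants and names here are illustrative):

```python
import math

class CoDelQueue:
    """Simplified CoDel-style dropper: drop when delay persists, not on bursts."""
    TARGET_MS = 5.0      # acceptable sojourn time
    INTERVAL_MS = 100.0  # how long delay must persist before dropping starts

    def __init__(self):
        self.first_above_time = None  # when sojourn first exceeded target
        self.dropping = False
        self.drop_next = 0.0
        self.count = 0

    def on_dequeue(self, now_ms, sojourn_ms):
        """Return True if this request should be dropped."""
        if sojourn_ms < self.TARGET_MS:
            # Delay recovered: leave the drop state entirely.
            self.first_above_time = None
            self.dropping = False
            return False
        if self.first_above_time is None:
            self.first_above_time = now_ms  # start the persistence clock
            return False
        if not self.dropping and now_ms - self.first_above_time >= self.INTERVAL_MS:
            # Overload sustained for a full interval: begin dropping.
            self.dropping = True
            self.count = 1
            self.drop_next = now_ms + self.INTERVAL_MS / math.sqrt(self.count)
            return True
        if self.dropping and now_ms >= self.drop_next:
            # Each successive drop comes sooner: interval / sqrt(count).
            self.count += 1
            self.drop_next = now_ms + self.INTERVAL_MS / math.sqrt(self.count)
            return True
        return False
```

Note the shape of the schedule: the second drop arrives 100/√2 ≈ 70.7ms after the first, matching the 100ms-then-70ms progression described above.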

AIMD: Additive Increase, Multiplicative Decrease

Borrowed from TCP congestion control. Start by accepting all requests. On success, increase the limit additively: capacity += 10 RPS. On an overload signal, cut it in half: capacity *= 0.5. The asymmetry ensures fast backoff under overload while cautiously probing for spare capacity.
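The AIMD rule fits in a few lines; the step size and floor below are illustrative choices, not prescribed values:

```python
class AIMDLimiter:
    """Sketch of an AIMD rate limit: add on success, halve on overload."""

    def __init__(self, initial_rps=1000.0):
        self.limit_rps = initial_rps

    def on_success(self):
        # Additive increase: cautiously probe for spare capacity.
        self.limit_rps += 10.0

    def on_overload(self):
        # Multiplicative decrease: back off fast, with a floor of 1 RPS.
        self.limit_rps = max(self.limit_rps * 0.5, 1.0)
```

One overload signal undoes many successful probes (1010 → 505 here), which is exactly the asymmetry that keeps the limiter from oscillating hard against the capacity ceiling.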

Gradient Based Adaptation

Compute the rate of change of key metrics. Latency increasing at 5ms/second predicts overload before absolute thresholds trigger. Formula: shedding_rate = max(0, gradient × sensitivity). This sheds proactively, before overload fully manifests.
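A sketch of gradient-based shedding, applying the formula above to successive latency samples (the sensitivity value is an illustrative assumption):

```python
class GradientShedder:
    """Sketch: shed in proportion to the latency trend, not its absolute value."""

    def __init__(self, sensitivity=0.02):
        self.sensitivity = sensitivity  # shed fraction per (ms/s) of latency growth
        self.prev_latency_ms = None
        self.prev_time_s = None

    def update(self, now_s, latency_ms):
        """Return a shed probability in [0, 1] from the latency gradient."""
        if self.prev_latency_ms is None:
            self.prev_latency_ms, self.prev_time_s = latency_ms, now_s
            return 0.0
        gradient = (latency_ms - self.prev_latency_ms) / (now_s - self.prev_time_s)
        self.prev_latency_ms, self.prev_time_s = latency_ms, now_s
        # shedding_rate = max(0, gradient * sensitivity), capped at 1.
        return min(1.0, max(0.0, gradient * self.sensitivity))
```

With sensitivity 0.02, a latency climbing at 5ms/s yields a 10% shed rate while the absolute latency may still look healthy; a flat or falling trend sheds nothing.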

💡 Key Takeaways
- Static thresholds fail when workload characteristics change; adaptive systems self-tune
- Little's Law (L = λ × W) enables computing real-time capacity from measured latency
- AIMD pattern: slow increase (additive), fast decrease (multiplicative) on overload
📌 Interview Tips
1. Explain the Little's Law application: capacity = concurrent_requests / latency, and show the math
2. CoDel is interview gold: it shows knowledge of network congestion control applied to services
3. Mention gradient-based adaptation as proactive shedding before thresholds trigger