Adaptive Load Shedding: Self-Tuning Systems
The Problem with Static Thresholds
Static thresholds such as "shed at 80% CPU" assume that system characteristics stay constant. In practice, capacity varies with workload mix, time of day, and code changes. A threshold tuned for read-heavy traffic can fail under write-heavy traffic, so static thresholds need repeated manual adjustment.
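Little's Law-Based Adaptation
Little's Law states: L = λ × W, where L is the average number of requests in the system, λ is the arrival rate, and W is the average time each request spends in the system. Rearranged for throughput: λ = L / W, that is, capacity = concurrent_requests / latency. With a 100ms latency target and 500 concurrent requests, capacity is 500 / 0.1s = 5000 RPS. When measured latency exceeds the target, the same concurrency sustains fewer requests per second, so shedding activates.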
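The arithmetic above can be sketched in a few lines of Python. The function names and the choice of seconds as the latency unit are assumptions made for illustration; the source only gives the formula and the 100ms / 500 requests example.

```python
def capacity_rps(concurrent_requests: int, latency_s: float) -> float:
    """Little's Law rearranged for throughput: lambda = L / W."""
    if latency_s <= 0:
        raise ValueError("latency must be positive")
    return concurrent_requests / latency_s


def should_shed(measured_latency_s: float, target_latency_s: float) -> bool:
    """Shed when measured latency exceeds the target (hypothetical policy hook)."""
    return measured_latency_s > target_latency_s


# Example from the text: 500 concurrent requests at a 100ms target.
target_capacity = capacity_rps(500, 0.100)   # about 5000 requests per second

# If latency degrades to 125ms at the same concurrency, the sustainable
# rate falls to about 4000 RPS, and the shedding check trips.
degraded_capacity = capacity_rps(500, 0.125)
shedding_now = should_shed(0.125, 0.100)
```

In practice the measured latency would come from a moving average or a percentile rather than a single sample, so that one slow request does not toggle shedding.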
CoDel: Controlled Delay Algorithm
CoDel tracks queue sojourn time: how long each request waits in the queue. The target might be 5ms. If sojourn time stays above the target for a full interval (100ms here), CoDel starts dropping. Drop frequency then increases: the next drop comes 100ms after the first, the one after that about 70ms later, and each subsequent gap is interval/sqrt(count). This sheds aggressively during sustained overload while ignoring brief bursts.
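A simplified sketch of this control law is shown below. It is not a complete CoDel implementation: the class name, the method signature, and the decision to pass the clock in as an argument are all assumptions made for illustration, and the reference algorithm carries extra state (for example, how the drop count is restored when dropping resumes) that is omitted here.

```python
import math


class CoDelShedder:
    """Simplified CoDel-style drop decision for a request queue (illustrative)."""

    def __init__(self, target_s: float = 0.005, interval_s: float = 0.100):
        self.target_s = target_s          # acceptable sojourn time (5ms)
        self.interval_s = interval_s      # how long delay must persist (100ms)
        self.first_above_time = None      # deadline after which delay counts as sustained
        self.dropping = False
        self.drop_count = 0
        self.next_drop_time = 0.0

    def _schedule_next(self, now: float) -> None:
        # Gap between drops shrinks as interval / sqrt(count).
        self.next_drop_time = now + self.interval_s / math.sqrt(self.drop_count)

    def should_drop(self, sojourn_s: float, now: float) -> bool:
        """Return True if the request dequeued at time `now` should be shed."""
        if sojourn_s < self.target_s:
            # Queue delay is back under target: leave the dropping state.
            self.first_above_time = None
            self.dropping = False
            return False

        if not self.dropping:
            if self.first_above_time is None:
                # First sample above target: start the interval timer.
                self.first_above_time = now + self.interval_s
                return False
            if now < self.first_above_time:
                return False  # brief burst so far, keep serving
            # Above target for a full interval: start dropping.
            self.dropping = True
            self.drop_count = 1
            self._schedule_next(now)
            return True

        if now >= self.next_drop_time:
            self.drop_count += 1
            self._schedule_next(now)
            return True
        return False
```

A caller would measure each request's sojourn time at dequeue and pass a monotonic clock reading, for example `shedder.should_drop(waited, time.monotonic())`. Taking the clock as a parameter keeps the sketch deterministic and easy to test.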
AIMD: Additive Increase Multiplicative Decrease
Borrowed from TCP congestion control. Start by accepting all requests. On success, increase capacity additively: capacity += 10 RPS. On an overload signal, cut it in half: capacity *= 0.5. The asymmetry backs the system out of overload quickly while probing for additional capacity cautiously.
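A minimal sketch of the AIMD update rule follows. The class name, the finite starting limit, and the floor and ceiling parameters are assumptions added for illustration; the source specifies only the +10 RPS and ×0.5 steps.

```python
class AimdLimiter:
    """Additive-increase, multiplicative-decrease rate limit (illustrative)."""

    def __init__(
        self,
        initial_rps: float = 1000.0,     # assumed starting limit
        increase_rps: float = 10.0,      # additive step from the text
        decrease_factor: float = 0.5,    # multiplicative cut from the text
        min_rps: float = 10.0,           # assumed floor so the limit never reaches zero
        max_rps: float = float("inf"),   # assumed optional ceiling
    ):
        self.capacity_rps = initial_rps
        self.increase_rps = increase_rps
        self.decrease_factor = decrease_factor
        self.min_rps = min_rps
        self.max_rps = max_rps

    def on_success(self) -> float:
        """Healthy evaluation window: probe upward by a fixed step."""
        self.capacity_rps = min(self.max_rps, self.capacity_rps + self.increase_rps)
        return self.capacity_rps

    def on_overload(self) -> float:
        """Overload signal (timeouts, latency breach): cut the limit sharply."""
        self.capacity_rps = max(self.min_rps, self.capacity_rps * self.decrease_factor)
        return self.capacity_rps
```

Starting from 1000 RPS, one healthy window raises the limit to 1010 and a single overload signal then cuts it to 505. Climbing back to 1010 takes about 50 healthy windows, which is the asymmetry the paragraph describes. The updates are assumed to run once per evaluation window rather than once per request.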
Gradient-Based Adaptation
Compute the rate of change of key metrics. Latency rising by 5ms per second predicts overload before any threshold triggers. Formula: shedding_rate = max(0, gradient × sensitivity). This sheds load proactively, before overload fully manifests.
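The formula can be sketched as follows. The endpoint-slope gradient estimate, the function names, and the upper clamp at 1.0 are assumptions added for illustration; the source gives only the max(0, gradient × sensitivity) form, and a production system would likely smooth the samples first.

```python
from typing import Sequence, Tuple


def latency_gradient(samples: Sequence[Tuple[float, float]]) -> float:
    """Estimate latency change in ms per second.

    `samples` is a time-ordered sequence of (timestamp_s, latency_ms) pairs.
    This uses the slope between the first and last sample, an assumed
    simplification; a regression or smoothed estimate would be less noisy.
    """
    if len(samples) < 2:
        return 0.0
    (t0, l0), (t1, l1) = samples[0], samples[-1]
    if t1 <= t0:
        return 0.0
    return (l1 - l0) / (t1 - t0)


def shedding_rate(gradient: float, sensitivity: float) -> float:
    """Fraction of requests to shed: max(0, gradient * sensitivity).

    The result is also capped at 1.0 (an added assumption) so that it can be
    used directly as a drop probability.
    """
    return min(1.0, max(0.0, gradient * sensitivity))


# Latency climbing from 100ms to 110ms over two seconds: 5 ms/s.
window = [(0.0, 100.0), (1.0, 105.0), (2.0, 110.0)]
rate = shedding_rate(latency_gradient(window), sensitivity=0.02)  # about 0.1
```

With a hypothetical sensitivity of 0.02, a 5ms/second rise sheds roughly 10% of requests, while flat or falling latency sheds nothing because the max(0, ...) term discards negative gradients.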