
Failure Modes: Hot Keys, Clock Skew, and State Explosions

How Window-Based Rate Limiting Fails

Understanding failure modes helps you choose the right algorithm and set appropriate limits. Window-based approaches have specific vulnerabilities that emerge under production load, malicious traffic, or system misconfiguration. These are not theoretical: they cause real outages.

Key Explosion Attack

An attacker rotates through millions of unique identifiers (IPs, usernames, API keys). If you allocate a counter per key, memory grows linearly with the number of keys the attacker controls: 1 million keys × 16 bytes = 16 MB just for rate-limit state. With a sliding log storing timestamps, it is much worse: 1 million keys × 1,000 requests × 8 bytes = 8 GB. Mitigation: use approximate data structures (count-min sketch, Bloom filters) or hierarchical limits (a global limit catches key explosion even if per-key tracking fails).
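As a sketch of the approximate-counter mitigation, here is a minimal count-min sketch in Python (the width, depth, and hash scheme are illustrative choices, not production tuning). Memory stays fixed no matter how many keys an attacker mints, at the cost of possible overestimates:

```python
import hashlib

class CountMinSketch:
    """Fixed-memory approximate counters: width * depth integers
    total, regardless of how many unique keys are seen. Estimates
    can overcount (hash collisions) but never undercount, so a
    limiter built on top still fails closed."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, key):
        # One independent hash per row, derived by prefixing the row index.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def increment(self, key):
        for row, col in self._cells(key):
            self.table[row][col] += 1

    def estimate(self, key):
        # Minimum across rows is the tightest (least-collided) count.
        return min(self.table[row][col] for row, col in self._cells(key))

# A rotating attacker inflates total traffic, not per-key memory:
sketch = CountMinSketch()
for _ in range(5):
    sketch.increment("10.0.0.1")
assert sketch.estimate("10.0.0.1") >= 5  # may overestimate, never under
```

Because estimates only err upward, an over-limit decision based on the sketch never lets a genuinely over-limit key through.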

Clock Skew Issues

Fixed window boundaries depend on server time. If Server A thinks it is 2:00:00 and Server B thinks it is 1:59:58, they disagree about which window a request belongs to: a user who hits both servers gets double the intended rate. NTP typically keeps servers within 10 ms, but edge cases (network partitions, VM migration, leap seconds) can cause larger drift. Mitigation: use a central time source (e.g., the Redis server's clock via the TIME command) or accept some imprecision.
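A minimal sketch of why skew splits a request across windows (the per-server offsets are hypothetical, chosen to straddle a minute boundary):

```python
WINDOW = 60  # fixed-window size in seconds

def window_id(clock_reading: float) -> int:
    """Fixed-window bucket index as seen by one server's clock."""
    return int(clock_reading // WINDOW)

true_time = 119.99             # request arrives just before a minute boundary
skew_a, skew_b = +0.02, -0.02  # hypothetical per-server clock offsets

# The same request is assigned to different windows by each server,
# so a user hitting both gets two fresh quotas at the boundary.
assert window_id(true_time + skew_a) == 2  # Server A reads 120.01
assert window_id(true_time + skew_b) == 1  # Server B reads 119.97
```

Even 40 ms of relative skew is enough here; the closer traffic clusters to window boundaries, the more requests land in the disagreement zone.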

Hot Key Problem

One viral API key or popular user generates 1,000x normal traffic. All requests for that key hit the same rate limit counter, creating a hot spot in Redis. At 50,000 requests/sec to one key, you are issuing 50,000 INCR operations per second against a single Redis key, saturating that instance's single-threaded command loop. Mitigation: shard the key across multiple Redis instances, use local caching with periodic sync, or implement adaptive throttling that rejects hot keys earlier in the pipeline.
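One way to sketch the sharding mitigation in Python (the shard count and key naming are illustrative; in Redis, each sub-key would hash to a different slot or instance, writes would be INCRs on a random shard, and the limit check would sum the shards):

```python
import random

SHARDS = 8  # hypothetical: split one hot counter across 8 sub-keys

def shard_key(key: str) -> str:
    """Pick a random sub-counter so increments spread across shards."""
    return f"{key}:shard:{random.randrange(SHARDS)}"

class ShardedCounter:
    """In-memory stand-in for the Redis pattern: increment one
    random shard per request, sum all shards to check the limit."""

    def __init__(self):
        self.counts = {}

    def incr(self, key: str) -> None:
        sk = shard_key(key)
        self.counts[sk] = self.counts.get(sk, 0) + 1

    def total(self, key: str) -> int:
        prefix = f"{key}:shard:"
        return sum(v for k, v in self.counts.items() if k.startswith(prefix))

c = ShardedCounter()
for _ in range(1000):
    c.incr("viral-api-key")
assert c.total("viral-api-key") == 1000  # exact total, spread over 8 counters
```

The trade-off: writes scale out by the shard count, but reads now touch every shard, so this suits write-heavy hot keys with comparatively infrequent limit checks.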

State Explosion from Long Windows

A sliding log enforcing a 1,000 requests/day limit over a 24-hour window must retain up to 1,000 timestamps per user. With 100,000 users, that is potentially 100,000 × 1,000 × 8 bytes = 800 MB of rate-limit state. Mitigation: use a hybrid approach, pairing a fine-grained sliding window for the short term (1 minute) with a fixed window for the long term (daily), combining fairness with manageable memory.
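A minimal sketch of that hybrid, assuming a 60-second sliding log plus a single daily fixed-window counter (the class name and limits are illustrative): the daily side keeps one integer per user instead of up to 1,000 timestamps.

```python
import time
from collections import deque

class HybridLimiter:
    """Hypothetical hybrid limiter: a sliding log over the last
    minute (fair, bounded by the per-minute limit) plus a single
    fixed daily counter (coarse, O(1) memory per user)."""

    def __init__(self, per_minute=60, per_day=1000):
        self.per_minute = per_minute
        self.per_day = per_day
        self.recent = deque()   # timestamps within the last 60 s
        self.day = None         # current day bucket
        self.day_count = 0      # one int, not 1,000 timestamps

    def allow(self, now=None):
        now = time.time() if now is None else now
        day = int(now // 86400)
        if day != self.day:     # daily fixed window rolled over
            self.day, self.day_count = day, 0
        while self.recent and self.recent[0] <= now - 60:
            self.recent.popleft()  # expire minute-old entries
        if len(self.recent) >= self.per_minute or self.day_count >= self.per_day:
            return False
        self.recent.append(now)
        self.day_count += 1
        return True

rl = HybridLimiter(per_minute=2, per_day=3)
t = 1_000_000.0
assert rl.allow(t) and rl.allow(t)  # minute budget used
assert not rl.allow(t)              # sliding window rejects the 3rd
assert rl.allow(t + 61)             # minute window slid; daily count now 3
assert not rl.allow(t + 62)         # daily fixed window exhausted
```

The short window preserves fairness at burst timescales, where the fixed-window boundary spike matters; the daily counter only needs coarse accuracy, so its boundary imprecision is acceptable.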

💡 Key Takeaways
Hot key cardinality: 20 million unique IPs × 60 buckets × 2 bytes = 2.4 GB of raw counters; with hash-table overhead, 5 to 6 GB. Botnet attacks can exhaust memory, causing fail-open behavior that lets the attack through.
Clock skew of 100 milliseconds across distributed nodes causes bucket misalignment: a request at a boundary is counted in different buckets by different enforcers, leading to aggregate errors of 10 to 20 percent in practice.
Boundary spike in fixed windows: a client sends N requests at t = 59.9 seconds and N more at t = 60.1 seconds. The server sees 2N requests in 200 milliseconds, but each window shows only N, within limit. This can briefly double the effective rate.
Late events in stream windows: with a 5-minute tumbling window and 30 seconds of allowed lateness, events more than 30 seconds late are dropped (undercounting), while events within the lateness bound trigger retractions, requiring idempotent downstream consumers.
State loss on rebalance: a client consumes 50 percent of its quota, the node crashes, and the counter is lost. The client resets to 0 and consumes another full quota. Without durable state, enforcement is best-effort only.
Thundering herd on refill: token buckets or fixed windows with synchronized resets cause all clients to send requests simultaneously, so the backend sees step-function load. Mitigate with jittered client retries and staggered server-side refills.
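The boundary-spike takeaway above can be checked numerically. In this sketch, a fixed-window limiter admits all 2N requests of a boundary-straddling burst because each half lands in a different window (window size and limit are illustrative):

```python
WINDOW = 60   # fixed-window size in seconds
LIMIT = 100   # N requests allowed per window

def requests_admitted(request_times, limit=LIMIT):
    """Count how many timestamped requests a fixed-window limiter
    admits, bucketing by floor(t / WINDOW)."""
    counts = {}
    admitted = 0
    for t in request_times:
        w = int(t // WINDOW)
        if counts.get(w, 0) < limit:
            counts[w] = counts.get(w, 0) + 1
            admitted += 1
    return admitted

# N requests at t = 59.9 s plus N more at t = 60.1 s:
burst = [59.9] * LIMIT + [60.1] * LIMIT
assert requests_admitted(burst) == 2 * LIMIT  # 2N admitted in 0.2 s
```

A sliding window over the same 60 seconds would see all 2N requests in one span and reject the second half, which is exactly the fairness trade-off the hybrid mitigation preserves.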
📌 Interview Tips
1. AWS WAF scaling challenge: track 50 million unique IP addresses over 5-minute sliding windows. Even with 10-second buckets (30 buckets), 50,000,000 × 30 × 2 bytes = 3 GB of raw counters per edge location. You must use approximate counters or offload to distributed storage.
2. Clock skew scenario: Node A and Node B enforce per-user limits, and Node A's clock is 120 milliseconds ahead. For a request at true time 59.95 seconds, Node A assigns it to window 1 (t = 60.07 on its clock) while Node B assigns it to window 0. The aggregator sees inconsistent counts and may allow or deny incorrectly.
3. Credential stuffing attack: an attacker tries 10 million username/password pairs. A sliding log stores 10 failed attempts per username over 10 minutes: 10,000,000 × 10 × 8 bytes = 800 MB just for rate-limit state. If attackers use 100 million usernames, that becomes 8 GB, and overhead pushes past memory limits.