Rate LimitingFixed vs Sliding WindowMedium⏱️ ~3 min

Sliding Window Counter: Trading Memory for Fairness

The Boundary Problem Solution

Fixed windows have a gap at boundaries: 200 requests can sneak through in 2 seconds when windows change. Sliding window counters fix this by maintaining a rolling view of time rather than discrete buckets. The key insight is that you can approximate a sliding window using two adjacent fixed windows with weighted counting, getting most of the benefits without storing individual timestamps.

The Analog Clock Analogy

Think of a clock with a moving 60 minute hand. At any moment, you count all requests from the past 60 minutes, not from the top of the hour. At 2:37, you count from 1:37 to 2:37. At 2:38, you count from 1:38 to 2:38. The window is always exactly 60 minutes, but its boundaries continuously shift forward with time.

Weighted Approximation Algorithm

Keep two fixed window counters: current and previous. When a request arrives at 2:37:30 (37.5 minutes into the hour): previous window count is 70, current window count is 40, time elapsed in current window is 37.5 minutes, remaining weight from previous is (60 - 37.5) / 60 = 0.375. Estimated count is 40 + (70 × 0.375) = 66.25. Compare this against your limit (say 100). This approximation assumes uniform distribution of requests within each window.

Why Approximation is Good Enough

The weighted approximation can be off by at most 1 window worth of requests in edge cases where traffic is extremely non uniform. In practice, with typical traffic patterns, error is under 10%. This is acceptable for rate limiting where the goal is protection, not perfect accounting. For billing where exact counts matter, use sliding log instead.

Implementation Trade-offs

Memory: O(2) per user (two counters instead of one). Processing: O(1) per request (read two counters, compute weighted sum). Compared to fixed window, you double the storage but eliminate the boundary burst problem. Compared to sliding log (storing every timestamp), you use dramatically less memory while achieving similar fairness.

💡 Key Takeaways
Memory scales with buckets times keys. For 10 million keys, 60 one second buckets at 2 bytes each equals 1.2 GB raw counters, typically 2 to 3 GB with overhead. Compare to 160 MB for fixed windows.
Approximation error bounded by bucket width times peak rate. With 5 second buckets and 100 req/sec bursts, error can reach plus or minus 500 requests. Finer buckets cost more memory but improve accuracy.
AWS WAF uses sliding 5 minute windows per IP with 1,000 to 2,000 request thresholds, enforcing at edge locations with subsecond latency. This prevents boundary gaming that affects fixed windows.
Update complexity is O(1) per request: increment current bucket and running sum. Bucket rotation is amortized O(1) as you drop one bucket and zero it when time advances.
Distributed deployment requires clock synchronization. Skew of 100 milliseconds can misalign buckets across nodes, causing divergent counts. Use epoch aligned boundaries and enforce at threshold times 0.95 as a guard band.
Hot key cardinality explosions are a real failure mode. A botnet with 20 million unique IPs times 60 buckets times 2 bytes equals 2.4 GB just for counters, risking memory exhaustion without approximation techniques.
📌 Interview Tips
1AWS WAF configuration: window = 300 seconds (5 minutes), bucket = 5 seconds, 60 buckets per IP. Threshold = 2,000 requests. Client at IP 203.0.113.45 sends 1,950 requests spread over 5 minutes (allowed), then bursts 100 in one second. Rolling sum = 2,050, triggers mitigation.
2Implementation: circular buffer of 60 counters, head pointer, running_sum. On event at t: bucket_index = (t / bucket_width) mod 60. If bucket_index != head, advance head (running_sum -= buckets[head], buckets[head] = 0, head = bucket_index). Then buckets[bucket_index] += 1, running_sum += 1.
3Memory sizing for Reddit scale: assume 500,000 active API clients, 60 second window, 1 second buckets. 500,000 × 60 × 2 bytes = 60 MB counters. With hash table overhead factor of 2, total approximately 120 MB, manageable on single node.
← Back to Fixed vs Sliding Window Overview