Rate Limiting Algorithms: Fixed Window, Sliding Window, and Token Bucket
Algorithm Choice in Distributed Context
The algorithms from single server rate limiting all work in distributed settings, but their coordination costs differ significantly. Choosing the wrong algorithm can add 50% latency overhead or allow 2x your intended rate.
Fixed Window: Cheapest, Least Accurate
One Redis key per user per window: INCR user:123:minute:202401151430 with TTL. One atomic operation, 0.5 ms round trip. Problem: boundary bursts. User sends 100 at 14:29:59 and 100 at 14:30:01, getting 200 in 2 seconds. Good for: daily quotas, billing limits, where exact window boundaries are acceptable.
Sliding Window Counter: Balanced Trade-off
Keep two fixed window counters (current and previous) and compute weighted average. Two Redis reads per request (can be pipelined). Formula: count = current + (previous × overlap_percentage). Cost: 1 ms. Eliminates boundary burst problem. Good for: most API rate limiting where fairness matters.
Token Bucket: Best for Burst Control
Store last refill time and current token count. Each request calculates tokens refilled since last check, subtracts cost, updates atomically via Lua script. Cost: 0.5 to 1 ms. Provides fine grained burst control. Good for: protecting fragile backends, API gateway rate limiting, burst capacity control.
Sliding Log: Most Accurate, Most Expensive
Store every request timestamp in a Redis sorted set. Each request: ZREMRANGEBYSCORE, ZCARD, ZADD. Cost: 1 to 3 ms plus O(n) memory. Perfect accuracy but high cost. Good for: billing, compliance, low rate limits where accuracy justifies expense.