Sliding Log: Exact Timestamps for Precise Counting
A sliding log rate limiter stores the exact timestamp of every request for each key in a sorted data structure (deque, sorted set, or time series). On each new request, you evict all timestamps older than your window (current_time minus window_size), then allow the request only if the remaining count is below the limit. This gives you perfect accuracy over a true rolling window with zero approximation error.
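As a concrete illustration, here is a minimal in-memory sketch in Python, assuming a single process and a per-key deque of timestamps. The class name SlidingLogLimiter and its allow method are illustrative, not a reference implementation.

```python
import time
from collections import defaultdict, deque


class SlidingLogLimiter:
    """Minimal in-memory sliding log rate limiter (illustrative sketch)."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.logs = defaultdict(deque)  # key -> deque of request timestamps

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        log = self.logs[key]
        # Evict every timestamp that has fallen outside the rolling window.
        cutoff = now - self.window
        while log and log[0] <= cutoff:
            log.popleft()
        # Admit the request only if the exact in-window count is below the limit.
        if len(log) < self.limit:
            log.append(now)
            return True
        return False
```

Because the deque is kept in timestamp order, eviction is a sequence of pops from the left and the count is just the deque length, which is what makes the per-request work proportional to the number of expired entries.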
The cost is memory proportional to the request rate times the window size per key. At 1,000 requests per second for a single key over a 60-second window, you must retain 60,000 timestamps. With 8-byte timestamps, that is 480 KB per key. Scale to 10,000 active keys and you need 4.8 GB just for timestamps. At 10,000 requests per second (a hot API key or popular user), a 5-minute window requires 3 million timestamps per key, or 24 MB per key, making this approach untenable at high scale.
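That arithmetic reduces to a simple worst-case formula; the hypothetical helper below reproduces the figures above, using the same 8-byte timestamp assumption as the text and ignoring the container overhead a real deque or sorted set adds on top.

```python
def sliding_log_bytes(req_per_sec, window_seconds, active_keys=1, timestamp_bytes=8):
    """Worst-case timestamp storage for a sliding log limiter.

    Ignores per-entry container overhead, which in practice adds a
    significant multiplier on top of the raw timestamp bytes.
    """
    return req_per_sec * window_seconds * timestamp_bytes * active_keys


print(sliding_log_bytes(1_000, 60))                       # 480000      (~480 KB per key)
print(sliding_log_bytes(1_000, 60, active_keys=10_000))   # 4800000000  (~4.8 GB)
print(sliding_log_bytes(10_000, 300))                     # 24000000    (~24 MB per key)
```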
This technique is practical only for low-rate, high-value scenarios. Authentication endpoints checking failed login attempts might allow 10 attempts per 10 minutes per account. Even with millions of accounts, only a tiny fraction are actively logging in at any moment, and 10 timestamps times 8 bytes equals just 80 bytes per active account. You get exact fairness and zero boundary gaming, which matters when the cost of a false positive (locking out a legitimate user) or false negative (allowing a brute force attack) is high.
Distributed enforcement is challenging because you must either centralize state (single point of failure, network round trip on every request) or replicate logs and merge them (complex, high bandwidth). Most production systems avoid sliding log at scale, preferring bucketed sliding counters or token buckets.
💡 Key Takeaways
• Memory is O(rate × window_size) per key. At 1,000 req/sec over 60 seconds, 60,000 timestamps at 8 bytes each equals 480 KB per key. Compare to 120 bytes for a bucketed sliding counter with 60 buckets.
• Perfect accuracy with zero approximation error. Every request is counted exactly within the rolling window, eliminating boundary spikes and bucket granularity issues.
• Practical only for low-rate scenarios like authentication (10 login attempts per 10 minutes per user). With 100,000 active login sessions, 10 timestamps times 8 bytes times 100,000 equals 8 MB, which is manageable.
• Eviction cost is O(expired_count) per request. Under uniform traffic this amortizes well, but bursty traffic causes spikes. Use sorted sets or deques with efficient range deletion.
• Distributed deployment requires either centralized state (single point of failure, added latency) or complex log replication and merging. Most production systems avoid this by using approximations.
• High-cardinality keys cause state explosion. A credential stuffing attack with 1 million attempted usernames times 10 failed attempts times 8 bytes equals 80 MB. That is sustainable, but scaling this approach to millions of req/sec is not feasible.
📌 Examples
Authentication rate limiting: allow 5 failed login attempts per 300 seconds per username. Store timestamps [1699100105, 1699100147, 1699100203, 1699100298, 1699100310] for user alice. At t=1699100400, evict timestamps < 1699100100 (none qualify, since the oldest entry is 1699100105), count = 5, so deny the next attempt.
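Running that trace through the illustrative SlidingLogLimiter sketched earlier, with the same limit of 5 and 300-second window, gives the same outcome:

```python
limiter = SlidingLogLimiter(limit=5, window_seconds=300)

for ts in [1699100105, 1699100147, 1699100203, 1699100298, 1699100310]:
    assert limiter.allow("alice", now=ts)   # first five attempts are admitted

# Sixth attempt at t=1699100400: the cutoff is 1699100100, so all five
# earlier timestamps are still inside the window and the attempt is denied.
assert not limiter.allow("alice", now=1699100400)
```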
Redis sorted set implementation: ZADD ratelimit:alice 1699100105 req1, ZADD ratelimit:alice 1699100147 req2, etc. On new request at t: ZREMRANGEBYSCORE ratelimit:alice 0 (t - 300). Then ZCARD ratelimit:alice. If count < 5, ZADD and allow; else deny.
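A sketch of that sequence using the redis-py client might look like the following. The function name allow_attempt, the UUID members standing in for req1/req2, and the EXPIRE on idle keys are assumptions for illustration, not part of the example above.

```python
import time
import uuid

import redis  # redis-py client, assumed available

r = redis.Redis()

LIMIT = 5      # max failed attempts per window
WINDOW = 300   # window size in seconds


def allow_attempt(username, now=None):
    """Sliding log check backed by a Redis sorted set (illustrative sketch)."""
    now = time.time() if now is None else now
    key = f"ratelimit:{username}"

    pipe = r.pipeline()
    # Drop every timestamp that has aged out of the rolling window.
    pipe.zremrangebyscore(key, 0, now - WINDOW)
    # Count what remains inside the window.
    pipe.zcard(key)
    _, count = pipe.execute()

    if count >= LIMIT:
        return False

    # Record this attempt; a unique member keeps same-second requests distinct.
    r.zadd(key, {str(uuid.uuid4()): now})
    r.expire(key, WINDOW)  # let idle keys clean themselves up
    return True
```

Note that the count check and the subsequent ZADD happen in separate round trips, so two concurrent requests for the same key can both observe count = 4 and both be allowed; the usual remedy is to run the evict/count/add sequence as a single atomic Lua script via EVAL.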
Memory explosion at scale: an API key at 10,000 req/sec with a 5-minute window. 10,000 × 300 = 3,000,000 timestamps × 8 bytes = 24 MB per key. With 1,000 hot keys, that is 24 GB just for rate limit state, and Redis overhead pushes it past 50 GB.