Token Bucket Algorithm: Core Mechanics and Burst Control
The Arcade Analogy
Imagine an arcade where game tokens drip into your personal cup at a fixed rate, say one token per second. The cup holds at most 20 tokens (the bucket capacity), and any token that arrives while it is full is lost. If you have saved up 15 tokens and want to play 10 games quickly, you can. But if you have zero tokens and a game costs one, you wait until the next token drips in. This simple rule balances immediate responsiveness with sustained limits.
Why This Matters for APIs
Token bucket solves a critical problem: how do you let users send bursts of requests when needed (for example, loading a dashboard that fires 50 API calls) while protecting servers from sustained overload? The rule is: burst up to your saved tokens, then slow to the refill rate. With a rate of 100/sec and a capacity of 200, a user who has been idle for 30 seconds holds a full bucket of 200 tokens, because accumulation stops at capacity. That user can immediately send 200 requests, then sustain 100/sec afterward. The result is responsive yet safe.
The Two Parameters
Every token bucket has exactly two settings. First, refill rate (r): tokens per second. Second, bucket capacity (b): maximum saved tokens. With r=100, b=500, an idle user accumulates up to 500 tokens. They can immediately burst 500 requests, then sustain 100/sec as new tokens arrive. These two numbers completely define traffic shaping behavior.
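The two parameters map directly onto a small amount of state. The following is a minimal sketch rather than a production limiter: the class name TokenBucket and the method allow are illustrative names, the bucket is assumed to start full (as for an idle user), and the caller passes the current time explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Illustrative token bucket; `rate` is r (tokens/sec), `capacity` is b."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # assumption: bucket starts full
        self.last = 0.0         # time of the most recent refill, in seconds

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Credit tokens for the elapsed time, never exceeding capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        # Admit the request only if enough tokens are saved up.
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=100, capacity=500)
burst = sum(bucket.allow(now=0.0) for _ in range(600))  # attempts while idle-full
later = sum(bucket.allow(now=1.0) for _ in range(600))  # attempts one second on
print(burst, later)  # 500 100
```

With r=100 and b=500, the first 500 of 600 back-to-back attempts are admitted, and one second later exactly 100 more tokens have arrived.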
Why Token Bucket Beats Simple Counting
Naive approach: allow at most 100 requests per second and reset the counter every second. Problem: a user sends 100 requests at 0.9s and another 100 at 1.1s. That is 200 requests in 0.2 seconds, yet each one-second window sees only 100, so the user stays "within limits" on both sides of the boundary. Token bucket avoids this. With r=100 and b=100, consuming all 100 tokens at 0.9s leaves only 20 tokens available at 1.1s (0.2 seconds times 100 tokens/sec refill), so only 20 of the second 100 requests get through. Burst is limited by actual token availability, not arbitrary window boundaries.
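The boundary problem can be reproduced in a few lines. This is a sketch under stated assumptions: both function names are hypothetical, the token bucket uses rate 100/sec with capacity 100 and starts full, and timestamps are integer milliseconds so the arithmetic is exact.

```python
def fixed_window_allowed(timestamps_ms, limit=100, window_ms=1000):
    """Count requests admitted by a counter that resets at each window boundary."""
    counts = {}
    allowed = 0
    for t in timestamps_ms:
        key = t // window_ms  # index of the window this request falls in
        if counts.get(key, 0) < limit:
            counts[key] = counts.get(key, 0) + 1
            allowed += 1
    return allowed


def token_bucket_allowed(timestamps_ms, rate=100, capacity=100):
    """Count requests admitted by a token bucket that starts full at t=0."""
    tokens, last, allowed = float(capacity), 0, 0
    for t in timestamps_ms:
        # Refill for elapsed milliseconds, capped at capacity.
        tokens = min(capacity, tokens + (t - last) * rate / 1000)
        last = t
        if tokens >= 1:
            tokens -= 1
            allowed += 1
    return allowed


# 100 requests at 0.9 s, then 100 more at 1.1 s
traffic = [900] * 100 + [1100] * 100
print(fixed_window_allowed(traffic))  # 200
print(token_bucket_allowed(traffic))  # 120
```

The fixed window admits all 200 requests because they straddle the reset, while the token bucket admits the first 100 plus the 20 tokens refilled in the intervening 200 ms.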
Mathematical Guarantee
Over any time interval of T seconds, token bucket allows at most (r times T + b) requests. Rate 100/sec with capacity 500 allows at most 1,500 requests in any 10 second window: 100 times 10 plus 500 equals 1,500 maximum, of which 1,000 come from sustained refill and 500 from the saved burst. This predictable worst case makes capacity planning straightforward.
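The bound r times T + b can be checked against a worst-case client. The sketch below assumes a greedy client that starts with a full bucket and takes every whole token the moment it exists. The function name max_admitted is illustrative, and the bucket level is tracked in thousandths of a token so the arithmetic stays in integers.

```python
def max_admitted(rate: int, capacity: int, window_s: int) -> int:
    """Requests a greedy client gets through in a window of window_s seconds."""
    level = capacity * 1000  # thousandths of a token; bucket starts full
    admitted = 0
    for ms in range(window_s * 1000 + 1):  # every millisecond in [0, T]
        if ms > 0:
            # rate tokens/sec equals `rate` thousandths of a token per ms
            level = min(capacity * 1000, level + rate)
        whole = level // 1000  # whole tokens available right now
        admitted += whole
        level -= whole * 1000
    return admitted


r, b, T = 100, 500, 10
print(max_admitted(r, b, T))  # 1500
print(r * T + b)              # 1500
```

For r=100, b=500 and T=10 the greedy client reaches the bound exactly: 500 requests from the saved burst plus 1,000 from refill.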