Per-User Rate Limiting: Identity-Based Quota Enforcement
Why Identity Matters
Per-user limiting is the most common approach for authenticated APIs, and generally the fairest. Each user gets a single allocation (say, 1,000 requests/hour) regardless of which IP address they connect from or how many devices they use. Requests from the same user's phone, laptop, and tablet all draw on one shared budget. This is what users expect when they pay for an API plan.
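The shared budget can be illustrated with a minimal in-memory sketch. The class name FixedWindowLimiter, the fixed-window algorithm, and the limit of 3 in the usage note are assumptions chosen for illustration; a production limiter would keep its counters in a shared store and expire old windows.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Illustrative fixed-window counter keyed only by user ID."""

    def __init__(self, limit: int, window_seconds: int = 3600):
        self.limit = limit
        self.window_seconds = window_seconds
        # (user_id, window index) -> requests seen in that window.
        # Old windows are never evicted here; a real store would expire them.
        self.counts = defaultdict(int)

    def allow(self, user_id: str, now: float) -> bool:
        """Return True and count the request if the user is under the limit."""
        window = int(now // self.window_seconds)
        key = (user_id, window)
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True
```

Because neither the IP address nor the device appears in the key, a call such as limiter.allow("u1", now) consumes the same budget no matter where the request came from. With limit=3, the fourth request from "u1" inside one hour is rejected while "u2" is unaffected.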
Key Design: User ID Extraction
The rate-limiting key is the user identifier supplied by the authentication layer. Common sources: the JWT sub claim, an API key lookup, or the session's user ID. Extraction must happen after authentication, once the token or key has been verified, so per-user limits apply only to authenticated endpoints. Key format: ratelimit:user:{user_id}:{window}, where {window} identifies the time window being counted. If the counters are sharded across multiple Redis instances, use consistent hashing to map users to shards.
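A minimal sketch of key construction and shard selection follows. The helper names (user_id_from_claims, rate_limit_key, HashRing), the one-hour window, the shard names, and the choice of 100 virtual nodes per shard are all assumptions for illustration. The claims dictionary is assumed to come from a token that the authentication layer has already verified.

```python
import bisect
import hashlib


def user_id_from_claims(claims: dict) -> str:
    """Read the user ID from already-verified JWT claims (the 'sub' claim)."""
    sub = claims.get("sub")
    if not sub:
        raise ValueError("verified claims carry no 'sub'; cannot rate limit per user")
    return str(sub)


def rate_limit_key(user_id: str, now: float, window_seconds: int = 3600) -> str:
    """Build ratelimit:user:{user_id}:{window} using the window's index."""
    window = int(now // window_seconds)
    return f"ratelimit:user:{user_id}:{window}"


def _hash(value: str) -> int:
    # First 8 bytes of SHA-256 as a stable 64-bit position on the ring.
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (hypothetical shard names)."""

    def __init__(self, nodes, replicas: int = 100):
        if not nodes:
            raise ValueError("at least one node is required")
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, user_id: str) -> str:
        """Return the shard owning the first ring point after the user's hash."""
        idx = bisect.bisect_right(self._points, _hash(user_id)) % len(self._points)
        return self._ring[idx][1]
```

This sketch routes on the user ID rather than the full key, so every window for a given user lands on the same shard; routing on the full key would also work. When a shard is removed, only the users it owned move to another shard.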
Tiered Limits by Plan
Different users get different limits based on their subscription tier. Free tier: 100/hour. Pro tier: 10,000/hour. Enterprise: 100,000/hour. Store the user's limit in a cache (look up the user's tier, then the tier's limit) or embed it in the token (a JWT claim carrying the rate limit). A plan change should take effect immediately; use cache invalidation or short cache TTLs. A limit embedded in a token stays in force until that token expires or is reissued, so that option works best with short-lived tokens.
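The cached lookup with invalidation can be sketched as follows. The tier names and limits come from the text; the class TierLimitCache, the fetch_tier callback, and the 60-second TTL are assumptions standing in for whatever billing lookup and cache a real service uses.

```python
import time

# Requests per hour for each plan, as listed above.
TIER_LIMITS = {"free": 100, "pro": 10_000, "enterprise": 100_000}


class TierLimitCache:
    """Cache each user's hourly limit with a short TTL and explicit invalidation."""

    def __init__(self, fetch_tier, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._fetch_tier = fetch_tier  # hypothetical callback: user_id -> tier name
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # user_id -> (limit, expires_at)

    def limit_for(self, user_id: str) -> int:
        """Return the cached limit, refetching the tier once the entry expires."""
        now = self._clock()
        entry = self._entries.get(user_id)
        if entry is not None and entry[1] > now:
            return entry[0]
        limit = TIER_LIMITS[self._fetch_tier(user_id)]
        self._entries[user_id] = (limit, now + self._ttl)
        return limit

    def invalidate(self, user_id: str) -> None:
        """Drop the cached entry so the next lookup sees a plan change at once."""
        self._entries.pop(user_id, None)
```

Calling invalidate(user_id) from the code path that handles a plan change makes the new limit visible on the next request. Without it, the old limit persists for at most ttl_seconds.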
Advantages and Limitations
Advantages: fair allocation, a direct tie to billing, quotas that users understand, and plan-based differentiation. Limitations: it requires authentication (so it cannot protect login/signup endpoints), account sharing defeats the per-person intent (10 people using one account count as a single user), and compromised credentials give an attacker the account's full quota to abuse. Per-user limits are necessary but not sufficient for complete protection.