Database Design • Key-Value Stores (Redis, DynamoDB) • Hard • ⏱️ ~3 min
Hot Keys, Hot Partitions, and Thundering Herd Mitigation
Skewed key distributions create hot keys that overload a single shard or partition, causing throttling and elevated tail latencies even when overall cluster capacity remains ample. In provisioned-throughput systems like DynamoDB, per-partition limits (typically 1000 write capacity units and 3000 read capacity units per partition) trigger rate limiting when a celebrity user or viral post concentrates traffic. A hot partition receiving 80% of requests can spike p99 latency from 10 milliseconds to 500 milliseconds while other partitions sit idle.
Mitigation starts with sharded keys using random suffixes. Instead of storing a popular counter at the single key "video:viral:views", split it into "video:viral:views:0" through "video:viral:views:9" and aggregate reads across shards. This distributes writes across ten partitions at the cost of ten read requests to reconstruct the value. Uber uses sharded counters for rate limiting to avoid hot-key throttles, with per-cell isolation preventing cascading failures. Facebook memcache employs lease tokens: on a cache miss, one request gets a lease to fetch from origin while the others wait, collapsing thousands of simultaneous misses into a single origin request.
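As a concrete illustration, here is a minimal sketch of the sharded-counter pattern in Python using the redis-py client. The key names, the shard count of ten, and the increment_views/read_views helpers are illustrative assumptions, not Uber's or DynamoDB's actual implementation.

```python
import random

import redis  # assumed available: pip install redis

NUM_SHARDS = 10          # matches the ":0" through ":9" suffixes above
r = redis.Redis()        # hypothetical local Redis instance

def increment_views(base_key: str) -> None:
    """Write to one randomly chosen shard so no single key absorbs every write."""
    shard = random.randrange(NUM_SHARDS)
    r.incr(f"{base_key}:{shard}")

def read_views(base_key: str) -> int:
    """Aggregate across all shards; a single MGET avoids ten round trips."""
    keys = [f"{base_key}:{shard}" for shard in range(NUM_SHARDS)]
    return sum(int(v) for v in r.mget(keys) if v is not None)

# increment_views("video:viral:views")
# total = read_views("video:viral:views")
```

The trade-off is exactly the one described above: writes spread across ten keys that can land on different partitions, while reads pay the cost of aggregating all shards.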
Thundering herd occurs when popular keys expire simultaneously or systems cold start, causing origin overload. If 10,000 requests hit an expired cache key at the same time and all miss, they overwhelm the database, causing 30-second latency spikes. Mitigations include staggered time-to-live (TTL) values with jitter (a random offset of plus or minus 10% of the TTL), request coalescing (the first requester fetches while the others wait), negative caching (caching "not found" results for 60 seconds), and probabilistic early refresh (refreshing the cache before expiry with a probability that increases as expiry nears).
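A sketch combining two of these mitigations, TTL jitter and in-process request coalescing, is shown below. CoalescingCache, fetch_from_origin, and the constants are hypothetical and stand in for the general pattern; they do not reproduce Facebook's lease-token mechanism, which does the same coalescing at the memcache tier.

```python
import random
import threading
import time

TTL_JITTER = 0.10  # plus or minus 10% of the base TTL, as described above

def jittered_ttl(base_ttl: float) -> float:
    """Stagger expirations so a batch of popular keys does not expire at once."""
    return base_ttl * (1.0 + random.uniform(-TTL_JITTER, TTL_JITTER))

class CoalescingCache:
    """In-process request coalescing: the first miss fetches, concurrent misses wait."""

    def __init__(self) -> None:
        self._data = {}       # key -> (value, expires_at)
        self._inflight = {}   # key -> threading.Event guarding an in-flight fetch
        self._lock = threading.Lock()

    def get(self, key, fetch_from_origin, base_ttl: float = 300.0):
        now = time.monotonic()
        with self._lock:
            entry = self._data.get(key)
            if entry is not None and entry[1] > now:
                return entry[0]                    # fresh hit, no origin traffic
            event = self._inflight.get(key)
            if event is None:                      # we hold the "lease" for this key
                event = threading.Event()
                self._inflight[key] = event
                is_leader = True
            else:
                is_leader = False
        if is_leader:
            try:
                value = fetch_from_origin(key)     # the single origin request
                with self._lock:
                    self._data[key] = (value, time.monotonic() + jittered_ttl(base_ttl))
                return value
            finally:
                with self._lock:
                    self._inflight.pop(key, None)
                event.set()                        # wake the waiting followers
        event.wait()                               # followers block until the leader finishes
        with self._lock:
            entry = self._data.get(key)
        return entry[0] if entry is not None else fetch_from_origin(key)
```

A production version would add a timeout on event.wait() and a negative-cache entry for "not found" origin results so misses on nonexistent keys do not keep hammering the database.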
DynamoDB adaptive capacity automatically redistributes throughput from cold to hot partitions within table-level limits, reducing but not eliminating hot-partition throttling. Systems must monitor top-K hot keys, partition utilization skew, and throttle rates (see the sketch below). Uber emphasizes per-cell isolation, where each geographic cell operates independently; hot keys in one cell do not cascade failures to other cells. Production runbooks include emergency key spreading and temporary over-provisioning.
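A toy sketch of top-K hot-key monitoring follows. HotKeyMonitor, its thresholds, and the alert hook are hypothetical; a production system would typically use a streaming sketch such as count-min rather than an exact in-memory Counter.

```python
from collections import Counter

class HotKeyMonitor:
    """Rolling tally of the most-accessed keys so operators can spot skew early."""

    def __init__(self, top_k: int = 10, alert_share: float = 0.5) -> None:
        self.counts = Counter()
        self.total = 0
        self.top_k = top_k
        self.alert_share = alert_share  # flag any key above this fraction of traffic

    def record(self, key: str) -> None:
        self.counts[key] += 1
        self.total += 1

    def hot_keys(self):
        """Return (key, share_of_traffic) pairs for the top K keys."""
        if self.total == 0:
            return []
        return [(k, c / self.total) for k, c in self.counts.most_common(self.top_k)]

    def alerts(self):
        """Keys whose traffic share exceeds the alert threshold."""
        return [k for k, share in self.hot_keys() if share >= self.alert_share]

# monitor = HotKeyMonitor()
# monitor.record("video:viral:views")   # call once per request on the hot path
# if monitor.alerts():
#     pass  # trigger emergency key spreading or temporary over-provisioning
```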
💡 Key Takeaways
• Hot partitions receiving 80% of traffic spike p99 latency from 10 milliseconds to 500 milliseconds due to per-partition throughput limits (1000 writes per second, 3000 reads per second in DynamoDB)
• Sharded keys with random suffixes distribute load: split "video:viral:views" into "video:viral:views:0" through "video:viral:views:9" at the cost of aggregating ten reads
• Thundering herd from simultaneous cache expirations sends 10,000 requests to origin, causing 30-second database spikes; mitigate with lease tokens, staggered TTL jitter, and request coalescing
• Facebook memcache lease tokens collapse cache misses: the first requester gets a lease to fetch from origin while the others wait, preventing miss storms at sub-millisecond coordination cost
• Uber per-cell isolation contains hot-key failures to one geographic region; monitoring top-K hot keys, partition skew, and throttle rates enables proactive spreading
📌 Examples
DynamoDB adaptive capacity redistributes throughput from cold to hot partitions, but the per-partition hard limit of 1000 write capacity units still throttles celebrity-user traffic spikes
Twitter Manhattan handles millions of requests per second with sharded counters and automatic partition rebalancing to prevent hot-partition cascade failures across the timeline service
Meta memcache uses lease tokens and request coalescing, achieving over 90% hit rates; on a cache miss only one request fetches from MySQL, preventing database overload