
Operational Patterns: Caching Strategies and Global Deployments

Cache-aside (lazy loading) suits read-heavy workloads: the application checks the cache first, falls back to the origin database on a miss, then populates the cache. This pattern gives control over what gets cached but requires explicit invalidation logic. Facebook's memcache uses cache-aside with invalidation on write: when a MySQL write occurs, the application deletes the corresponding cache keys, forcing the next read to fetch fresh data. Hit rates exceed 90%, reducing MySQL read load by roughly 10x. Write-through keeps cache and origin synchronized by writing to both on every update, adding latency to writes (cache latency plus database latency) but ensuring consistency. Write-behind buffers writes in the cache and flushes them asynchronously to the origin, improving perceived latency but risking data loss on cache failure.

Stampede protection is critical when cached keys expire. Meta implements lease tokens: on a cache miss, the first requester receives a lease (a token valid for a few seconds) to fetch from the origin. Other concurrent requests see the pending lease and either wait briefly or return stale data. Once the lease holder populates the cache, subsequent requests hit. This collapses thousands of simultaneous misses into one origin request. Staggered time-to-live (TTL) with jitter prevents mass expiration: instead of a 3600-second TTL for all keys, use 3600 plus or minus random(360) seconds, spreading expiration over a 10% window. Probabilistic early refresh computes a refresh probability equal to the time since the last refresh divided by the TTL; keys refresh before expiry with increasing probability, avoiding simultaneous misses.

Global multi-region deployments for key-value stores typically use eventual consistency with per-region routing. DynamoDB global tables replicate across regions with last-writer-wins conflict resolution and typical cross-region propagation under one second. Applications route reads to the nearest region for low latency (roughly 5 milliseconds local versus 50 milliseconds cross-region) and can write to any region. Twitter's Manhattan uses multi-datacenter placement to serve local reads at millisecond latency while background replication keeps regions synchronized. Monitor replication-lag service-level objectives (often sub-second targets) and run failure drills that isolate regions.

Cost and capacity planning requires modeling memory overhead per item. A Redis item with a 20-byte key and a 100-byte value consumes approximately 200 bytes once allocator overhead, pointer structures, and metadata are included. Plan for shard CPU and network interface card (NIC) utilization to stay below 60 to 70% so shards can absorb traffic bursts. For pay-per-request systems like DynamoDB, simulate the key distribution to detect hot partitions before production; adaptive capacity helps, but per-partition hard limits can still throttle. Instrument p50/p95/p99 latencies, hit rates, top-K hot keys, replication lag, partition skew, eviction rates, and compaction backlogs, and set error budgets for tail latency under maintenance and failover scenarios.
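To make the lease-token read path concrete, here is a minimal sketch assuming a redis-py client and a placeholder fetch_from_origin standing in for the MySQL read; the key naming, TTLs, and retry delay are illustrative, not Meta's actual implementation.

```python
import time
import uuid

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

CACHE_TTL_S = 3600    # base TTL for cached values
LEASE_TTL_S = 5       # lease expires quickly if the holder dies
RETRY_DELAY_S = 0.05  # brief wait before re-checking the cache


def fetch_from_origin(key):
    # Placeholder for the origin database read (e.g. a MySQL SELECT).
    return f"value-for-{key}"


def get_with_lease(key):
    """Cache-aside read that uses a lease token to collapse concurrent misses."""
    value = r.get(key)
    if value is not None:
        return value  # cache hit

    # Miss: SET NX grants the lease to exactly one concurrent caller.
    token = str(uuid.uuid4())
    if r.set(f"lease:{key}", token, nx=True, ex=LEASE_TTL_S):
        value = fetch_from_origin(key)    # only the lease holder hits the origin
        r.setex(key, CACHE_TTL_S, value)  # repopulate the cache
        r.delete(f"lease:{key}")          # release the lease
        return value

    # Another caller holds the lease: wait briefly, then re-read the cache.
    time.sleep(RETRY_DELAY_S)
    value = r.get(key)
    return value if value is not None else fetch_from_origin(key)
```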
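The write-side patterns are just as short to sketch; db here is a hypothetical origin handle with a save method, and cache is the same redis-py client.

```python
def write_with_invalidation(db, cache, key, value):
    """Cache-aside write: update the origin, then delete the cache key so the
    next read misses and repopulates with fresh data (invalidate-on-write)."""
    db.save(key, value)  # hypothetical origin write
    cache.delete(key)


def write_through(db, cache, key, value, ttl_s=3600):
    """Write-through: update origin and cache together; every write pays
    cache latency plus database latency, but reads stay consistent."""
    db.save(key, value)
    cache.setex(key, ttl_s, value)
```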
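A small sketch of jittered TTLs and probabilistic early refresh using the figures quoted above (3600-second base TTL, 10% jitter); the function names are illustrative.

```python
import random
import time

BASE_TTL_S = 3600
JITTER_FRACTION = 0.10  # spread expirations over a +/-10% window


def jittered_ttl():
    """Base TTL plus or minus up to 10%, so keys written together don't all expire together."""
    jitter = BASE_TTL_S * JITTER_FRACTION
    return int(BASE_TTL_S + random.uniform(-jitter, jitter))


def should_refresh_early(last_refresh_ts, ttl_s=BASE_TTL_S):
    """Refresh probability = age / TTL, so keys refresh before expiry with
    increasing probability instead of all missing at once."""
    age = time.time() - last_refresh_ts
    return random.random() < min(age / ttl_s, 1.0)
```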
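For per-region routing against a DynamoDB global table, a hedged boto3 sketch follows; the table name, region, and key schema are assumptions for illustration.

```python
import boto3

TABLE_NAME = "user_profiles"  # hypothetical global table
LOCAL_REGION = "us-east-1"    # nearest replica for this deployment

table = boto3.resource("dynamodb", region_name=LOCAL_REGION).Table(TABLE_NAME)


def read_profile(user_id):
    # Read from the local replica (~5 ms) instead of a remote region (~50 ms).
    return table.get_item(Key={"user_id": user_id}).get("Item")


def write_profile(item):
    # Global tables accept writes in any region and replicate them cross-region,
    # typically in under a second, resolving conflicts last-writer-wins.
    table.put_item(Item=item)
```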
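The capacity numbers translate into a back-of-the-envelope estimate; the 80-byte overhead allowance below is an assumption chosen to reproduce the roughly 200-byte figure above (in practice, measure with Redis's MEMORY USAGE command).

```python
PER_ITEM_OVERHEAD_BYTES = 80  # assumed allowance for allocator, pointers, metadata


def estimated_bytes_per_item(key_bytes=20, value_bytes=100):
    # 20 B key + 100 B value + ~80 B overhead ≈ 200 B, matching the figure above.
    return key_bytes + value_bytes + PER_ITEM_OVERHEAD_BYTES


def estimated_memory_gb(item_count, key_bytes=20, value_bytes=100):
    return item_count * estimated_bytes_per_item(key_bytes, value_bytes) / 1e9


# Example: 500 million such items need roughly 100 GB of RAM before replication.
print(estimated_memory_gb(500_000_000))  # ~100.0
```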
💡 Key Takeaways
Cache-aside with invalidation on write achieves over 90% hit rates at Facebook, cutting MySQL read load roughly 10x; write-through adds latency (cache plus database) but ensures consistency
Lease tokens collapse the thundering herd: the first requester gets a lease to fetch from the origin while 1,000 concurrent requests wait, preventing database overload from simultaneous misses
Staggered TTL with jitter (3600 plus or minus 360 seconds) spreads cache expiration over a 10% window, preventing mass simultaneous misses; probabilistic early refresh avoids expiry storms
DynamoDB global tables replicate across regions with sub-second propagation using last-writer-wins; local reads at roughly 5 milliseconds versus 50 milliseconds cross-region cut user latency
Capacity planning keeps CPU/network utilization below 60 to 70% to leave headroom for bursts; Redis per-item overhead is roughly 2x the value size once allocator and pointer metadata are included
📌 Examples
Meta's memcache lease tokens prevent cache stampedes: on a miss, the first request fetches from MySQL while the others wait milliseconds, achieving over 90% hit rates across regional pools
Twitter's Manhattan serves local reads at millisecond latency from multiple datacenters while background replication keeps regions synchronized, handling millions of requests per second globally
Uber rate limiting uses Redis with sharded counters per geographic cell, monitoring top-K hot keys and keeping CPU utilization below 70% to absorb traffic spikes without throttling (see the sketch below)
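A minimal sketch of the sharded-counter idea from the Uber example, assuming redis-py and a fixed 60-second window; the key names, shard count, and limit are illustrative.

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

WINDOW_S = 60            # fixed rate-limit window
LIMIT_PER_WINDOW = 1000  # allowed requests per cell per window
SHARDS = 8               # spread a hot cell's counter across several keys


def allow_request(cell_id, request_id):
    """Fixed-window limiter with sharded counters per geographic cell."""
    shard = hash(request_id) % SHARDS
    key = f"rl:{cell_id}:{shard}"
    if r.incr(key) == 1:
        r.expire(key, WINDOW_S)  # start the window on the shard's first hit
    # Sum all shard counters for the cell to enforce the limit exactly.
    counts = r.mget([f"rl:{cell_id}:{s}" for s in range(SHARDS)])
    total = sum(int(c) for c in counts if c is not None)
    return total <= LIMIT_PER_WINDOW
```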