Multi-Region Caching, Invalidation, and Trade-offs Between Consistency and Latency
Independent Regional Caches
The most common pattern: each region (us-east, eu-west, asia-pacific) runs its own cache cluster with no cross-region coherence. The database is the single source of truth. Writes go directly to the origin, and cache invalidations propagate to each regional cache independently via asynchronous event streams. This minimizes latency, since reads are served from in-region memory in under 1 ms, and avoids expensive cross-region transfers (50-150 ms latency, $0.01-0.02/GB). The cost is eventual consistency: a write in us-east might not invalidate the eu-west cache for seconds to tens of seconds.
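The staleness window described above can be illustrated with a small in-process simulation. This is only a sketch: the `Origin` and `RegionalCache` classes, the per-region `pending` queue standing in for the async event stream, and the key `title:42` are all hypothetical names chosen for illustration, not part of any particular cache product.

```python
from collections import deque


class Origin:
    """Single source of truth (stands in for the database)."""

    def __init__(self):
        self.rows = {}
        self.subscribers = []  # regional caches that receive invalidations

    def write(self, key, value):
        self.rows[key] = value
        # Enqueue one invalidation per region; delivery happens later,
        # which models the asynchronous event stream.
        for cache in self.subscribers:
            cache.pending.append(key)


class RegionalCache:
    """Independent per-region cache with no cross-region coherence."""

    def __init__(self, name, origin):
        self.name = name
        self.origin = origin
        self.entries = {}
        self.pending = deque()  # invalidations not yet applied
        origin.subscribers.append(self)

    def read(self, key):
        if key not in self.entries:
            # Cache miss: fill from the origin.
            self.entries[key] = self.origin.rows[key]
        return self.entries[key]

    def drain_invalidations(self):
        # Apply every queued invalidation by evicting the key.
        while self.pending:
            self.entries.pop(self.pending.popleft(), None)


origin = Origin()
us_east = RegionalCache("us-east", origin)
eu_west = RegionalCache("eu-west", origin)

origin.write("title:42", "old")
us_east.drain_invalidations()
eu_west.drain_invalidations()
eu_west.read("title:42")            # eu-west now caches "old"

origin.write("title:42", "new")
us_east.drain_invalidations()       # us-east has applied the invalidation
stale = eu_west.read("title:42")    # eu-west has not: still serves "old"

eu_west.drain_invalidations()       # invalidation finally arrives
fresh = eu_west.read("title:42")    # miss, refilled from origin
```

Between the second write and the moment eu-west drains its queue, eu-west keeps serving the old value. In a real deployment that gap is the propagation delay of the event stream.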
Global Cache Layer for Critical Data
For globally shared hot data (popular content metadata, configuration), maintain a small global cache with active replication. Writes replicate to all regions either synchronously or asynchronously. Synchronous replication gives stronger consistency at the cost of increased write latency (each write must wait for cross-region replication); either mode adds complexity (conflict resolution, replication lag monitoring). Use this only for high-value, low-volume data; keep the bulk of data in regional caches with eventual consistency.
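A rough way to see the write-latency cost of synchronous replication is shown below. It is a simplified model, not a measurement: it assumes replication to all regions runs in parallel, so a synchronous write waits for the slowest replica. The function name, the 1 ms local write time, and the round-trip times are hypothetical values picked from the 50-150 ms range mentioned in these notes.

```python
def write_latency_ms(local_ms, replica_rtts_ms, synchronous):
    """Estimate client-visible write latency.

    Assumes replicas are contacted in parallel, so a synchronous write
    is acknowledged only after the slowest region responds.
    """
    if synchronous and replica_rtts_ms:
        return local_ms + max(replica_rtts_ms.values())
    # Asynchronous: acknowledge after the local write; replicas catch up later.
    return local_ms


# Hypothetical round-trip times from us-east to the other regions.
rtts = {"eu-west": 80.0, "asia-pacific": 150.0}

sync_latency = write_latency_ms(1.0, rtts, synchronous=True)
async_latency = write_latency_ms(1.0, rtts, synchronous=False)
```

Under these assumptions the synchronous write is dominated by the farthest region, while the asynchronous write stays at local latency but leaves a window in which remote regions hold older data.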
Push Invalidation
When content changes at the origin, an invalidation message is sent to all regional caches globally, propagating in seconds to minutes. CDN edge caches use this extensively. The trade-off: added invalidation complexity and cost in exchange for improved global consistency. Version-based invalidation helps: include a monotonic version number in each cached value and reject stale entries when a higher version is known, which limits the impact of delayed invalidations.
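One possible shape for version-based invalidation is sketched here. The `VersionedCache` class and its method names are hypothetical, and the sketch assumes the origin assigns each key a monotonically increasing integer version starting at 1. The cache remembers the highest version it has heard about per key and refuses to store or serve anything older.

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    value: str
    version: int


class VersionedCache:
    def __init__(self):
        self.entries = {}      # key -> Versioned
        self.min_version = {}  # key -> highest version seen in an invalidation

    def invalidate(self, key, version):
        # Record the newest version known; older or duplicate
        # invalidation messages (delayed, out of order) are ignored.
        if version > self.min_version.get(key, 0):
            self.min_version[key] = version

    def put(self, key, value, version):
        # Reject fills that are older than a version known to exist.
        if version < self.min_version.get(key, 0):
            return False
        current = self.entries.get(key)
        # Reject fills that are not newer than what is already cached.
        if current is not None and current.version >= version:
            return False
        self.entries[key] = Versioned(value, version)
        return True

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        if entry.version < self.min_version.get(key, 0):
            # A higher version is known: treat the entry as stale.
            del self.entries[key]
            return None
        return entry.value


cache = VersionedCache()
first_put = cache.put("cfg", "v1", 1)
cache.invalidate("cfg", 3)                 # origin is now at version 3
after_invalidate = cache.get("cfg")        # version 1 entry is rejected
late_fill = cache.put("cfg", "v2", 2)      # stale fill from a lagging read
current_fill = cache.put("cfg", "v3", 3)
cache.invalidate("cfg", 2)                 # delayed message, ignored
final_value = cache.get("cfg")
```

Because every decision compares versions rather than arrival order, a late or reordered invalidation cannot evict a newer entry, and a slow fill cannot reintroduce an older one.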
The Fundamental Trade-off
Cross-region reads add 50-150 ms at p50 and 200 ms or more at p99. At $0.01-0.02/GB, cross-region transfer at 1 TB/s costs $36,000-72,000/hour. Same-region reads take under 1 ms at minimal cost. Production systems therefore strongly prefer region-local reads, reserve multi-region replication for high-value, low-volume data, and accept staleness windows of seconds to minutes for the bulk of cached data.
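The hourly transfer figure follows from simple arithmetic, reproduced here as a check. The function name is hypothetical, and the calculation assumes decimal units (1 TB = 1,000 GB) and a flat per-GB price with no volume tiers.

```python
SECONDS_PER_HOUR = 3600


def cross_region_cost_per_hour(throughput_gb_per_s, price_per_gb):
    """Hourly transfer cost in dollars at a sustained throughput."""
    return throughput_gb_per_s * SECONDS_PER_HOUR * price_per_gb


# 1 TB/s = 1,000 GB/s, i.e. 3,600,000 GB per hour.
low = cross_region_cost_per_hour(1000, 0.01)   # about $36,000/hour
high = cross_region_cost_per_hour(1000, 0.02)  # about $72,000/hour
```

The same function makes it easy to see how quickly the cost falls when only a small, high-value fraction of traffic crosses regions.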