Stale While Revalidate (SWR) and Soft TTL vs Hard TTL
Stale While Revalidate (SWR)
Serve expired but still cached data immediately while refreshing it from the origin asynchronously in the background. The user gets a cache-speed response (about 1-2ms) from the stale entry, and the refresh happens off the request path. Requests that arrive after the refresh completes see fresh data. For any key still inside its stale window, this removes the user-visible latency of a cache refresh. Trade-off: the request that triggers the refresh, and any request that arrives while the refresh is in flight (typically 50-200ms), receives the stale value. The staleness is bounded and acceptable for most use cases.
Soft TTL vs Hard TTL
Soft TTL: the age after which content is considered stale but may still be served while it is revalidated. Hard TTL: the age after which content must not be served at all and a blocking fetch is required. Example: soft=60s, hard=300s. Entry age 0-60s: serve directly as fresh. Entry age 60-300s: serve stale immediately and trigger an async refresh. Entry age over 300s: block until fresh data is fetched. The gap between soft and hard TTL is a grace period for revalidation that never allows arbitrarily old data to be served.
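The three age ranges above can be sketched as a small classification function. This is a minimal illustration, not a reference implementation: the names Freshness and classify are hypothetical, and the defaults simply mirror the soft=60s, hard=300s example.

```python
from enum import Enum


class Freshness(Enum):
    FRESH = "fresh"      # serve directly
    STALE = "stale"      # serve immediately, refresh in the background
    EXPIRED = "expired"  # must not be served; blocking fetch required


def classify(age_seconds: float, soft_ttl: float = 60.0, hard_ttl: float = 300.0) -> Freshness:
    """Map an entry's age to the action the cache should take.

    Assumption: boundaries are half-open, so an entry aged exactly
    soft_ttl is already stale and one aged exactly hard_ttl is expired.
    """
    if age_seconds < soft_ttl:
        return Freshness.FRESH
    if age_seconds < hard_ttl:
        return Freshness.STALE
    return Freshness.EXPIRED
```

Whether the boundaries are inclusive or exclusive is a design choice; what matters is that every access checks the entry's age against both thresholds.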
HTTP Cache Control Header
HTTP supports this pattern through the stale-while-revalidate extension to Cache-Control (RFC 5861): Cache-Control: max-age=60, stale-while-revalidate=240. Content is considered fresh for 60s, then may be served stale for an additional 240s while it is revalidated (300s in total before a blocking fetch is required). CDNs and some application caches implement this natively. For custom caches, implement it by storing a timestamp alongside each value and checking the entry's age against both thresholds on each access.
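For a custom cache that wants to honor the same header, the two thresholds can be derived from the directives. The sketch below is deliberately simplified: ttl_from_cache_control is a hypothetical helper, it handles only comma-separated name=value directives, and it ignores other directives (such as no-store or s-maxage) that a real cache would have to respect.

```python
from typing import Tuple


def ttl_from_cache_control(header: str) -> Tuple[int, int]:
    """Return (soft_ttl, hard_ttl) in seconds from a Cache-Control value.

    soft_ttl is max-age; hard_ttl is max-age plus stale-while-revalidate.
    Missing directives are treated as 0 (an assumption of this sketch).
    """
    directives = {}
    for part in header.split(","):
        name, _, value = part.strip().partition("=")
        directives[name.lower()] = value.strip().strip('"')
    max_age = int(directives.get("max-age") or 0)
    swr = int(directives.get("stale-while-revalidate") or 0)
    return max_age, max_age + swr


soft, hard = ttl_from_cache_control("max-age=60, stale-while-revalidate=240")
print(soft, hard)  # 60 300
```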
Combining SWR with Request Collapsing
SWR alone still carries stampede risk during the refresh phase: multiple concurrent requests that find the same stale entry might each trigger their own background refresh. Combine it with request collapsing: the first request in the stale window acquires a per-key refresh lock and triggers the refresh; subsequent requests serve stale without triggering additional refreshes. This bounds origin load to at most one in-flight refresh per key, or roughly one origin request per soft TTL cycle, regardless of traffic volume. The bound holds per cache instance unless the refresh lock is shared across instances.
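A minimal in-process sketch of SWR combined with collapsing follows, assuming a thread-based cache with one background thread per refresh. SWRCache, loader, and clock are illustrative names, not part of any library. The sketch collapses only stale-window refreshes: blocking fetches on a miss or past the hard TTL are not collapsed here, and a failed refresh simply leaves the stale entry in place.

```python
import threading
import time


class SWRCache:
    def __init__(self, loader, soft_ttl=60.0, hard_ttl=300.0, clock=time.monotonic):
        self._loader = loader        # callable(key) -> value; fetches from origin
        self._soft = soft_ttl
        self._hard = hard_ttl
        self._clock = clock          # injectable so tests can control time
        self._lock = threading.Lock()
        self._entries = {}           # key -> (value, stored_at)
        self._refreshing = set()     # keys with a background refresh in flight

    def get(self, key):
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                value, stored_at = entry
                age = self._clock() - stored_at
                if age < self._soft:
                    return value     # fresh: serve directly
                if age < self._hard:
                    # Stale but servable: only the first caller starts a refresh.
                    if key not in self._refreshing:
                        self._refreshing.add(key)
                        threading.Thread(
                            target=self._refresh, args=(key,), daemon=True
                        ).start()
                    return value
        # Missing or past the hard TTL: blocking fetch (not collapsed in this sketch).
        return self._load_and_store(key)

    def _load_and_store(self, key):
        value = self._loader(key)    # origin call happens outside the lock
        with self._lock:
            self._entries[key] = (value, self._clock())
        return value

    def _refresh(self, key):
        try:
            self._load_and_store(key)
        finally:
            with self._lock:
                self._refreshing.discard(key)  # release the per-key refresh marker
```

The set of refreshing keys plays the role of the refresh lock: membership is checked and updated under the same mutex that guards the entries, so two concurrent requests cannot both start a refresh for the same key. In a multi-node deployment the same idea would need a shared lock (for example in the cache store itself), which this sketch does not attempt.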