Caching » Cache Stampede Problem · Medium · ⏱️ ~3 min

Per-Key Request Collapsing with Leases and Mutexes

Per-key request collapsing, also called lease-based or mutex-based cache refresh, ensures that only one request recomputes a given cache key at any moment while other concurrent requests either wait or receive stale data. When the first request detects a cache miss or expired entry, it acquires a lease: a temporary exclusive lock identified by a token. The lease holder becomes the sole authorized recomputer for that key; all other requests arriving during recomputation either block with a timeout and backoff, or immediately receive stale content if stale-while-revalidate (SWR) is enabled. Meta (Facebook) deployed this pattern at massive scale in its distributed cache tier (Memcache), handling millions of cache operations per second and explicitly preventing hot-key stampedes.

Implementation requires careful lock TTL tuning: the lock TTL must exceed typical recomputation time but remain short enough to recover from crashed lease holders. For example, if a cache refresh typically completes in 200 milliseconds at the 99th percentile (P99), a lock TTL of 500 to 600 milliseconds provides a safety margin while limiting how long a stuck lock can block refresh. If a lease holder crashes, the lock expires after 500ms and a new requester can acquire it. Fencing tokens prevent the late-arrival problem: each lease includes a monotonically increasing version number, and the cache only accepts writes from the current lease holder, rejecting stale writes from timed-out or slow previous holders.

The production impact is dramatic. For a hot key serving 20,000 requests per second (RPS), naive caching creates 20,000 concurrent origin requests on expiry. With per-key request collapsing, only 1 origin request executes while the remaining 19,999 requests either wait (typically under 200ms) or immediately receive stale content at cache-hit latency (single-digit milliseconds). Origin load for that key drops from a 20,000-request spike to a single request. At Meta scale, with keys serving millions of RPS, this pattern is the difference between stable operation and an immediate outage.
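To make the flow concrete, here is a minimal Go sketch of the lease-holder/follower split, assuming a Redis-backed cache accessed through the go-redis v9 client. The key prefixes (cache:, lock:, stale:), TTL values, and function names are illustrative assumptions, not Meta's actual implementation; the fencing-token check is deferred to the sketch after the examples.

```go
package cachecollapse

import (
	"context"
	"errors"
	"time"

	"github.com/redis/go-redis/v9"
)

const (
	lockTTL  = 500 * time.Millisecond // ~2-3x the P99 recompute latency
	cacheTTL = 60 * time.Second       // freshness window for the primary copy
)

// ErrLeaseHeld signals that another request is already recomputing this key.
var ErrLeaseHeld = errors.New("lease held by another request: back off or serve stale")

// GetOrRefresh returns the value for key, collapsing concurrent refreshes so
// that only one caller recomputes on a miss; other callers get a stale copy
// (if one exists) or ErrLeaseHeld.
func GetOrRefresh(ctx context.Context, rdb *redis.Client, key string,
	recompute func(context.Context) (string, error)) (string, error) {

	// Fast path: fresh cache hit.
	val, err := rdb.Get(ctx, "cache:"+key).Result()
	if err == nil {
		return val, nil
	}
	if !errors.Is(err, redis.Nil) {
		return "", err // cache unavailable, not just a miss
	}

	// Miss: try to acquire the per-key lease atomically (SET ... NX PX lockTTL).
	acquired, err := rdb.SetNX(ctx, "lock:"+key, "1", lockTTL).Result()
	if err != nil {
		return "", err
	}

	if acquired {
		// We are the sole authorized recomputer for this key.
		fresh, err := recompute(ctx)
		if err != nil {
			rdb.Del(ctx, "lock:"+key) // release early so another caller can retry
			return "", err
		}
		// Write the fresh value plus a longer-lived stale copy for SWR readers.
		rdb.Set(ctx, "cache:"+key, fresh, cacheTTL)
		rdb.Set(ctx, "stale:"+key, fresh, 10*cacheTTL)
		rdb.Del(ctx, "lock:"+key)
		return fresh, nil
	}

	// Another request holds the lease: serve stale content immediately if possible.
	if stale, err := rdb.Get(ctx, "stale:"+key).Result(); err == nil {
		return stale, nil
	}
	return "", ErrLeaseHeld // caller should back off with jitter and retry
}
```

A caller that gets ErrLeaseHeld (no stale copy yet) falls through to the backoff loop sketched after the key takeaways below.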
💡 Key Takeaways
Meta's Memcache implementation uses leases to collapse millions of requests per second on hot keys, reducing origin fanout from thousands of concurrent requests to exactly one
Lock TTL tuning is critical: set it to 2 to 3 times the P99 recompute latency (e.g., a 500ms lock for a 200ms P99 refresh) to balance recovery speed against duplicate work
Fencing tokens prevent the lost-update problem: monotonic version numbers ensure that only the current lease holder can write, rejecting late arrivals from timed-out or slow previous holders (see the fenced-write sketch after the examples)
For a 20,000 RPS hot key, request collapsing drops the origin spike from 20,000 concurrent requests to 1 while followers wait ~200ms or receive stale data in single-digit milliseconds
Lease acquisition failure handling is essential: followers must implement exponential backoff with jitter (e.g., 10ms, 20ms, 40ms, 80ms delays) to prevent thundering retry storms, as in the sketch after this list
Lock leakage risk: if the lease holder crashes before releasing the lock, refresh is blocked until the lock TTL expires; too long a TTL prolongs that blockage, too short a TTL causes duplicate recomputation and write conflicts
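A follower-side retry loop with the jittered delays above might look like the following sketch, continuing the same assumed package and go-redis client; the attempt count and the 50% jitter factor are illustrative.

```go
package cachecollapse

import (
	"context"
	"fmt"
	"math/rand"
	"time"

	"github.com/redis/go-redis/v9"
)

// waitWithBackoff is the follower path when the lease is already held: poll
// the cache while the lease holder recomputes, backing off exponentially with
// random jitter so thousands of followers do not retry in lockstep.
func waitWithBackoff(ctx context.Context, rdb *redis.Client, key string) (string, error) {
	delay := 10 * time.Millisecond
	for attempt := 0; attempt < 5; attempt++ {
		// Base delay plus up to 50% jitter: roughly 10ms, 20ms, 40ms, 80ms, 160ms.
		jitter := time.Duration(rand.Int63n(int64(delay)/2 + 1))
		select {
		case <-time.After(delay + jitter):
		case <-ctx.Done():
			return "", ctx.Err()
		}
		if val, err := rdb.Get(ctx, "cache:"+key).Result(); err == nil {
			return val, nil // lease holder finished; fresh value is back in cache
		}
		delay *= 2
	}
	return "", fmt.Errorf("cache for key %q not refreshed after backoff", key)
}
```

A request that exhausts its retries can fall back to the stale copy or surface the error, depending on how much staleness the endpoint tolerates.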
📌 Examples
Meta production: hot user profile key at 500,000 RPS. On expiry, a single lease holder queries the database while 499,999 followers receive the stale profile from cache at 2ms P99 latency instead of the 150ms database query latency.
Microservice implementation in Go using Redis: a single SET with the NX (set if not exists) option and a millisecond expiry acquires the lease atomically. The first writer sets 'refresh:user:12345' with a 500ms TTL; followers check for the key and, if it exists, either sleep 50ms and retry or read stale data from the 'stale:user:12345' backup key.
E-commerce product cache: 30,000 RPS product detail page. The lease holder refreshes inventory, pricing, and reviews (3 database queries, 180ms total). During the refresh, 5,400 follower requests (30,000 RPS × 0.18s) either wait or receive stale product data, preventing 16,200 database queries (3 queries × 5,400 requests).
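The fencing-token check mentioned in the takeaways can be layered onto the same Redis-based sketch. The counter key, token key, and Lua script below are illustrative assumptions (again using go-redis v9): the lease holder takes a token when it acquires the lease, and the write-back is accepted only if that token is newer than the last accepted one.

```go
package cachecollapse

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

// fencedSet accepts a cache write only if the writer's fencing token is newer
// than the token recorded for the last accepted write, so a slow or timed-out
// previous lease holder cannot overwrite fresher data.
// KEYS[1] = value key, KEYS[2] = last-accepted-token key
// ARGV[1] = value, ARGV[2] = writer's token, ARGV[3] = TTL in milliseconds
var fencedSet = redis.NewScript(`
local current = tonumber(redis.call('GET', KEYS[2]) or '0')
if tonumber(ARGV[2]) > current then
  redis.call('SET', KEYS[1], ARGV[1], 'PX', ARGV[3])
  redis.call('SET', KEYS[2], ARGV[2], 'PX', ARGV[3])
  return 1
end
return 0
`)

// NextFenceToken hands out a monotonically increasing token for a key; the
// lease holder fetches one at the moment it acquires the lease.
func NextFenceToken(ctx context.Context, rdb *redis.Client, key string) (int64, error) {
	return rdb.Incr(ctx, "fence:"+key).Result()
}

// FencedWrite attempts the write-back; it returns false if a newer lease
// holder has already written, meaning this result is stale and must be dropped.
func FencedWrite(ctx context.Context, rdb *redis.Client, key, value string,
	token int64, ttl time.Duration) (bool, error) {
	ok, err := fencedSet.Run(ctx, rdb,
		[]string{"cache:" + key, "fencetoken:" + key},
		value, token, ttl.Milliseconds()).Int()
	return ok == 1, err
}
```

A timed-out holder that finishes late carries an older token, so its FencedWrite returns false and its result is discarded instead of clobbering fresher data.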