DeadlocksLivelock & StarvationHard⏱️ ~2 min

Preventing Livelock

Core Concept
Preventing livelock requires breaking symmetry. If all threads use identical logic, they will dance forever. Introduce randomness, ordering, or asymmetric behavior to let one thread win.

Randomized Backoff

The classic solution to livelock is exponential backoff with jitter. When threads conflict, each waits a random amount of time before retrying. The randomness breaks synchronization.

Without jitter: Thread A waits 100ms, Thread B waits 100ms. They retry at the same time. Conflict again.

With jitter: Thread A waits 80-120ms (random). Thread B waits 80-120ms (random). Different wait times break the dance.

Asymmetric Conflict Resolution

Give threads different roles or priorities in conflict resolution. The Ethernet CSMA/CD protocol does this: after collision, each station picks a random slot from an exponentially growing range.

Asymmetric rules: One thread always backs off, the other always proceeds. Or use thread ID: lower ID wins conflicts.

Lock ordering: If Thread A holds Lock1 and needs Lock2, but Thread B holds Lock2 and needs Lock1, use consistent ordering. Both always acquire Lock1 first, then Lock2.

Limit Retry Attempts

Set a maximum number of retries. After N attempts, escalate: acquire a global lock, notify an arbiter, or fail with an error. This converts potential livelock into bounded delay or a clear failure.

Practical pattern: Try 3 times with random backoff. On 4th attempt, acquire a coarse-grained lock that serializes conflicting operations. Slower but guaranteed progress.

Warning: Livelock prevention often trades throughput for progress guarantees. Randomness and fallback locks add latency. Accept this trade-off consciously.
💡 Key Takeaways
Break symmetry: identical logic causes identical behavior - threads dance forever
Randomized backoff with jitter: different random delays desynchronize conflicting threads
Asymmetric resolution: use thread ID, priority, or role to decide who wins conflicts
Limit retries: after N attempts, escalate to coarse lock or fail - converts livelock to bounded delay
Ethernet CSMA/CD: classic example of exponential backoff with random slots
📌 Examples
1AWS SDK retries: exponential backoff with full jitter (random 0 to calculated delay)
2Database transactions: after N conflicts, escalate to table-level lock
3TCP congestion: additive increase, multiplicative decrease breaks synchronization
← Back to Livelock & Starvation Overview
Preventing Livelock | Livelock & Starvation - System Overflow