
Circuit Breaker Pattern: Fail Fast to Preserve System Health

The circuit breaker pattern protects your system from wasting resources on failing dependencies: it detects problems and fails fast instead of waiting for timeouts. Think of it like an electrical circuit breaker: when it senses dangerous conditions, it trips open to prevent damage. The pattern continuously monitors calls to a dependency over a sliding window (typically 10 seconds), tracking failures such as timeouts, connection errors, and server errors. When the failure rate crosses a threshold (commonly 50%) and there are enough samples (often 20+ requests), the breaker "opens" and immediately rejects new calls without even attempting them. This preserves threads and CPU cycles and stops failures from cascading. After a cooldown period (5 to 30 seconds), the breaker enters a "half-open" state and probes for recovery with a few trial requests.

The value lies in what you're protecting. If a dependency that normally responds in 100ms starts taking 5 seconds, then without a circuit breaker every caller blocks for 5 seconds, exhausting thread pools and propagating the slowness upstream. With a circuit breaker, once the problem is detected you fail in under 1ms and can return cached data or a gracefully degraded response.

Netflix famously popularized this pattern with Hystrix, using small thread pools (10 to 20 threads per dependency) and aggressive timeouts (hundreds of milliseconds to 1 second) to isolate failures across hundreds of microservices. The key insight: it's often better to serve degraded functionality immediately than to wait on a sick dependency that might never respond. You're trading perfect responses for predictable latency and system stability.
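As a concrete illustration, here is a minimal, single-threaded sketch of that closed → open → half-open state machine in Python. The class, names, and single-probe half-open behavior are simplifications of the description above, not any particular library's API; production breakers such as Hystrix or Resilience4j allow a few concurrent probes and also classify slow calls.

```python
import time
from collections import deque


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=0.5, min_samples=20,
                 window_seconds=10.0, cooldown_seconds=5.0):
        self.failure_threshold = failure_threshold  # trip at >= 50% failures
        self.min_samples = min_samples              # avoid tripping on low traffic
        self.window_seconds = window_seconds        # sliding-window length
        self.cooldown_seconds = cooldown_seconds    # time spent open before probing
        self.state = "closed"
        self.opened_at = 0.0
        self.samples = deque()                      # (timestamp, succeeded) pairs

    def call(self, func, *args, **kwargs):
        now = time.monotonic()
        if self.state == "open":
            if now - self.opened_at < self.cooldown_seconds:
                raise CircuitOpenError("failing fast: dependency marked unhealthy")
            self.state = "half_open"                # cooldown elapsed: allow a probe
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._record(ok=False)
            raise
        self._record(ok=True)
        return result

    def _record(self, ok):
        now = time.monotonic()
        if self.state == "half_open":
            # The probe's outcome decides: success closes, failure re-opens.
            if ok:
                self.state = "closed"
                self.samples.clear()
            else:
                self.state = "open"
                self.opened_at = now
            return
        self.samples.append((now, ok))
        while self.samples and now - self.samples[0][0] > self.window_seconds:
            self.samples.popleft()                  # drop samples outside the window
        failures = sum(1 for _, succeeded in self.samples if not succeeded)
        if (len(self.samples) >= self.min_samples
                and failures / len(self.samples) >= self.failure_threshold):
            self.state = "open"                     # trip: reject calls immediately
            self.opened_at = now
```

A caller would wrap each dependency call, for example `breaker.call(fetch_profile, user_id)` (a hypothetical function), and catch `CircuitOpenError` to return cached data or a degraded response instead of blocking on the sick dependency.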
💡 Key Takeaways
Monitors calls in a sliding window (typically 10 seconds) and tracks failure rate plus slow call rate to detect unhealthy dependencies
Opens when error rate exceeds threshold (commonly 50%) with minimum sample size (often 20 requests) to avoid false positives from low traffic
Preserves resources by failing in under 1ms when open instead of waiting for 5+ second timeouts that exhaust thread pools
Half-open state allows controlled probes (1 to 5 concurrent requests) after a cooldown (5 to 30 seconds) to test recovery without overwhelming the dependency
Netflix's Hystrix used small per-dependency thread pools (10 to 20 threads each), keeping the blast radius contained so one failing service couldn't starve the others
Treats slow successes as failures: if calls exceed a latency Service Level Objective (SLO), such as 200ms at p95, trip the breaker to preserve tail latency (sketched below)
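A hedged sketch of that last rule, assuming a breaker that exposes `record_success` and `record_failure` hooks feeding its sliding window (illustrative names, not a specific library's API):

```python
import time


def timed_call(func, record_success, record_failure, slow_call_slo_seconds=0.2):
    """Invoke func and classify the outcome for the breaker's sliding window."""
    start = time.monotonic()
    try:
        result = func()
    except Exception:
        record_failure()    # hard failure: timeout, connection error, 5xx
        raise
    elapsed = time.monotonic() - start
    if elapsed > slow_call_slo_seconds:
        record_failure()    # breached the latency SLO: count as a failure
    else:
        record_success()
    return result
```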
📌 Examples
Netflix Hystrix production defaults: 10 second rolling window, 20 minimum requests, 50% error threshold, 5 second sleep before half-open, hundreds of milliseconds to 1 second timeouts per dependency
Envoy service mesh at Lyft/Shopify: Ejects instances after consecutive 5xx errors over 5 to 10 second intervals, base ejection time 30 to 300 seconds, caps maximum 10% to 50% of hosts ejected to preserve capacity
Alibaba Singles Day with Sentinel: Handles hundreds of thousands of orders per second using slow call ratio triggers to prevent cascading failures during hot partitions