Bulkhead Failure Handling: Rejection Strategies and Fallbacks

What Happens When the Bulkhead Is Full
When a bulkhead pool is exhausted, each new request still needs an answer. The options are to reject immediately, queue briefly, or shed load selectively. The choice affects both user experience and system behavior: immediate rejection is fast but harsh, while queuing adds latency but smooths bursts.
Immediate Rejection
Return an error instantly when the pool is full. The rejection makes no downstream call, so it typically completes in under a millisecond regardless of downstream state. The caller can retry, use a fallback, or propagate the error. Best for latency-sensitive paths where waiting is worse than failing. Over HTTP, respond with 503 Service Unavailable and a Retry-After header.
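A minimal sketch of immediate rejection, assuming a thread-based service and a semaphore as the bulkhead pool. The names `Bulkhead`, `BulkheadFullError`, and `to_http_response` are hypothetical and exist only for illustration; the response is modelled as a plain `(status, headers, body)` tuple rather than any particular framework's type.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when the bulkhead has no free permits (hypothetical name)."""


class Bulkhead:
    """Caps concurrent calls; rejects instead of waiting when full."""

    def __init__(self, max_concurrent):
        self._permits = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn, *args, **kwargs):
        # Non-blocking acquire: fail right away rather than wait for a permit.
        if not self._permits.acquire(blocking=False):
            raise BulkheadFullError("bulkhead full")
        try:
            return fn(*args, **kwargs)
        finally:
            self._permits.release()


def to_http_response(bulkhead, fn, retry_after_seconds=1):
    """Map a bulkhead rejection to 503 with a Retry-After header."""
    try:
        return 200, {}, bulkhead.call(fn)
    except BulkheadFullError:
        headers = {"Retry-After": str(retry_after_seconds)}
        return 503, headers, "Service Unavailable"
```

The `Retry-After` value of one second is an arbitrary placeholder; a real service would pick it based on how quickly the pool usually frees up.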
Bounded Queue
Queue requests up to a limit before rejecting. A queue of 10 with 50ms average processing adds up to 500ms of wait time for the last request in line, assuming the queue drains one request at a time. Queuing smooths brief traffic spikes but accumulates latency during sustained overload. Set the queue timeout shorter than the client timeout to avoid wasting work on requests the client has already abandoned.
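The sketch below illustrates one way a bounded queue with a timeout could look, together with the wait-time arithmetic. It assumes a thread-per-request model, where "queued" means a thread blocked waiting for a worker permit. `QueueingBulkhead`, `BulkheadFullError`, and `worst_case_queue_wait_ms` are hypothetical names, and the wait estimate is a rough upper bound rather than an exact model.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when a request is rejected (hypothetical name)."""


def worst_case_queue_wait_ms(queue_size, avg_processing_ms, workers=1):
    # Rough estimate: the last queued request waits for everything ahead of it.
    return queue_size * avg_processing_ms / workers


class QueueingBulkhead:
    def __init__(self, max_concurrent, max_queued, queue_timeout_s):
        self._workers = threading.Semaphore(max_concurrent)
        self._waiters = threading.Semaphore(max_queued)
        self._queue_timeout_s = queue_timeout_s

    def _run(self, fn):
        try:
            return fn()
        finally:
            self._workers.release()

    def call(self, fn):
        # Fast path: a worker permit is free, so no queuing is needed.
        if self._workers.acquire(blocking=False):
            return self._run(fn)
        # Queue is full: reject immediately instead of waiting.
        if not self._waiters.acquire(blocking=False):
            raise BulkheadFullError("queue full")
        try:
            # Wait in the queue, but only up to the queue timeout.
            # Note: Python semaphores do not guarantee FIFO wake-up order.
            got_worker = self._workers.acquire(timeout=self._queue_timeout_s)
        finally:
            self._waiters.release()
        if not got_worker:
            raise BulkheadFullError("queue timeout")
        return self._run(fn)
```

With the figures from the text, `worst_case_queue_wait_ms(10, 50)` gives 500. If clients give up after, say, one second, a `queue_timeout_s` comfortably below that keeps the service from working on requests nobody is waiting for.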
Fallback Responses
Instead of errors, return degraded responses: a cache serves stale data, defaults replace missing values, and simplified logic skips non-essential steps. Fallbacks should be pre-computed or very fast; a slow fallback defeats the purpose. Test fallbacks regularly, since they rarely execute in normal operation.
✅ Best Practice: Combine strategies: try the primary path; on bulkhead rejection, try the fallback; on fallback failure, return an error. Each layer provides a chance to serve the user before giving up.
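The layered approach can be sketched as a small wrapper. This is an illustration under assumptions, not a prescribed implementation: `serve`, `BulkheadFullError`, and the dictionary response shape are hypothetical, and `primary` and `fallback` stand in for whatever callables the service actually uses.

```python
class BulkheadFullError(Exception):
    """Raised by the primary path when the bulkhead rejects (hypothetical)."""


def serve(primary, fallback):
    """Try primary, then fallback, then give up with an error response."""
    # Layer 1: the primary path, guarded by the bulkhead.
    try:
        return {"status": 200, "degraded": False, "body": primary()}
    except BulkheadFullError:
        pass  # Rejected by the bulkhead: fall through to the fallback.

    # Layer 2: a fast, degraded answer (stale cache, defaults, and so on).
    try:
        return {"status": 200, "degraded": True, "body": fallback()}
    except Exception:
        # Layer 3: nothing left to try, so report the failure.
        return {"status": 503, "degraded": True, "body": None}
```

Marking the response as degraded is a design choice in this sketch; it lets callers and dashboards tell how often the fallback path is actually serving traffic.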
Priority-Based Rejection
Not all requests are equal. VIP customers, health checks, and admin operations may need priority access. Implement priority queues or reserved capacity, for example 40 threads for general traffic and 10 threads reserved for priority traffic. Low-priority requests are rejected first as the pool approaches capacity.
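One possible way to express reserved capacity is a single counter with two admission limits. The sketch assumes the 40 general plus 10 reserved split from the text; `PriorityBulkhead`, `try_acquire`, and `release` are hypothetical names. In this variant priority requests may also use general capacity, while low-priority requests can never touch the reserved slots.

```python
import threading


class PriorityBulkhead:
    def __init__(self, general=40, reserved=10):
        self._general_limit = general           # ceiling for low-priority work
        self._total_limit = general + reserved  # ceiling for priority work
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self, priority=False):
        """Return True if the request is admitted, False if rejected."""
        limit = self._total_limit if priority else self._general_limit
        with self._lock:
            if self._in_flight >= limit:
                return False
            self._in_flight += 1
            return True

    def release(self):
        """Call once for every successful try_acquire."""
        with self._lock:
            self._in_flight -= 1
```

Because low-priority requests stop being admitted at 40 in flight, they are the first to be rejected, and the last 10 slots remain available to priority callers during overload.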

💡 Key Takeaways

✓ Immediate rejection: sub-millisecond response, best for latency-sensitive paths. Use 503 with Retry-After.

✓ Bounded queues smooth bursts but add latency. Set queue timeout shorter than client timeout.

✓ Fallbacks: return degraded responses (cached data, defaults) instead of errors. Test fallbacks regularly.

📌 Interview Tips

1. Calculate queue latency: queue of 10 × 50ms processing = up to 500ms additional wait time

2. Describe layered approach: try primary, on rejection try fallback, on fallback failure return error

3. Mention priority reservation: 40 general threads + 10 priority ensures VIP access during overload
