Bulkhead Failure Handling: Rejection Strategies and Fallbacks
What Happens When the Bulkhead Is Full
When a bulkhead pool is exhausted, every new request needs an explicit policy: reject it immediately, queue it briefly, or shed load selectively. The choice affects both user experience and system behavior. Immediate rejection is fast but harsh; queuing adds latency but smooths bursts.
Immediate Rejection
Return an error instantly when the pool is full. The rejection itself typically takes well under a millisecond regardless of downstream state, because the decision never touches the downstream service. The caller can retry, use a fallback, or propagate the error. Best for latency-sensitive paths where waiting is worse than failing. For HTTP services, return 503 Service Unavailable with a Retry-After header.
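The following is a minimal sketch of immediate rejection, assuming a semaphore-based bulkhead. The names Bulkhead, BulkheadFullError, and to_http_response are hypothetical, and the one-second Retry-After value is an arbitrary placeholder rather than a recommendation.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when the bulkhead has no free slot (hypothetical name)."""


class Bulkhead:
    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn, *args, **kwargs):
        # Non-blocking acquire: fail at once instead of waiting for a slot.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("bulkhead full")
        try:
            return fn(*args, **kwargs)
        finally:
            # Always give the slot back, even if fn raised.
            self._slots.release()


def to_http_response(bulkhead, fn, retry_after_seconds=1):
    """Map a rejection to 503 plus Retry-After; returns (status, headers, body)."""
    try:
        return 200, {}, bulkhead.call(fn)
    except BulkheadFullError:
        return 503, {"Retry-After": str(retry_after_seconds)}, "service busy"
```

Because the acquire never blocks, the rejection path costs only a semaphore check, which is what keeps it fast even when the downstream service is slow.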
Bounded Queue
Queue requests up to a limit before rejecting. With a single worker draining it, a queue of 10 with 50ms average processing adds up to 500ms of wait time for the last request in line; with more workers the wait shrinks roughly in proportion. A queue smooths brief traffic spikes but accumulates latency during sustained overload. Set the queue timeout shorter than the client timeout to avoid wasting work on requests the client has already abandoned.
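The wait-time arithmetic and the queue timeout can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: QueuedBulkhead, QueueFullError, QueueTimeoutError, and worst_case_wait_ms are hypothetical names, the "queue" is modeled as a bounded number of callers allowed to wait for a slot, and strict FIFO ordering among waiters is not guaranteed.

```python
import threading


class QueueFullError(Exception):
    """Raised when the waiting room is already full (hypothetical name)."""


class QueueTimeoutError(Exception):
    """Raised when a queued request waited longer than the queue timeout."""


def worst_case_wait_ms(queue_depth, avg_processing_ms, workers=1):
    # Rough estimate: the last request in line waits for everything ahead of it.
    return queue_depth * avg_processing_ms / workers


class QueuedBulkhead:
    def __init__(self, max_concurrent, max_queued, queue_timeout):
        self._workers = threading.Semaphore(max_concurrent)
        self._waiting = threading.Semaphore(max_queued)
        self._queue_timeout = queue_timeout  # seconds; keep below client timeout

    def call(self, fn, *args, **kwargs):
        # Fast path: a worker slot is free, so no queuing is needed.
        if not self._workers.acquire(blocking=False):
            # Slow path: take a place in the bounded waiting room or reject.
            if not self._waiting.acquire(blocking=False):
                raise QueueFullError("queue full")
            try:
                if not self._workers.acquire(timeout=self._queue_timeout):
                    raise QueueTimeoutError("timed out while queued")
            finally:
                # Leave the waiting room whether or not a slot was obtained.
                self._waiting.release()
        try:
            return fn(*args, **kwargs)
        finally:
            self._workers.release()
```

With the figures from the text, worst_case_wait_ms(10, 50) gives 500, and the same queue drained by five workers gives 100.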
Fallback Responses
Instead of errors, return degraded responses: a cache serves stale data, defaults replace missing values, or simplified logic skips non-essential steps. Fallbacks should be pre-computed or very fast; a slow fallback defeats the purpose. Test fallbacks regularly, since they rarely execute in normal operation and can break without anyone noticing.
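One way to picture the stale-cache and default-value fallbacks is the sketch below. StaleCacheFallback is a hypothetical name, and catching every Exception is a simplification made for brevity; a real system would more likely catch only bulkhead rejections and timeouts.

```python
class StaleCacheFallback:
    """Serve the last good value, or a default, when the primary call fails."""

    def __init__(self, default):
        self._last_good = {}      # key -> most recent successful value
        self._default = default   # pre-computed value used when nothing is cached

    def call(self, key, fn):
        """Return (value, degraded), where degraded is True for a fallback."""
        try:
            value = fn()
        except Exception:
            # Simplification: any failure triggers the fallback.
            # Both lookups are in-memory, so the fallback path stays fast.
            return self._last_good.get(key, self._default), True
        self._last_good[key] = value
        return value, False
```

Returning the degraded flag alongside the value lets the caller mark the response as stale or record how often the fallback path actually runs.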
Priority-Based Rejection
Not all requests are equal. VIP customers, health checks, and admin operations may need priority access. Implement priority queues or reserved capacity, for example 40 threads for general traffic and 10 threads reserved for priority traffic. Low-priority requests are then rejected first as the pool approaches capacity.
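Reserved capacity can be sketched with a single in-flight counter and two admission limits. PriorityBulkhead is a hypothetical name, and this version is one possible reading of the 40/10 split: priority requests may use any of the 50 slots, while low-priority requests are admitted only while fewer than 40 are in flight. Two separate pools would be an equally valid design.

```python
import threading


class PriorityBulkhead:
    def __init__(self, general=40, reserved=10):
        self._general_limit = general            # ceiling for low-priority work
        self._total_limit = general + reserved   # ceiling for priority work
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self, priority=False):
        """Admit the request if its limit allows; never blocks on capacity."""
        limit = self._total_limit if priority else self._general_limit
        with self._lock:
            if self._in_flight >= limit:
                return False
            self._in_flight += 1
            return True

    def release(self):
        """Call exactly once for every successful try_acquire."""
        with self._lock:
            self._in_flight -= 1
```

A caller would wrap the protected work in try/finally so that release runs after every successful try_acquire, and would treat a False result the same way as any other bulkhead rejection.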