Bulkhead Boundaries: Per-Service, Per-Endpoint, and Per-Tenant Isolation
Per-Service Bulkheads
The most common pattern is one resource pool per downstream service: the payment service gets one pool, the inventory service gets another, and the user service gets a third. This is simple to implement and to reason about. It works well when the calls to a given service have similar latency characteristics and failure modes.
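As a rough illustration, a per-service bulkhead can be sketched with one bounded thread pool per downstream service. The service names, the pool sizes, and the `call_service` helper below are assumptions made for the example, not values from any particular system.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool sizes, chosen only for illustration.
POOL_SIZES = {"payment": 20, "inventory": 10, "user": 10}

# One bounded executor per downstream service. A slow service can tie up
# only the worker threads in its own pool.
pools = {
    name: ThreadPoolExecutor(max_workers=size, thread_name_prefix=f"{name}-bulkhead")
    for name, size in POOL_SIZES.items()
}


def call_service(service, fn, *args, **kwargs):
    """Submit fn to the pool dedicated to `service` and return its Future."""
    return pools[service].submit(fn, *args, **kwargs)
```

Note that `ThreadPoolExecutor` queues excess work without bound rather than rejecting it, so a production bulkhead would usually pair each pool with a queue limit or a timeout on the returned future.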
Per-Endpoint Bulkheads
Finer-grained: separate pools for different endpoints on the same service. Suppose the payment service's /charge endpoint is critical while /refund is less time-sensitive. Giving /charge its own pool means slow refunds cannot starve charging. This is useful when one service has endpoints with vastly different latency or criticality.
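One way to sketch this is to key the bulkhead on the (service, endpoint) pair rather than on the service alone. The example below uses semaphores that fail fast when full; the `Bulkhead` class, the `BulkheadFull` exception, and the limits of 20 and 5 are hypothetical.

```python
import threading
from contextlib import contextmanager


class BulkheadFull(Exception):
    """Raised when a bulkhead has no free slots."""


class Bulkhead:
    def __init__(self, name, max_concurrent):
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    @contextmanager
    def slot(self):
        # Fail fast instead of waiting, so callers never pile up behind
        # a slow endpoint.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFull(self.name)
        try:
            yield
        finally:
            self._slots.release()


# Hypothetical limits: /charge gets more capacity than /refund.
ENDPOINT_BULKHEADS = {
    ("payment", "/charge"): Bulkhead("payment/charge", 20),
    ("payment", "/refund"): Bulkhead("payment/refund", 5),
}


def call_endpoint(service, endpoint, fn):
    """Run fn inside the bulkhead for this (service, endpoint) pair."""
    with ENDPOINT_BULKHEADS[(service, endpoint)].slot():
        return fn()
```

With this layout, five in-flight refunds cause further refunds to be rejected, while calls to /charge are still admitted.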
Per-Tenant Bulkheads
In multi-tenant systems, isolate tenants from each other so that one tenant running expensive queries cannot exhaust resources for the others. Implement this with a separate thread pool per tenant, or with semaphores that limit concurrent requests per tenant. This is essential for SaaS platforms where tenant behavior varies widely.
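The semaphore variant might look like the sketch below, which creates one semaphore per tenant on first use. The `TenantBulkheads` class and its `try_run` method are illustrative names, and the sketch assumes every tenant gets the same limit.

```python
import threading


class TenantBulkheads:
    """Caps concurrent requests per tenant; tenants are registered lazily."""

    def __init__(self, per_tenant_limit):
        self._limit = per_tenant_limit
        self._lock = threading.Lock()
        self._semaphores = {}

    def _semaphore_for(self, tenant_id):
        # The lock guards only the lookup table, not the request itself.
        with self._lock:
            sem = self._semaphores.get(tenant_id)
            if sem is None:
                sem = threading.BoundedSemaphore(self._limit)
                self._semaphores[tenant_id] = sem
            return sem

    def try_run(self, tenant_id, fn):
        """Run fn if the tenant has a free slot; return (accepted, result)."""
        sem = self._semaphore_for(tenant_id)
        if not sem.acquire(blocking=False):
            return False, None  # tenant is at its limit; others are unaffected
        try:
            return True, fn()
        finally:
            sem.release()
```

The table of semaphores grows with the number of distinct tenants seen; a long-running service would likely need some eviction policy, which this sketch omits.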
Hierarchical Bulkheads
Combine levels: for example, a global limit of 200 threads, per-service limits of 50 each, and per-tenant limits of 10 each. A request must pass every level to be admitted. This prevents any single dimension from dominating: neither one service nor one tenant can exhaust the system.
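A minimal sketch of the layered check follows, using the 200/50/10 limits as defaults. The `HierarchicalBulkhead` class and `Rejected` exception are hypothetical, and the sketch assumes the tenant limit applies to a tenant across all services rather than per (service, tenant) pair.

```python
import threading
from contextlib import contextmanager


class Rejected(Exception):
    """Raised with the name of the level that was at capacity."""


class HierarchicalBulkhead:
    def __init__(self, global_limit=200, service_limit=50, tenant_limit=10):
        self._global = threading.BoundedSemaphore(global_limit)
        self._service_limit = service_limit
        self._tenant_limit = tenant_limit
        self._services = {}
        self._tenants = {}
        self._lock = threading.Lock()

    def _get(self, table, key, limit):
        # Create the semaphore for this key on first use.
        with self._lock:
            if key not in table:
                table[key] = threading.BoundedSemaphore(limit)
            return table[key]

    @contextmanager
    def admit(self, service, tenant):
        # Check the narrowest level first so a request that will be rejected
        # takes as few shared slots as possible along the way.
        levels = [
            ("tenant", self._get(self._tenants, tenant, self._tenant_limit)),
            ("service", self._get(self._services, service, self._service_limit)),
            ("global", self._global),
        ]
        acquired = []
        try:
            for name, sem in levels:
                if not sem.acquire(blocking=False):
                    raise Rejected(name)
                acquired.append(sem)
            yield
        finally:
            # Release whatever was acquired, on success or on rejection.
            for sem in reversed(acquired):
                sem.release()
```

A caller would wrap each downstream request in `with bulkhead.admit("payment", tenant_id): ...` and treat `Rejected` as a signal to shed load. The limits do not need to sum to the global cap: with five services at 50 each, the global limit of 200 is what bounds the total.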
Choosing Boundaries
Start with per-service bulkheads. Add per-endpoint bulkheads when you observe one endpoint affecting others on the same service. Add per-tenant bulkheads when noisy-neighbor complaints arise. More boundaries mean more configuration complexity, so add them in response to observed problems rather than speculatively.