Cascading Timeouts: Coordinating Limits Across Service Layers
The Timeout Ordering Problem
Service A calls B, which calls C. If A's timeout is 5s, B's timeout is 10s, and C's timeout is 15s, the outer caller (A) gives up first. B and C keep working on a request that A has already abandoned. This wastes resources and can cause inconsistencies if side effects complete after the caller has given up.
Correct Timeout Ordering
Inner services should time out before outer services: C timeout < B timeout < A timeout. If C is 3s, B is 5s, and A is 8s, failures propagate correctly. C fails first, B receives the error and can handle it, and A receives B's response while it is still waiting. No work is wasted on abandoned requests. The gap between adjacent layers should be large enough to cover the caller's own processing and the network round trip.
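The ordering rule can be checked mechanically, for example in a configuration test. The following is a minimal sketch; the function name misordered_layers and the list-of-pairs input format are assumptions made for illustration, not part of any particular framework.

```python
from typing import List, Sequence, Tuple


def misordered_layers(timeouts: Sequence[Tuple[str, float]]) -> List[Tuple[str, str]]:
    """Return (caller, callee) pairs whose callee timeout is not strictly shorter.

    `timeouts` lists the layers from outermost to innermost, in seconds.
    """
    bad = []
    # Compare each layer with the layer it calls directly.
    for (caller, caller_t), (callee, callee_t) in zip(timeouts, timeouts[1:]):
        if callee_t >= caller_t:
            bad.append((caller, callee))
    return bad


# The broken configuration from the first section.
print(misordered_layers([("A", 5), ("B", 10), ("C", 15)]))  # [('A', 'B'), ('B', 'C')]
# The corrected configuration.
print(misordered_layers([("A", 8), ("B", 5), ("C", 3)]))    # []
```

A check like this only verifies ordering; it does not tell you whether the margins between layers are large enough.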
Budget-Based Timeouts
Instead of fixed timeouts, propagate the remaining time budget. A starts with a 10s budget. After 2s of processing, A passes the remaining 8s to B. B uses 1s and passes the remaining 7s to C. Each service knows exactly how much time it has. This coordinates timeouts automatically, without manual tuning per layer.
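The arithmetic above can be sketched with a small helper that tracks the time left. Budget and FakeClock are hypothetical names introduced for this example; the fake clock stands in for real elapsed time so that the numbers match the text.

```python
import time


class Budget:
    """Tracks how much of a request's time budget is left (hypothetical helper)."""

    def __init__(self, total_s: float, clock=time.monotonic):
        self._clock = clock
        # A monotonic clock is the default so wall-clock jumps do not affect the budget.
        self._expires_at = clock() + total_s

    def remaining(self) -> float:
        """Seconds left, never negative."""
        return max(0.0, self._expires_at - self._clock())


class FakeClock:
    """Manually advanced clock, used here only to make the example deterministic."""

    def __init__(self):
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


clock = FakeClock()
a_budget = Budget(10.0, clock=clock)   # A starts with 10s
clock.advance(2.0)                     # A spends 2s processing
b_budget = Budget(a_budget.remaining(), clock=clock)
print(b_budget.remaining())            # 8.0, passed to B
clock.advance(1.0)                     # B spends 1s processing
c_budget = Budget(b_budget.remaining(), clock=clock)
print(c_budget.remaining())            # 7.0, passed to C
```

In a real service the default time.monotonic clock would be used, and the value from remaining() would be sent to the next service and also used as the timeout on the outgoing call.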
Deadline Propagation
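Pass absolute deadlines rather than relative timeouts. A sets a deadline of now + 10s in a header. Each downstream service checks whether the deadline has passed: if so, it fails immediately; if not, it proceeds but respects the deadline. Unlike a relative budget, an absolute deadline does not drift when time spent in queues or on the network goes unaccounted for, and it is clearer to debug because every service sees the same timestamp. The trade-off is that it relies on reasonably synchronized clocks across hosts: clock skew shifts the effective deadline, which relative budgets avoid.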
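The per-service check might look like the sketch below. It assumes the deadline is carried as a Unix timestamp in seconds and that the hosts' wall clocks are reasonably synchronized; DeadlineExceeded and remaining_or_fail are hypothetical names.

```python
import time
from typing import Optional


class DeadlineExceeded(Exception):
    """Raised when the request deadline has already passed."""


def remaining_or_fail(deadline_unix_s: float, now: Optional[float] = None) -> float:
    """Return the seconds left until the deadline, or raise if it has passed.

    `now` can be injected for testing; by default the wall clock is used,
    because the deadline is an absolute wall-clock timestamp.
    """
    if now is None:
        now = time.time()
    remaining = deadline_unix_s - now
    if remaining <= 0:
        raise DeadlineExceeded(f"deadline passed {-remaining:.3f}s ago")
    return remaining


# A set the deadline to t=1010; B checks at t=1002 and has 8s left.
print(remaining_or_fail(1010.0, now=1002.0))  # 8.0

# C checks at t=1011, after the deadline, and fails immediately.
try:
    remaining_or_fail(1010.0, now=1011.0)
except DeadlineExceeded as exc:
    print("rejected:", exc)
```

The returned value can be used directly as the local timeout for the work the service is about to do, while the original deadline is forwarded unchanged.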
Implementing Budget Headers
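Common patterns: gRPC sends the caller's deadline as a relative timeout in the grpc-timeout header whenever the client sets one, and some gRPC implementations carry it over to downstream calls through the request context. HTTP services can use custom headers such as X-Request-Deadline or X-Timeout-Budget-Ms. Middleware extracts the header, sets the local timeout, and propagates the reduced budget to downstream calls.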
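The middleware steps can be sketched as two framework-independent functions. This is an illustration under stated assumptions: headers are a plain dict with exact-case keys (real HTTP header names are case-insensitive), and DEFAULT_BUDGET_MS, the 50 ms safety margin, and both function names are hypothetical choices rather than any standard.

```python
BUDGET_HEADER = "X-Timeout-Budget-Ms"  # header name taken from the text above
DEFAULT_BUDGET_MS = 1000               # assumed fallback when the header is absent or invalid


def read_budget_ms(headers: dict) -> int:
    """Extract the incoming budget in milliseconds, falling back to the default."""
    raw = headers.get(BUDGET_HEADER)
    try:
        return max(0, int(raw))
    except (TypeError, ValueError):
        # Missing header (None) or a non-numeric value.
        return DEFAULT_BUDGET_MS


def downstream_headers(budget_ms: int, started_at: float, now: float,
                       safety_margin_ms: int = 50) -> dict:
    """Build the header for a downstream call with the reduced budget.

    `started_at` and `now` are readings in seconds from the same clock.
    The margin reserves time for this service to handle the response.
    """
    elapsed_ms = int((now - started_at) * 1000)
    left_ms = budget_ms - elapsed_ms - safety_margin_ms
    if left_ms <= 0:
        raise TimeoutError("no budget left for downstream call")
    return {BUDGET_HEADER: str(left_ms)}


budget = read_budget_ms({"X-Timeout-Budget-Ms": "8000"})
print(budget)                                            # 8000
print(downstream_headers(budget, started_at=0.0, now=1.0))
# {'X-Timeout-Budget-Ms': '6950'}
```

In a real service the two clock readings would come from time.monotonic(), and the reduced value would also be applied as the timeout on the outgoing HTTP client call.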