
Gateway Aggregation: Scatter-Gather Pattern

Gateway aggregation (also called API composition or scatter-gather) executes multiple backend requests in parallel and synthesizes a single unified response to the client. This pattern dramatically reduces mobile round trips and radio wakeups, but it increases edge fan-out and requires careful timeout and partial-failure handling.

Consider a mobile app home screen that needs user profile, recent orders, and personalized recommendations. Without aggregation, that is three sequential round trips of 100 to 150 milliseconds each over cellular, totaling 300 to 450 milliseconds plus render time. With aggregation, the gateway fires all three requests in parallel with individual 50 millisecond timeouts and a 100 millisecond total budget. If recommendations are slow or fail, the gateway returns the response with that section omitted or filled from cached fallback data. The client sees 100 to 120 milliseconds total instead of 450 milliseconds.

The failure mode is fan-out amplification. The aggregate waits for the slowest of its N parallel calls, so the probability of hitting at least one slow tail grows with N: if each backend meets its p99 independently, the chance that all N calls stay under their p99 is only 0.99^N. With 5 services, roughly 5 percent of aggregate requests exceed the single-service p99, pushing the aggregate p99 deep into each backend's tail, often several times the individual p99. Mitigation strategies include per-upstream circuit breakers (open after 50 percent failures over 20 requests), hedged requests for tail-heavy dependencies (send a duplicate request after 50 milliseconds if the first hasn't returned), and always supporting partial responses.

A critical anti-pattern is putting too much business logic and orchestration in the gateway. The edge should do shallow aggregation (fetch and merge), not deep workflow coordination. Complex orchestration belongs in a dedicated backend orchestrator service with proper testing, versioning, and domain ownership.
Large media and commerce companies run mobile BFFs that aggregate 3 to 8 backend calls per request to fit within 200 to 300 millisecond p95 budgets, keeping gateway compute time under 5 to 10 milliseconds.
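The home-screen flow above can be sketched with Python's asyncio: fire every upstream call in parallel, give each its own budget, and degrade to a partial response instead of failing the whole request. The fetcher names, payloads, and the 50 ms timeouts are illustrative assumptions, not a real gateway API.

```python
import asyncio

# Hypothetical backend fetchers; in a real gateway these would be HTTP calls.
async def fetch_profile():
    await asyncio.sleep(0.01)
    return {"name": "Ada"}

async def fetch_orders():
    await asyncio.sleep(0.02)
    return [{"id": 1}, {"id": 2}]

async def fetch_recommendations():
    await asyncio.sleep(0.2)  # slow dependency: will blow its budget
    return [{"sku": "A"}]

async def call_with_budget(coro, timeout_s, fallback=None):
    """Run one upstream call under its own timeout; degrade, don't fail."""
    try:
        return await asyncio.wait_for(coro, timeout_s)
    except Exception:  # timeout or upstream error -> fallback
        return fallback

async def aggregate_home_screen():
    # Fire all three in parallel, each with an individual 50 ms budget.
    profile, orders, recs = await asyncio.gather(
        call_with_budget(fetch_profile(), 0.05),
        call_with_budget(fetch_orders(), 0.05),
        call_with_budget(fetch_recommendations(), 0.05),  # times out
    )
    # Partial response: the optional section is flagged, not a hard error.
    return {
        "profile": profile,
        "orders": orders,
        "recommendations": recs,  # None signals "section unavailable"
        "partial": recs is None,
    }

result = asyncio.run(aggregate_home_screen())
```

Note that the required/optional split lives in the merge step: a missing optional section only flips the `partial` flag, while a real gateway would fail the request if a required section (like profile) came back empty.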
💡 Key Takeaways
Reduces mobile latency from 300 to 450 milliseconds (3 sequential calls) to 100 to 120 milliseconds (parallel aggregation) by avoiding repeated radio wakeup penalties
Fan-out amplification problem: the aggregate waits on the slowest of N parallel calls, so with 5 services each meeting a 10ms p99, only 0.99^5 ≈ 95 percent of aggregate requests stay under 10ms, and the aggregate p99 lands deep in each backend's tail
Per-upstream circuit breakers open after a 50 percent failure rate over 20 requests, with an exponential-backoff cool-down period to prevent cascading failures
Always support partial responses with missing sections flagged: never block entire response on optional data like recommendations or social features
Hedged requests send a duplicate to a backup instance after 50 milliseconds if the primary hasn't returned, critical for tail-heavy dependencies
Anti-pattern: putting business orchestration logic at the edge makes testing and versioning brittle; keep aggregation shallow (fetch and merge only)
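The fan-out math in the takeaways can be checked directly: if each of N parallel calls independently stays under its own p99 with probability 0.99, the whole aggregate does so only with probability 0.99^N, so the fraction of slow aggregate requests grows with every service added.

```python
# Probability that ALL N parallel backend calls finish under their own p99.
# The aggregate waits for the slowest call, so one straggler delays everything.
def p_all_under_p99(n: int, p: float = 0.99) -> float:
    return p ** n

print(round(p_all_under_p99(5), 3))  # 0.951 -> ~4.9% of aggregate
                                     # requests exceed the per-service p99
```

In other words, what was a 1-in-100 slow request per service becomes roughly a 1-in-20 slow request for the 5-way aggregate, which is why the aggregate p99 sits much deeper in each backend's latency tail.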
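The hedged-request takeaway can also be sketched in a few lines: wait a short hedge interval for the primary, then race a duplicate against it and take whichever answers first. The delays and instance names are illustrative assumptions.

```python
import asyncio

async def hedged(primary, backup, hedge_after_s=0.05):
    """Send a duplicate request if the primary hasn't returned within
    the hedge interval; take the first answer and cancel the loser."""
    t1 = asyncio.ensure_future(primary())
    done, _ = await asyncio.wait({t1}, timeout=hedge_after_s)
    if done:
        return t1.result()  # primary was fast enough; no hedge sent
    t2 = asyncio.ensure_future(backup())
    done, pending = await asyncio.wait(
        {t1, t2}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # avoid doing the work twice
    return done.pop().result()

# Demo: a tail-latency-afflicted primary and a fast backup (both hypothetical).
async def slow_primary():
    await asyncio.sleep(0.2)
    return "primary"

async def fast_backup():
    await asyncio.sleep(0.01)
    return "backup"

result = asyncio.run(hedged(slow_primary, fast_backup))
```

Hedging trades a small amount of duplicate load (only requests slower than the hedge interval are duplicated) for a large cut in tail latency, so it should be reserved for idempotent reads against tail-heavy dependencies.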
📌 Examples
Mobile BFF aggregates user profile, order history (last 10), and personalized recommendations in parallel with 100ms total timeout returning partial response if recommendations fail
E-commerce product page gateway fetches item details (required, 30ms timeout), inventory (required, 30ms), reviews (optional, 50ms), and similar items (optional, 50ms) merging all into one payload
Streaming app home screen aggregates continue watching (cache backed, 20ms), trending (cached hourly, 30ms), and personalized rows (ML service, 80ms with stale fallback)