API Gateway Patterns
What is an API Gateway and Why Use One?
An API Gateway is a Layer 7 reverse proxy that serves as the single entry point for all client traffic entering your microservices architecture. It sits at the edge, handling north-south traffic (clients to services), and centralizes cross-cutting concerns that would otherwise be duplicated across every backend service.
The gateway solves several practical problems at scale. First, it dramatically reduces client chattiness by aggregating multiple backend calls into a single request, which is critical for mobile clients where each round trip consumes battery and incurs a 100 to 300 millisecond radio wakeup. Second, it shields clients from backend topology changes: when you need to split a monolith, version an API, or migrate protocols, clients remain unaffected. Third, it centralizes authentication, rate limiting, TLS termination, and logging in one place instead of reimplementing them in 50 different services.
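To make the aggregation idea concrete, here is a minimal sketch of a gateway-style endpoint, assuming Node 18+ with its built-in fetch; the internal service hostnames, paths, and response shapes are placeholder assumptions, not any specific product's API.

```typescript
// Minimal aggregation endpoint sketch (Node 18+, no external dependencies).
// All backend URLs and response shapes are illustrative placeholders.
import { createServer } from "node:http";

const BACKENDS = {
  profile: "http://user-service.internal/users/",        // hypothetical
  orders: "http://order-service.internal/orders?user=",  // hypothetical
  recs: "http://rec-service.internal/recommendations/",  // hypothetical
};

createServer(async (req, res) => {
  // e.g. GET /mobile/home/123 -> aggregate three backend calls for user 123
  const match = req.url?.match(/^\/mobile\/home\/(\w+)$/);
  if (!match) {
    res.writeHead(404).end();
    return;
  }
  const userId = match[1];

  try {
    // One client round trip; the gateway fans out in parallel inside the datacenter.
    const [profile, orders, recs] = await Promise.all([
      fetch(BACKENDS.profile + userId).then((r) => r.json()),
      fetch(BACKENDS.orders + userId).then((r) => r.json()),
      fetch(BACKENDS.recs + userId).then((r) => r.json()),
    ]);
    res.writeHead(200, { "content-type": "application/json" });
    res.end(JSON.stringify({ profile, orders, recs }));
  } catch {
    // Keep failures at the edge; the client sees one status code, not three.
    res.writeHead(502).end();
  }
}).listen(8080);
```

The client pays one radio round trip; the fan-out happens over low-latency in-datacenter links.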
The tradeoff is an extra network hop and a potential single point of failure. Netflix's Zuul 2 processes over 50,000 requests per second per cluster, targeting single digit millisecond overhead at the 50th percentile. AWS API Gateway enforces regional limits of around 10,000 requests per second by default, with typical proxy overhead under 5 milliseconds. The key is keeping the gateway thin on business logic and thick on policy.
A common misconception is confusing an API Gateway with a service mesh. The gateway handles external client traffic and client concerns like aggregation and protocol translation. A service mesh like Istio handles internal service to service communication. They complement each other rather than overlap.
💡 Key Takeaways
•Centralizes authentication, rate limiting, TLS termination, and request validation to avoid duplicating these in every microservice
•Reduces mobile round trips by aggregating 3 to 8 backend calls into one request, critical for 300 millisecond cellular latency budgets
•Netflix Zuul 2 handles 50,000+ requests per second per cluster with event driven I/O and single digit millisecond latency targets
•AWS API Gateway regional defaults around 10,000 requests per second with proxy integrations adding under 5 milliseconds overhead
•Adds extra network hop (typically 1 to 5 milliseconds in region) and creates potential single point of failure requiring multi availability zone deployment
•Distinct from service mesh: gateway handles north-south external traffic while mesh handles east-west internal service to service communication
📌 Examples
Mobile app makes one gateway call that aggregates user profile, order history, and recommendations instead of three separate round trips, saving 200+ milliseconds
E-commerce site uses gateway to enforce 100 requests per second per API key with a 300 token burst capacity using the token bucket algorithm (see the sketch after these examples)
Streaming service runs gateway in multiple regions with health aware DNS, each cluster auto scaling from 10 to 200 instances based on request rate
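As a rough sketch of the rate limiting example above, the following per-API-key token bucket refills at 100 tokens per second up to a 300 token capacity; the class, its parameters, and the in-memory Map are illustrative assumptions rather than any particular gateway's implementation.

```typescript
// Per-API-key token bucket: ~100 requests/second sustained, bursts up to 300.
// Class names and values are illustrative, not tied to any specific gateway.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(private ratePerSec: number, private capacity: number) {
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefill = Date.now();
  }

  tryConsume(): boolean {
    const now = Date.now();
    // Refill proportionally to elapsed time, capped at capacity.
    this.tokens = Math.min(
      this.capacity,
      this.tokens + ((now - this.lastRefill) / 1000) * this.ratePerSec
    );
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true; // request allowed
    }
    return false; // caller should respond 429 Too Many Requests
  }
}

// One bucket per API key, created lazily.
const buckets = new Map<string, TokenBucket>();

function allowRequest(apiKey: string): boolean {
  let bucket = buckets.get(apiKey);
  if (!bucket) {
    bucket = new TokenBucket(100, 300); // 100 rps sustained, 300 burst
    buckets.set(apiKey, bucket);
  }
  return bucket.tryConsume();
}
```

In a multi-instance gateway the bucket state would typically live in a shared store such as Redis rather than in process memory, otherwise each instance enforces its own independent limit.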