Load Balancing • L4 vs L7 Load Balancing
What is Layer 7 (L7) Load Balancing?
Layer 7 load balancing operates at the application layer, requiring full parsing of protocol semantics such as HTTP request lines, headers, cookies, and Remote Procedure Call (RPC) methods. L7 load balancers are almost always full proxies that terminate Transport Layer Security (TLS) to inspect content, then optionally re-encrypt connections to upstream servers. This deep inspection enables sophisticated features impossible at L4: content-based routing by host/path/header, rate limiting, retries with jittered backoff, request rewriting, compression, caching, and security controls including Web Application Firewall (WAF) and bot filtering.
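A minimal sketch of that content-based routing in Go (hostnames, upstream addresses, and certificate paths are all illustrative assumptions): the proxy terminates TLS, parses the request, and picks an upstream pool from the Host header and path.

```go
package main

// Sketch of L7 content-based routing: a TLS-terminating reverse proxy
// that picks an upstream pool from the parsed Host header and URL path.
// Hostnames, upstream addresses, and cert/key paths are hypothetical.

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

func upstreamFor(r *http.Request) *url.URL {
	switch {
	case r.Host == "api.example.com" && strings.HasPrefix(r.URL.Path, "/v2/"):
		return &url.URL{Scheme: "http", Host: "10.0.2.10:8080"} // v2 API pool
	case r.Host == "api.example.com":
		return &url.URL{Scheme: "http", Host: "10.0.1.10:8080"} // default API pool
	default:
		return &url.URL{Scheme: "http", Host: "10.0.0.10:8080"} // static content pool
	}
}

func main() {
	proxy := &httputil.ReverseProxy{
		Director: func(r *http.Request) {
			// The routing decision needs parsed HTTP semantics:
			// exactly what an L4 balancer cannot see.
			u := upstreamFor(r)
			r.URL.Scheme, r.URL.Host = u.Scheme, u.Host
		},
	}
	// TLS terminates here; the upstream hop above is plaintext
	// (re-encryption would use an https upstream scheme instead).
	log.Fatal(http.ListenAndServeTLS(":443", "cert.pem", "key.pem", proxy))
}
```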
The cost of this intelligence is additional latency and CPU consumption. Typical L7 load balancers add 0.5 to 3 milliseconds per request for parsing, policy evaluation, and proxying, with 99th percentile (p99) latency dependent on buffering and policy complexity. TLS termination at 1 to 5 Gbps per core is common on modern CPUs using Elliptic Curve Diffie-Hellman Ephemeral key exchange with AES-GCM (ECDHE/AES-GCM). Service mesh sidecars like Envoy add approximately 0.3 to 1.5 milliseconds of median (p50) overhead per hop for HTTP/1.1 to HTTP/2 translation, routing, retries, and metrics collection.
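A back-of-envelope sizing sketch using the ranges above; every constant here is an assumed illustration, not a benchmark:

```go
package main

import "fmt"

func main() {
	// Back-of-envelope sizing from the rough figures quoted above.
	const tlsGbpsPerCore = 2.0  // assumed mid-range ECDHE/AES-GCM throughput per core
	const targetGbps = 40.0     // hypothetical aggregate TLS traffic
	const sidecarMsPerHop = 0.8 // assumed p50 sidecar overhead per hop
	const hops = 3.0            // e.g., ingress plus two service-to-service hops

	fmt.Printf("cores for TLS termination: ~%.0f\n", targetGbps/tlsGbpsPerCore) // ~20 cores
	fmt.Printf("added p50 sidecar latency: ~%.1f ms\n", sidecarMsPerHop*hops)   // ~2.4 ms
}
```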
AWS Application Load Balancer (ALB) performs host and path routing with TLS termination, adding low single-digit-millisecond latency. Azure Front Door provides global anycast L7 routing with sub-2-millisecond p50 added latency intra-region. These proxies manage separate client and upstream connection pools, enabling HTTP/2 or HTTP/3 multiplexing and backpressure management that improves backend efficiency, reducing connection counts by an order of magnitude.
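A sketch of the upstream-pooling side in Go (pool sizes and the backend URL are assumptions): one shared transport lets arbitrarily many inbound connections reuse a small set of keep-alive or multiplexed HTTP/2 upstream connections.

```go
package main

// Sketch of upstream connection pooling: a single shared http.Transport
// reuses a small set of warm (or HTTP/2-multiplexed) upstream connections,
// however many client connections fan into it. Pool sizes are illustrative.

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	transport := &http.Transport{
		MaxIdleConns:        100,              // total idle upstream connections kept warm
		MaxIdleConnsPerHost: 10,               // per-backend pool size
		IdleConnTimeout:     90 * time.Second, // recycle idle connections
		ForceAttemptHTTP2:   true,             // multiplex many requests per connection
	}
	client := &http.Client{Transport: transport, Timeout: 5 * time.Second}

	// Many logical requests share the pooled upstream connections.
	for i := 0; i < 3; i++ {
		resp, err := client.Get("https://backend.internal.example/healthz") // hypothetical upstream
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		resp.Body.Close()
		fmt.Println("status:", resp.Status)
	}
}
```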
L7 load balancing excels in microservices architectures where content-based routing, zero-downtime deployments through canary releases (1 to 10 percent traffic splits), authentication, and request-aware resiliency are critical. The tradeoff is increased complexity and new failure modes, including retry storms, head-of-line blocking in multiplexed protocols, and TLS certificate management issues that can trigger global outages.
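A sketch of the retry behavior such proxies implement internally, assuming a hypothetical endpoint: jittered exponential backoff spreads retries out in time, which is precisely the mitigation for the retry storms mentioned above.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"net/http"
	"time"
)

// getWithRetry retries failed requests with full-jitter exponential backoff.
// Jitter decorrelates clients so synchronized retry waves (retry storms)
// are less likely during a backend brownout.
func getWithRetry(url string, maxAttempts int) (*http.Response, error) {
	base := 50 * time.Millisecond
	for attempt := 0; attempt < maxAttempts; attempt++ {
		resp, err := http.Get(url)
		if err == nil && resp.StatusCode < 500 {
			return resp, nil // success or non-retryable client error
		}
		if resp != nil {
			resp.Body.Close()
		}
		// Full jitter: sleep a uniform random duration up to an
		// exponentially growing cap (50ms, 100ms, 200ms, ...).
		maxSleep := base << attempt
		time.Sleep(time.Duration(rand.Int63n(int64(maxSleep))))
	}
	return nil, errors.New("all retry attempts failed")
}

func main() {
	// Hypothetical endpoint; a real deployment would also bound total
	// retries fleet-wide (a retry budget) to avoid amplifying load.
	resp, err := getWithRetry("https://api.example.com/v1/health", 3)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```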
💡 Key Takeaways
•Full proxy that terminates TLS and parses application protocols (HTTP, gRPC) to enable content-based routing by host, path, headers, or cookies
•Adds 0.5 to 3 milliseconds of typical request latency for parsing and policy evaluation; service mesh sidecars add 0.3 to 1.5 milliseconds p50 per hop
•TLS termination at 1 to 5 Gbps per CPU core; AWS ALB and Azure Front Door add sub-2-millisecond p50 latency intra-region
•Enables zero-downtime deployments via canary releases (1 to 10 percent traffic splits; see the sketch after this list), retries with jittered backoff, circuit breaking, and outlier detection
•Reduces backend connection counts by an order of magnitude through connection pooling and HTTP/2 multiplexing
•New failure modes include retry storms, head-of-line blocking in multiplexed streams, TLS certificate expiration causing global outages, and buffering memory spikes
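A sketch of the per-request weighted choice behind a canary split (pool names and the 5 percent figure are illustrative):

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickPool makes the weighted per-request choice behind a canary traffic
// split: canaryPercent of requests go to the canary pool, the rest to stable.
func pickPool(canaryPercent int) string {
	if rand.Intn(100) < canaryPercent {
		return "canary"
	}
	return "stable"
}

func main() {
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pickPool(5)]++ // a 5% split, within the 1-10% canary range above
	}
	fmt.Println(counts) // roughly map[canary:500 stable:9500]
}
```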
📌 Examples
AWS Application Load Balancer performs host and path routing with TLS termination, adding low single-digit-millisecond latency, with backend re-encryption optional
Envoy sidecar proxies in service meshes add 0.3 to 1.5 ms p50 overhead for HTTP/1.1 to HTTP/2 translation, routing, retries, and metrics on typical hardware
Azure Front Door provides global L7 HTTP routing at the edge with WAF, TLS termination, and sub-2 ms p50 added latency intra-region