What is Layer 4 (L4) Load Balancing?
tens to hundreds of microseconds per packet.
Why Operating Modes Matter
When an L4 load balancer receives a packet, it must answer a fundamental question: how should it get this packet to the backend server? The answer defines the load balancer's operating mode. This choice cascades through every aspect of the system: memory consumption, maximum throughput, failure handling, and network topology requirements. Understanding these modes is essential because the wrong choice can leave gigabits of capacity unused or create bottlenecks that collapse under load.
Full Proxy Mode (NAT)
In full proxy mode, the load balancer terminates the client connection entirely and opens a separate connection to the backend. It maintains a NAT (Network Address Translation) table tracking every active flow, consuming 200-500 bytes of state per TCP connection. At that rate, a load balancer with 2 GB of memory for the table can track roughly 4 to 10 million simultaneous flows, but once the table fills, new connections are dropped. The critical architectural constraint is that all traffic, both inbound requests and outbound responses, flows through the load balancer. This means the load balancer's egress bandwidth caps your total system throughput.
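The sizing arithmetic and the drop-when-full behavior described above can be sketched as follows. This is a minimal illustration, not how any particular load balancer implements its table: the names `max_tracked_flows`, `NatTable`, and `lookup_or_insert` are hypothetical, and the sketch assumes the full 2 GiB is available for flow state.

```python
from typing import Dict, Optional, Tuple

# Hypothetical flow key: (client_ip, client_port, vip, vip_port).
FlowKey = Tuple[str, int, str, int]

GIB = 1024 ** 3


def max_tracked_flows(memory_bytes: int, bytes_per_flow: int) -> int:
    """Upper bound on simultaneous flows for a given memory budget."""
    return memory_bytes // bytes_per_flow


class NatTable:
    """Toy fixed-capacity flow table (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.flows: Dict[FlowKey, str] = {}

    def lookup_or_insert(self, key: FlowKey, backend: str) -> Optional[str]:
        """Return the backend for a flow, or None if a new flow must be dropped."""
        existing = self.flows.get(key)
        if existing is not None:
            return existing  # established flow keeps its backend
        if len(self.flows) >= self.capacity:
            return None  # table full: the new connection is dropped
        self.flows[key] = backend
        return backend


# Bounds for the 200-500 bytes-per-flow range, assuming 2 GiB of table memory.
flows_at_500_bytes = max_tracked_flows(2 * GIB, 500)
flows_at_200_bytes = max_tracked_flows(2 * GIB, 200)
```

Under these assumptions the table holds between about 4.3 million flows (at 500 bytes each) and about 10.7 million flows (at 200 bytes each). Existing flows keep working when the table is full; only new flows are refused.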
Direct Server Return Mode
DSR (Direct Server Return) solves the egress bottleneck by changing the return path: the load balancer handles only inbound packets while backend servers reply directly to clients, bypassing the load balancer entirely. This architectural shift dramatically increases throughput. Modern high-performance implementations built on DPDK (which polls the network card from user space, bypassing the kernel) or XDP (which runs eBPF programs in the kernel at the driver level, before the full network stack) achieve 10-40 Gbps per load balancer, in large part because they never process response traffic. The trade-off is complexity: servers must be configured with the VIP (Virtual IP, the public address clients connect to) on a special non-routed interface, typically the loopback with ARP replies suppressed, so they accept packets destined for that IP but don't advertise it to the network.
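To make the egress argument concrete, here is a rough back-of-the-envelope model comparing the two modes. It is a simplification under stated assumptions: the function names and the example figures (a 10 Gbps load balancer, responses ten times larger than requests, eight backends with 10 Gbps of egress each) are hypothetical and chosen only for illustration.

```python
def proxy_response_capacity_gbps(lb_egress_gbps: float) -> float:
    """In full proxy mode every response byte leaves through the load balancer."""
    return lb_egress_gbps


def dsr_response_capacity_gbps(
    lb_forward_gbps: float,
    response_to_request_ratio: float,
    backend_count: int,
    backend_egress_gbps: float,
) -> float:
    """In DSR the load balancer forwards only requests.

    Response capacity is bounded either by how much request traffic the load
    balancer can forward (scaled by the response/request size ratio) or by the
    backends' combined egress, whichever is smaller.
    """
    driven_by_requests = lb_forward_gbps * response_to_request_ratio
    backend_limit = backend_count * backend_egress_gbps
    return min(driven_by_requests, backend_limit)


# Illustrative numbers only (assumptions, not measurements).
proxy_gbps = proxy_response_capacity_gbps(10.0)
dsr_gbps = dsr_response_capacity_gbps(10.0, 10.0, 8, 10.0)
```

With these assumed inputs the proxy design tops out at 10 Gbps of responses, while the DSR design is limited by the backends at 80 Gbps. The gap grows with response-heavy workloads, which is why DSR suits traffic such as video or downloads.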
Performance Characteristics
Production L4 load balancers handle millions of new connections per second with sub-millisecond added latency. They excel for non-HTTP protocols (gaming servers, real-time video, DNS), scenarios requiring absolute minimum latency, or extreme packets-per-second workloads where application-layer inspection would waste CPU cycles. The limitation: L4 cannot route based on URL paths, HTTP headers, or cookies because it never reads application data. Health checks are limited to TCP handshake success or TLS negotiation, which confirm only that the port is open and accepting connections, not that the application is functioning correctly.
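A TCP-level health check of the kind described above can be sketched in a few lines. This is a minimal example using Python's standard `socket` module; the function name `tcp_health_check` and the one-second default timeout are illustrative choices, not taken from any specific load balancer.

```python
import socket


def tcp_health_check(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP handshake with host:port completes within timeout.

    A True result means only that something accepted the connection on that
    port. It says nothing about whether the application behind it can
    actually serve requests.
    """
    try:
        # create_connection performs the full three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A backend whose process is deadlocked but whose listening socket is still open would pass this check, which is exactly the blind spot of L4 health checking.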