Load Balancing Algorithms

Static vs Dynamic Load Balancing Algorithms

Load balancing algorithms fall into two fundamental categories that make routing decisions in very different ways. Static algorithms such as round robin, weighted round robin, and hashing make decisions without live feedback from backend servers. They follow predetermined rules: round robin cycles through servers sequentially, hashing maps requests to servers based on IP address or URL, and weighted variants distribute traffic in proportion to assigned capacities. These require minimal coordination and add negligible overhead, making them ideal when servers are homogeneous and request processing times are uniform.

Dynamic algorithms such as least connections, least requests, and least response time react to current server state using live metrics. They track queue lengths, in-flight requests, or recent latency to avoid sending traffic to overloaded servers. The tradeoff is complexity: you need accurate, timely metrics and more control-plane communication. Dynamic algorithms shine with variable workloads and heterogeneous fleets where some servers are slower due to older CPU generations, cold caches, or noisy neighbors.

The choice dramatically affects tail latency under real conditions. Consider a 10-instance cluster handling 15,000 requests per second (RPS), where nine instances support 2,000 RPS each but one degraded instance handles only 1,000 RPS. Round robin sends 1,500 RPS to every server, so the degraded instance's queue grows at 500 RPS, quickly causing timeouts and pushing p99 latency beyond 10 seconds, even though the total cluster capacity of 19,000 RPS exceeds demand. Least requests would detect the growing queue and shift load away, preventing the cascade. AWS Application Load Balancer (ALB) offers both paradigms: round robin for simple workloads and least outstanding requests for variable traffic.
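The degraded-fleet scenario above can be reproduced with a small simulation. This is an illustrative sketch, not a real load balancer: it uses a simplified one-second batched drain model, and the capacities and rates mirror the numbers in the text.

```python
# Illustrative simulation of the degraded-fleet scenario above.
# The one-second batched drain model is a simplifying assumption.

CAPACITY = [2000] * 9 + [1000]   # requests/sec each server can drain
TOTAL_RPS = 15_000
SECONDS = 10

def simulate(choose):
    queues = [0] * len(CAPACITY)            # outstanding requests per server
    for _ in range(SECONDS):
        for i in range(TOTAL_RPS):
            queues[choose(i, queues)] += 1  # route one request
        for s, cap in enumerate(CAPACITY):  # each server drains up to capacity
            queues[s] = max(0, queues[s] - cap)
    return queues

round_robin    = lambda i, q: i % len(q)       # static: ignores server state
least_requests = lambda i, q: q.index(min(q))  # dynamic: shortest queue wins

print(simulate(round_robin))     # degraded server's backlog grows ~500/sec
print(simulate(least_requests))  # backlog stays bounded at a few hundred
```

Under round robin the degraded server ends the 10-second run with a backlog of 5,000 requests while every healthy server sits at zero; under least requests the backlog stabilizes at a few hundred because traffic shifts toward the empty queues.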
Google Maglev uses flow-consistent hashing at Layer 4 (L4) for per-flow stickiness, enabling any load balancer instance to handle any flow without centralized state while maintaining sub-second failover.
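The lookup-table idea behind Maglev can be sketched as follows. This is a simplified rendering of the permutation-based table fill from the Maglev design, not Google's implementation: the table size, hash construction, and backend addresses are all illustrative assumptions.

```python
import hashlib

M = 13  # lookup-table size, a prime (illustrative; real tables are far larger)

def _h(s, salt):
    # Stable 64-bit hash of a string under a salt.
    d = hashlib.sha256(f"{salt}:{s}".encode()).digest()
    return int.from_bytes(d[:8], "big")

def build_table(backends):
    # Each backend derives a permutation of table slots from its name;
    # backends take turns claiming their next preferred empty slot, so
    # every load balancer instance builds the identical table on its own.
    perms = [[(_h(b, "offset") % M + j * (_h(b, "skip") % (M - 1) + 1)) % M
              for j in range(M)] for b in backends]
    table, nexts, filled = [None] * M, [0] * len(backends), 0
    while filled < M:
        for i, b in enumerate(backends):
            while table[perms[i][nexts[i]]] is not None:
                nexts[i] += 1               # skip slots already claimed
            table[perms[i][nexts[i]]] = b
            nexts[i] += 1
            filled += 1
            if filled == M:
                break
    return table

def pick_backend(table, src_ip, dst_ip, src_port, dst_port, proto):
    # Hash the 5-tuple: the same flow maps to the same backend on any instance.
    flow = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}"
    return table[_h(flow, "flow") % M]

table = build_table(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(pick_backend(table, "1.2.3.4", "10.0.0.9", 50123, 443, "tcp"))
```

Because the table is a pure function of the backend set, two instances never need to exchange state: any instance can answer for any flow, which is what makes fast failover possible.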
💡 Key Takeaways
Static algorithms (round robin, hashing) use zero live feedback and add negligible latency overhead, ideal for homogeneous servers with uniform request times
Dynamic algorithms (least connections, least requests) track server state to avoid overloaded instances, reducing tail latency by 40 to 60% in heterogeneous fleets but requiring fresh metrics
Real production failure: Round robin with one degraded server (1,000 RPS capacity vs 2,000 RPS on others) receiving 1,500 RPS causes queue growth at 500 RPS and p99 latency exceeding 10 seconds
AWS ALB supports both round robin and least outstanding requests, letting teams choose based on workload variability and server heterogeneity
Google Maglev achieves millions of packets per second per node using flow-consistent hashing at L4, enabling sub-second failover without centralized coordination
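For reference, switching an ALB target group from round robin to least outstanding requests is a single target-group attribute change via the AWS CLI; the ARN below is a placeholder, and the attribute key should be verified against current AWS documentation.

```shell
# Switch an ALB target group's routing algorithm (ARN is a placeholder).
aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/abc123 \
  --attributes Key=load_balancing.algorithm.type,Value=least_outstanding_requests
```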
📌 Examples
AWS Application Load Balancer defaults to round robin for simple web apps but switches to least outstanding requests for variable API workloads with long tail latencies
Google Maglev uses consistent hashing over 5-tuple flows (source IP, destination IP, source port, destination port, protocol) so any Maglev instance can handle any flow, achieving software load balancing at 10 to 40 Gbps per node
Azure Front Door combines anycast routing to nearest point of presence (POP) with latency based backend selection, reducing p95 latency by 20 to 40% for users far from origin regions