What is Global Load Balancing?
The Three-Layer Architecture
A global load balancing (GLB) architecture typically has three layers. At the top sits a global traffic router, implemented either as DNS-based GSLB (Global Server Load Balancing) or as Anycast edge proxies. Below that, regional load balancers distribute traffic within each geographic region. At the bottom, application backends serve the actual requests. The global routing engine weighs several signals: user geography, measured network latency, real-time regional health and capacity, data residency requirements, and operational cost. This layering separates global routing decisions from regional load distribution.
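One way the global routing decision could be sketched is as a filter followed by a ranking. The example below is a minimal illustration, not a description of any particular GSLB product: the `Region` fields, the `pick_region` function, the 85% load ceiling, and the 10 ms latency tolerance are all hypothetical choices made for the example. Health, capacity, and data residency act as hard filters; latency and cost rank whatever survives.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Region:
    """Hypothetical snapshot of one region as seen by the global router."""
    name: str
    healthy: bool            # result of the latest health check
    load: float              # fraction of capacity in use, 0.0 to 1.0
    latency_ms: float        # measured RTT from the user's vantage point
    cost_per_request: float  # relative operational cost
    jurisdiction: str        # e.g. "US" or "EU", for data residency


def pick_region(
    regions: Iterable[Region],
    required_jurisdiction: Optional[str] = None,
    max_load: float = 0.85,              # assumed capacity ceiling
    latency_tolerance_ms: float = 10.0,  # assumed "close enough" window
) -> Optional[Region]:
    """Return the region to route to, or None if nothing is eligible."""
    # Hard constraints: health, spare capacity, data residency.
    candidates = [
        r for r in regions
        if r.healthy
        and r.load < max_load
        and (required_jurisdiction is None
             or r.jurisdiction == required_jurisdiction)
    ]
    if not candidates:
        return None

    # Soft preferences: stay within a small window of the best latency,
    # then prefer the cheaper region, breaking ties on latency.
    best_latency = min(r.latency_ms for r in candidates)
    near_best = [
        r for r in candidates
        if r.latency_ms - best_latency <= latency_tolerance_ms
    ]
    return min(near_best, key=lambda r: (r.cost_per_request, r.latency_ms))
```

Treating residency as a filter rather than a weight reflects that it is usually a legal requirement, while latency and cost are trade-offs. A production router would also need hysteresis so that small metric changes do not cause traffic to flap between regions.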
The Physics Constraint
The hard constraint GLB must respect is physics. Light travels through fiber at roughly 200,000 km/s, which sets a floor on cross-continental latency. Typical round-trip times (RTT) are: US East to US West, 60-80ms; US East to Europe, 70-100ms; US to India, 200-300ms; US to Australia, 180-250ms. No amount of clever routing can overcome these limits for synchronous operations. The goal of GLB is to route each user to the nearest region that can serve them, keeping this unavoidable latency as small as possible.
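The floor implied by the speed of light in fiber is simple arithmetic: a round trip covers the path twice. The sketch below assumes the 200,000 km/s figure quoted above; the function name and the 4,000 km path length are illustrative, not measured fiber routes.

```python
FIBER_SPEED_KM_PER_S = 200_000  # approximate speed of light in fiber


def rtt_floor_ms(path_km: float) -> float:
    """Lower bound on round-trip time over a fiber path of the given length."""
    # Out and back (factor 2), converted from seconds to milliseconds.
    return 2 * path_km * 1000 / FIBER_SPEED_KM_PER_S


# Illustrative path length, chosen only to show the arithmetic.
print(rtt_floor_ms(4_000))  # 40.0
```

Observed RTTs sit above this floor because real fiber routes are longer than the straight-line distance, and because routers, switches, and queuing add delay along the way.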
Why Single Region Fails
Without GLB, every user is routed to a single data center. Users in Asia accessing a US-based service pay 200-300ms of latency on every round trip before any processing begins. Multiply this by the number of round trips in a page load (DNS lookup, TCP handshake, TLS negotiation, HTTP requests) and the user experience degrades significantly. The single region is also a single point of failure: one regional outage takes the entire service down globally.
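The multiplication can be made concrete with a rough model. The round-trip counts below are simplifying assumptions: one round trip each for DNS, the TCP handshake, a TLS 1.3 handshake, and the first HTTP request, with every step paying the full cross-continental RTT. In practice DNS is often answered by a nearby resolver, and TLS 1.2 needs two round trips rather than one. The names are hypothetical.

```python
# Assumed round trips before the first response byte arrives.
ROUND_TRIPS = {
    "dns_lookup": 1,
    "tcp_handshake": 1,
    "tls_handshake": 1,  # TLS 1.3; TLS 1.2 would be 2
    "http_request": 1,
}


def network_wait_ms(rtt_ms: float, round_trips: dict) -> float:
    """Network wait before any response data arrives, ignoring server time."""
    return rtt_ms * sum(round_trips.values())


print(network_wait_ms(250, ROUND_TRIPS))  # 1000: distant single region
print(network_wait_ms(30, ROUND_TRIPS))   # 120: nearby region
```

Under these assumptions a user 250ms away waits a full second before the first byte of the first response, and every further sequential request adds another RTT.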
GLB Value Proposition
GLB provides three core benefits: lower latency, by routing users to nearby regions (saving 100-200ms for cross-continental users); higher availability, by failing over to healthy regions when one fails; and better capacity efficiency, by distributing load across regions rather than over-provisioning a single location. For a global service with users on multiple continents, GLB is not optional; it is a fundamental part of the architecture.