What is Global Load Balancing?

Definition
Global Load Balancing (GLB) distributes user traffic across multiple geographic regions or data centers to minimize latency, maximize availability, and efficiently use capacity. Unlike local load balancers that distribute requests within a single data center, GLB operates at planetary scale, making routing decisions based on where users are located and which regions are healthy.
The Three-Layer Architecture
GLB architecture typically involves three layers. At the top sits a global traffic router using either DNS-based GSLB (Global Server Load Balancing) or Anycast edge proxies. Below that, regional load balancers distribute traffic within each geographic region. At the bottom, application backends serve actual requests. The routing engine considers multiple signals: user geography, measured network latency, real-time regional health and capacity, data residency requirements, and operational costs. This layered approach separates global routing decisions from regional load distribution.
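To make the routing decision concrete, here is a minimal sketch of how a global router might combine a few of these signals. The `Region` fields, the `pick_region` function, and the 0.9 utilization threshold are hypothetical names and values chosen for illustration, not the API of any real GSLB product; operational cost is left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    latency_ms: float          # measured RTT from the user to this region
    healthy: bool              # result of the latest health check
    utilization: float         # fraction of capacity in use, 0.0 to 1.0
    residency_ok: bool = True  # whether this user's data may be served here


def pick_region(regions, max_utilization=0.9):
    """Return the lowest-latency region that is healthy, allowed, and not full."""
    candidates = [
        r for r in regions
        if r.healthy and r.residency_ok and r.utilization < max_utilization
    ]
    if not candidates:
        raise RuntimeError("no eligible region")
    return min(candidates, key=lambda r: r.latency_ms)


regions = [
    Region("us-east", latency_ms=20, healthy=False, utilization=0.4),
    Region("us-west", latency_ms=70, healthy=True, utilization=0.5),
    Region("eu-west", latency_ms=90, healthy=True, utilization=0.3),
]
print(pick_region(regions).name)  # us-west, because us-east is unhealthy
```

Health, capacity, and residency act as hard filters here, and latency only breaks ties among the regions that pass them. A production router would likely weight these signals rather than filter strictly, but the ordering of concerns is the same.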
The Physics Constraint
The hard constraint GLB must respect is physics. Light travels through fiber at roughly 200,000 km/s, which sets a floor on cross-continental latency. Typical round-trip times (RTT) are: US East to US West, 60-80ms; US East to Europe, 70-100ms; US to India, 200-300ms; US to Australia, 180-250ms. No amount of clever routing can overcome these limits for synchronous operations. GLB's goal is to route users to the nearest region that can serve them, minimizing this unavoidable latency.
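The floor follows directly from distance divided by the speed of light in fiber. The sketch below works through that arithmetic; the function name `min_rtt_ms` is hypothetical, and the New York to London distance of about 5,570 km is an approximate great-circle figure used only as an example.

```python
FIBER_KM_PER_MS = 200.0  # 200,000 km/s expressed as km per millisecond


def min_rtt_ms(distance_km):
    """Theoretical minimum round-trip time over fiber for a one-way distance."""
    return 2 * distance_km / FIBER_KM_PER_MS


# Approximate great-circle distance, New York to London (assumed value).
print(round(min_rtt_ms(5570), 1))  # 55.7
```

Observed RTTs such as the 70-100ms quoted for US East to Europe sit above this floor because real cable routes are longer than the great-circle path and routers and switches add their own delay.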
Why Single Region Fails
Without GLB, all users route to a single data center. Users in Asia accessing a US-based service experience 200-300ms latency on every request before any processing begins. Multiply this by the number of round trips in a page load (DNS, TCP handshake, TLS, requests) and user experience degrades significantly. Additionally, the single region becomes a single point of failure: one regional outage takes down the entire service globally.
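The multiplication can be sketched as follows. This is a deliberate simplification: it assumes every step costs one full round trip to the origin region (in practice DNS is often answered by a nearby resolver, and the TLS handshake takes one or two round trips depending on the protocol version). The names `ROUND_TRIPS` and `network_wait_ms` are hypothetical.

```python
# Assumed round trips per step before the first response byte arrives.
ROUND_TRIPS = {"dns": 1, "tcp_handshake": 1, "tls": 1, "http_request": 1}


def network_wait_ms(rtt_ms, round_trips=ROUND_TRIPS):
    """Time spent waiting on the network, ignoring server processing."""
    return rtt_ms * sum(round_trips.values())


print(network_wait_ms(250))  # 1000 ms for a user far from the only region
print(network_wait_ms(30))   # 120 ms for a user near a regional deployment
```

Under these assumptions, a 250ms RTT turns into a full second of waiting before any content arrives, while a nearby region keeps the same sequence close to a tenth of a second.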
GLB Value Proposition
GLB provides three core benefits: latency reduction by routing users to nearby regions (saving 100-200ms for cross-continental users), availability improvement by failing over to healthy regions when one fails, and capacity efficiency by distributing load across regions rather than over-provisioning a single location. For a global service with users on multiple continents, GLB is not optional; it is fundamental architecture.
Key Insight: GLB does not make requests faster than physics allows. It ensures users reach the nearest region that can serve them, minimizing the unavoidable latency imposed by the speed of light. The goal is optimal routing, not magic speed improvements.

💡 Key Takeaways

✓ GLB routes users across geographic regions based on location, latency, health, and capacity, and operates at planetary scale

✓ Three-layer architecture: global router (DNS/Anycast), regional load balancers, application backends

✓ Physics constraint: US-Europe 70-100ms, US-India 200-300ms RTT; GLB minimizes but cannot eliminate cross-continental latency

✓ Three benefits: latency reduction (100-200ms for cross-continental users), availability via regional failover, capacity distribution

📌 Interview Tips

1. Explain the physics constraint: light in fiber travels at about 200,000 km/s, which sets a floor on RTT; typical figures are 60-80ms US East-West and 200-300ms US-India

2. Contrast with single-region failure: all users hit one US data center, so Asian users experience 200-300ms per request and a regional outage takes the service down globally

3. Describe the three-layer architecture: the global router selects a region, the regional LB selects an instance, and the backend serves the request
