Multi-Region Architecture and Avoiding Global Coordination
Operating a URL shortener across multiple regions for low latency and high availability requires eliminating global coordination on the write path, which would otherwise serialize writes and add cross-region round-trip times (often 100 to 300 milliseconds). The dominant pattern is to allocate disjoint ID ranges or embed region identifiers directly in the token, allowing each region to generate unique tokens independently. For example, a counter-based system might assign Region US East the numeric range 0 to 999 billion, Region EU West 1 trillion to 1.999 trillion, and Region Asia Pacific 2 trillion to 2.999 trillion. Each region increments its counter locally and Base62-encodes the result, with zero coordination required across regions during normal operation.
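The disjoint-range scheme can be sketched as follows. The region names and range boundaries are the illustrative values from the example above, not a real deployment's configuration:

```python
# Per-region disjoint ID ranges with Base62 encoding (illustrative sketch).
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

REGION_RANGES = {
    "us-east": (0, 999_999_999_999),                      # 0 .. 999 billion
    "eu-west": (1_000_000_000_000, 1_999_999_999_999),    # 1T .. 1.999T
    "ap-south": (2_000_000_000_000, 2_999_999_999_999),   # 2T .. 2.999T
}

def base62(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, r = divmod(n, 62)
        digits.append(ALPHABET[r])
    return "".join(reversed(digits))

class RegionalAllocator:
    """Issues tokens from a region's disjoint range; no cross-region calls."""
    def __init__(self, region: str):
        self.counter, self.limit = REGION_RANGES[region]
    def next_token(self) -> str:
        if self.counter > self.limit:
            raise RuntimeError("range exhausted; provision a new block")
        token = base62(self.counter)
        self.counter += 1
        return token
```

Because the ranges never overlap, two regions can never mint the same token, and exhausting a range is a rare, coarse-grained coordination event (provisioning a new block) rather than a per-write one.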
Snowflake-style IDs achieve the same goal by composing tokens from timestamp bits, shard or region ID bits, and a per-shard counter. A typical layout uses 41 bits for milliseconds since a custom epoch, 10 bits for region and shard ID (supporting 1,024 shards), and 12 bits for a sequence number (4,096 IDs per millisecond per shard), with the resulting integer Base62-encoded into a short token. This allows each region to generate millions of tokens per second without any cross-region communication, but it requires disciplined clock synchronization via NTP or PTP, since clock skew can cause ID collisions or out-of-order timestamps.
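A minimal generator for this 41/10/12 layout might look like the sketch below. The custom epoch is an assumed value, and the behavior on backwards clock movement (refusing to issue IDs) is a simplification; production systems often wait out small skews instead:

```python
import threading
import time

EPOCH_MS = 1_600_000_000_000  # custom epoch (assumption for illustration)
SHARD_BITS, SEQ_BITS = 10, 12
MAX_SEQ = (1 << SEQ_BITS) - 1  # 4,095

class Snowflake:
    """41-bit timestamp | 10-bit region/shard ID | 12-bit sequence."""
    def __init__(self, shard_id: int):
        assert 0 <= shard_id < (1 << SHARD_BITS)
        self.shard_id = shard_id
        self.last_ms = -1
        self.seq = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000) - EPOCH_MS
            if now < self.last_ms:
                # Clock moved backwards: issuing IDs now risks collisions.
                raise RuntimeError("clock skew detected; refusing to issue IDs")
            if now == self.last_ms:
                self.seq = (self.seq + 1) & MAX_SEQ
                if self.seq == 0:
                    # 4,096 IDs exhausted this millisecond; spin to the next one.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000) - EPOCH_MS
            else:
                self.seq = 0
            self.last_ms = now
            return (now << (SHARD_BITS + SEQ_BITS)) \
                | (self.shard_id << SEQ_BITS) | self.seq
```

IDs from one shard are strictly increasing, and the shard bits guarantee uniqueness across regions without any communication; the Base62 encoding step from the counter scheme applies unchanged to these integers.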
The redirect path operates active-active across all regions: users are routed via DNS or Anycast to the nearest region, which serves redirects from its local cache and database replicas. Newly created mappings must be replicated asynchronously from the write region to all other regions, typically within seconds to low tens of seconds depending on replication lag. Read-your-writes consistency therefore becomes eventual: a user in Asia who creates a short URL might be routed to the US for the redirect before replication completes, resulting in a not-found error. Mitigation strategies include sticky routing (pinning users to the region where they wrote for a short time window), write forwarding (the local region forwards writes to a designated home region and waits for replication), or caching new writes locally so the creating region can serve them immediately. Systems should monitor replication lag closely and alert if it exceeds acceptable thresholds (e.g., 10 seconds at P99), as lag causes user-visible inconsistencies and can compound during network partitions or region failures.
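The local-cache mitigation can be sketched in a few lines. The grace period, store names, and in-memory dictionaries standing in for the replica database are all illustrative assumptions:

```python
import time

REPLICATION_GRACE_S = 30  # assumed upper bound on replication lag

class RegionNode:
    """One region's redirect node: local replica plus a recent-writes cache."""
    def __init__(self):
        self.db = {}      # local replica, updated by async replication
        self.recent = {}  # token -> (url, cached_at) for writes made here

    def create(self, token: str, url: str):
        # Cache the mapping locally so this region can serve it immediately.
        self.recent[token] = (url, time.monotonic())
        self.db[token] = url  # the write region applies its own write at once
        # ... asynchronous replication to other regions not shown ...

    def resolve(self, token: str):
        hit = self.recent.get(token)
        if hit and time.monotonic() - hit[1] < REPLICATION_GRACE_S:
            return hit[0]          # serve our own recent write despite lag
        return self.db.get(token)  # None -> 404 until replication completes
```

Note this only protects users whose redirect lands in the region where they wrote; combining it with sticky routing covers the common case of a user immediately testing their new link.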
💡 Key Takeaways
•Disjoint ID range allocation per region enables writes with zero cross-region coordination, avoiding the 100 to 300 millisecond round-trip latencies that would serialize the write path
•Snowflake IDs with a 41-bit timestamp, 10-bit region/shard ID, and 12-bit sequence support 4,096 IDs per millisecond per shard (millions per second per region) without coordination
•Active-active redirect tiers route users to the nearest region via Anycast; asynchronous replication introduces eventual consistency with typical lag of seconds to low tens of seconds
•Read-your-writes consistency is eventual in multi-region deployments: a user creating a URL in Asia might see not found if redirected through a US region before replication completes
•Monitor replication lag at P99 and alert if it exceeds thresholds (e.g., 10 seconds); high lag causes user-visible errors and can compound during network partitions or region failures
•Clock skew in time-based ID schemes can cause duplicate or out-of-order IDs; use NTP or PTP to keep clocks synchronized within milliseconds across regions
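The lag-alerting takeaway can be made concrete with a small check over recent lag samples. The percentile computation and the 10-second default are illustrative (real deployments would feed this from their replication metrics):

```python
def p99(samples):
    """Nearest-rank P99 over a list of lag samples (seconds)."""
    ordered = sorted(samples)
    idx = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return ordered[idx]

def should_alert(lag_samples_s, threshold_s=10.0):
    """Fire when the P99 of observed replication lag exceeds the threshold."""
    return bool(lag_samples_s) and p99(lag_samples_s) > threshold_s
```

Alerting on P99 rather than the mean matters here: a partition that delays replication for a small fraction of writes still produces user-visible 404s, even while average lag looks healthy.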
📌 Examples
Google's g.co uses regional ID ranges: US data centers generate tokens starting with certain Base62 prefixes while EU data centers use different prefixes, eliminating write coordination while global Anycast routes redirects to the nearest cache
A Snowflake-style system at Uber might encode 41 bits of timestamp (about 69 years of milliseconds), 10 bits for data center and shard (1,024 total), and 12 bits for sequence, yielding roughly 4 million IDs per second per shard
During a network partition between US and EU regions, replication lag might spike from 5 seconds to 120 seconds; users creating URLs in the EU and refreshing the page immediately (routed to the US) see 404 errors until replication catches up