
Handling Scale: WebSockets and Connection Management

Maintaining millions of persistent bidirectional connections for real-time location updates requires specialized infrastructure beyond standard HTTP request/response patterns. WebSocket connections enable full-duplex communication in which servers push location updates to clients without polling, reducing latency from 2 to 5 seconds (with HTTP polling) to under 500 milliseconds. However, each WebSocket connection consumes 4 to 8 kilobytes of kernel memory for TCP buffers, limiting a single server to roughly 50,000 to 100,000 concurrent connections before it hits memory or file descriptor limits.
Uber handles 5 million concurrent connections during peak hours by horizontally scaling WebSocket servers and routing connections by entity identifier. When a driver connects, the connection is assigned to a specific WebSocket server by consistent hashing of the driver ID, so all updates for that driver route to the same server. This enables server-side state management: each WebSocket server maintains an in-memory map of the connected entities in its shard, avoiding expensive database lookups for every message. With 100 WebSocket servers, each handles 50,000 connections (5,000,000 / 100) and uses about 400 megabytes of memory for connection state (50,000 × 8 kilobytes).
Connection resilience requires heartbeat mechanisms and reconnection with exponential backoff. Clients send ping frames every 30 seconds; if the server doesn't respond within 10 seconds, the client assumes the connection has died (because of network issues or server failure) and reconnects. During a reconnection storm (a server crashes and 50,000 clients try to reconnect simultaneously), exponential backoff with jitter prevents a thundering herd: each client waits a random 1 to 5 seconds before the first attempt, then a random 2 to 10 seconds, and so on, doubling the window up to a 60-second maximum.
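The backoff schedule described above (a random 1 to 5 seconds, then 2 to 10 seconds, capped at 60 seconds) can be sketched as follows. The helper names `backoffWindowSeconds` and `reconnectDelayMs` are hypothetical, and one detail is an assumption not stated in the text: once the cap is reached, the window is held at 12 to 60 seconds rather than collapsing to exactly 60, so that jitter is not lost on late attempts.

```javascript
// Hypothetical helper: returns the [min, max] wait window in seconds for a
// 0-based reconnect attempt. The window doubles each attempt: 1-5, 2-10, 4-20...
function backoffWindowSeconds(attempt, capSeconds = 60) {
  const factor = 2 ** attempt; // exponentiation; note that ^ would be XOR
  // Assumption: keep the 1:5 ratio at the cap (12-60s) so jitter survives.
  const min = Math.min(factor, capSeconds / 5);
  const max = Math.min(5 * factor, capSeconds);
  return [min, max];
}

// Picks a uniformly random delay inside the window, in milliseconds.
// `rng` is injectable for deterministic tests and defaults to Math.random.
function reconnectDelayMs(attempt, rng = Math.random, capSeconds = 60) {
  const [min, max] = backoffWindowSeconds(attempt, capSeconds);
  return Math.round((min + rng() * (max - min)) * 1000);
}

// Print the window for the first few attempts.
for (let attempt = 0; attempt < 6; attempt++) {
  const [min, max] = backoffWindowSeconds(attempt);
  console.log(`attempt ${attempt}: wait between ${min}s and ${max}s`);
}
```

A client would call `setTimeout(connect, reconnectDelayMs(attempt))` after each failure and reset `attempt` to 0 once a connection succeeds.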
The alternative, Server-Sent Events (SSE), provides unidirectional server-to-client streaming over HTTP. It is simpler than WebSockets but requires separate HTTP POSTs for client-to-server updates (location uploads). SSE works better with existing HTTP infrastructure (proxies, load balancers), and the browser's EventSource API reconnects automatically when the connection is lost. However, SSE carries only text (no binary frames), and browsers limit HTTP/1.1 to 6 connections per domain, making WebSockets, to which these limitations don't apply, preferable for native mobile apps.
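To make the text-only nature of SSE concrete, here is a minimal sketch of framing one location update as an SSE event. The `id`, `event`, and `data` fields and the blank-line terminator are part of the SSE wire format; the `formatSseEvent` helper, the event name `location`, and the payload fields are assumptions made for illustration.

```javascript
// Hypothetical helper: serializes one update as a Server-Sent Events frame.
// JSON.stringify escapes newlines, so the payload always fits on one data line.
function formatSseEvent(id, eventName, payload) {
  const data = JSON.stringify(payload);
  return `id: ${id}\nevent: ${eventName}\ndata: ${data}\n\n`;
}

// Example payload; the field names are illustrative only.
const frame = formatSseEvent(42, 'location', {
  driverId: 'driver-123',
  lat: 37.7749,
  lng: -122.4194,
});
console.log(frame);
```

A binary payload would have to be text-encoded (for example as base64) before it could go on a `data:` line, which is the overhead a binary WebSocket frame avoids.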
💡 Key Takeaways
Memory limits per server: Each WebSocket connection uses 4 to 8 kilobytes of TCP buffer memory, limiting a single server to 50,000 to 100,000 concurrent connections before it exhausts available RAM or file descriptors
Connection sharding at Uber scale: 5 million connections are distributed across 100 WebSocket servers using consistent hashing on driver ID; each server maintains an in-memory map of 50,000 entities, requiring only about 400 megabytes of RAM
Heartbeat and failure detection: Clients ping every 30 seconds with a 10-second timeout to detect dead connections, reducing bandwidth wasted on connections that have silently failed (for example, a mobile device switching between WiFi and cellular)
Reconnection storm mitigation: Exponential backoff with jitter prevents a thundering herd when a server crashes and 50,000 clients reconnect simultaneously, spreading reconnections over 60 to 120 seconds instead of an instant stampede
WebSocket versus SSE tradeoff: WebSockets provide a true bidirectional binary protocol with full control but require specialized infrastructure; SSE offers simpler unidirectional streaming over HTTP with automatic reconnection but needs separate POSTs for uploads
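The heartbeat rule from the takeaways (ping every 30 seconds, treat the connection as dead after 10 seconds without a reply) can be expressed as a small state tracker. This is a transport-agnostic sketch rather than a real client: `createHeartbeat`, `tick`, and `pong` are hypothetical names, and timestamps are passed in explicitly so the logic can be exercised without timers or a socket.

```javascript
// Hypothetical heartbeat tracker. The caller invokes tick(now) from a timer
// and pong() from its WebSocket library's pong handler.
function createHeartbeat(startMs, pingIntervalMs = 30000, pongTimeoutMs = 10000) {
  let nextPingAt = startMs + pingIntervalMs;
  let awaitingPongSince = null; // timestamp of the unanswered ping, if any

  return {
    // Returns the action the client should take at time nowMs.
    tick(nowMs) {
      if (awaitingPongSince !== null && nowMs - awaitingPongSince >= pongTimeoutMs) {
        return 'reconnect'; // no reply within the timeout: assume a dead link
      }
      if (awaitingPongSince === null && nowMs >= nextPingAt) {
        awaitingPongSince = nowMs;
        nextPingAt = nowMs + pingIntervalMs;
        return 'send_ping';
      }
      return 'idle';
    },
    // Records that the server answered the outstanding ping.
    pong() {
      awaitingPongSince = null;
    },
  };
}

// Simulated timeline: the first ping is answered, the second is not.
const demo = createHeartbeat(0);
console.log(demo.tick(30000)); // first ping is due
demo.pong();
console.log(demo.tick(60000)); // second ping is due
console.log(demo.tick(70000)); // 10 seconds passed with no pong
```

On `'reconnect'` the client would close the socket and schedule a new attempt using its backoff policy.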
📌 Examples
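WebSocket server connection routing: shard = consistent_hash(driver_id, servers); ws_endpoint = 'wss://location-ws-' + shard + '.uber.com'; This ensures all messages for a driver route to the same server, enabling stateful message processing. A plain hash(driver_id) % num_servers would also pin a driver to one server, but it remaps most drivers whenever num_servers changes, which is what consistent hashing avoids
Client-side exponential backoff: let attempt = 0; function reconnect() { const delay = Math.min(60, 2 ** attempt) * 1000 + Math.random() * 1000; setTimeout(connectWebSocket, delay); attempt++; } The random 0 to 1,000 milliseconds of jitter prevents synchronized reconnection spikes; reset attempt to 0 once a connection succeeds
In-memory connection map on WebSocket server: const connections = new Map(); wss.on('connection', (ws, req) => { const driverId = authenticate(req); connections.set(driverId, ws); ws.on('message', (location) => { updateRedis(driverId, location); broadcastToNearbyRiders(location); }); ws.on('close', () => connections.delete(driverId)); }); Enables O(1) lookups without database queries (authenticate is a hypothetical helper that resolves the driver ID from the handshake request)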
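The routing example relies on consistent hashing, whose key property is that adding a server moves only the drivers that the new server takes over. Below is a minimal sketch of a hash ring with virtual nodes that demonstrates this. Everything here is illustrative: the `HashRing` class, the `hash32` function (FNV-1a with a murmur3-style finalizer, chosen only because it needs no imports), the `location-ws-N` server names, and the choice of 100 virtual nodes per server are assumptions, not a description of any production system.

```javascript
// 32-bit string hash: FNV-1a followed by a murmur3-style avalanche step.
function hash32(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0; // force an unsigned 32-bit result
}

class HashRing {
  constructor(servers, vnodesPerServer = 100) {
    this.vnodesPerServer = vnodesPerServer;
    this.ring = []; // sorted list of { position, server }
    for (const server of servers) this.addServer(server);
  }

  // Places several virtual nodes for the server on the ring.
  addServer(server) {
    for (let i = 0; i < this.vnodesPerServer; i++) {
      this.ring.push({ position: hash32(`${server}#${i}`), server });
    }
    this.ring.sort((a, b) => a.position - b.position);
  }

  // Finds the first virtual node at or after the key's hash, wrapping around.
  lookup(key) {
    if (this.ring.length === 0) throw new Error('ring has no servers');
    const h = hash32(key);
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].position < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].server;
  }
}

// Count how many drivers change servers when an 11th server joins.
const serverNames = Array.from({ length: 10 }, (_, i) => `location-ws-${i}`);
const ring = new HashRing(serverNames);
const driverIds = Array.from({ length: 1000 }, (_, i) => `driver-${i}`);
const ownersBefore = driverIds.map((id) => ring.lookup(id));
ring.addServer('location-ws-10');
const ownersAfter = driverIds.map((id) => ring.lookup(id));
const movedCount = driverIds.filter((_, i) => ownersBefore[i] !== ownersAfter[i]).length;
console.log(`${movedCount} of ${driverIds.length} drivers moved after adding one server`);
```

With a plain modulo mapping, most drivers would be reassigned when the server count changes, forcing their connections and in-memory state to move; on the ring, every driver that moves lands on the newly added server and the rest stay put.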