Handling Scale: WebSockets and Connection Management
Connection Management at Scale
Each tracking device maintains a persistent connection, so 1 million devices means 1 million concurrent connections. A standard HTTP server handles roughly 10,000 connections per instance. At that rate you need 100+ servers just to hold the connections, before any processing.
WebSocket servers are optimized for many mostly idle connections. An event-driven runtime (Node.js, Go) can handle 100,000+ connections per server, which brings the same 1 million connections down to about 10 servers. Use connection pooling and multiplexing where possible. Monitor connection count and memory usage per server.
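The server counts above are simple division, and a short sketch makes it easy to redo the estimate with different numbers. The function name `servers_needed` and the `headroom` parameter are illustrative assumptions, not part of any library. The per-server capacities are the rough figures quoted above, not measured limits.

```python
import math


def servers_needed(devices: int, connections_per_server: int, headroom: float = 0.0) -> int:
    """Minimum number of servers required just to hold the connections.

    headroom is the fraction of each server's capacity kept free
    (an assumption added here for illustration; 0.0 means run at full capacity).
    """
    if connections_per_server <= 0:
        raise ValueError("connections_per_server must be positive")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0.0, 1.0)")
    usable = connections_per_server * (1.0 - headroom)
    return math.ceil(devices / usable)


# Figures from the text: 1 million devices.
print(servers_needed(1_000_000, 10_000))   # 100 (standard HTTP server)
print(servers_needed(1_000_000, 100_000))  # 10  (event-driven WebSocket server)
```

In practice you would likely pass a non-zero `headroom` so that a server failure or a reconnect burst does not push the remaining instances past their limit.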
WebSocket vs Alternatives
WebSocket: Persistent bidirectional connection. Good for high-frequency updates. Requires sticky load balancing or connection-aware routing. Servers are stateful, which complicates scaling.
Server-Sent Events (SSE): One-way stream from server to client. Simpler than WebSocket. Good when the server needs to push but the client rarely sends updates. Uses standard HTTP, so load balancing is easier.
Polling: Client sends a request at fixed intervals. Stateless, so scaling is simple, but overhead is high for frequent updates. Acceptable for low-frequency tracking such as package delivery.
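To make the polling overhead concrete, the request rate can be estimated from the fleet size and the polling interval. This is a back-of-the-envelope sketch: the function name and the example intervals (5 seconds and 5 minutes) are assumptions chosen for illustration, and it ignores per-request header and handshake cost.

```python
def polling_requests_per_second(devices: int, interval_seconds: float) -> float:
    """Average request rate when every device polls once per interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return devices / interval_seconds


# 1 million devices polling every 5 seconds (frequent updates).
print(polling_requests_per_second(1_000_000, 5))

# The same fleet polling every 5 minutes (package-delivery style tracking).
print(polling_requests_per_second(1_000_000, 300))
```

The first case works out to 200,000 requests per second, most of which return no new data. The second is a few thousand per second, which is why polling remains reasonable for low-frequency tracking.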
Handling Disconnects
Mobile connections drop frequently: tunnels, poor signal, battery saver modes. Treat disconnection as a normal state, not an error. Store the last known position with a timestamp and mark stale positions in query results. Clients should reconnect with exponential backoff, ideally with random jitter, so that many devices reconnecting at once do not create a thundering herd.
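A minimal sketch of the reconnect delay, assuming a "full jitter" variant of exponential backoff in which the client waits a random time between zero and an exponentially growing ceiling. The function name `backoff_delay` and the defaults (1 second base, 60 second cap) are illustrative assumptions, not values from the text.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Seconds to wait before reconnect attempt number `attempt` (0-based).

    The ceiling doubles with each attempt up to `cap`; the actual delay is
    drawn uniformly from [0, ceiling] so clients do not reconnect in lockstep.
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    source = rng if rng is not None else random
    exponent = min(attempt, 30)  # bound the exponent so 2 ** exponent stays small
    ceiling = min(cap, base * (2 ** exponent))
    return source.uniform(0.0, ceiling)


# Example: delays for the first five attempts (values vary from run to run).
for attempt in range(5):
    print(attempt, round(backoff_delay(attempt), 2))
```

The randomness is the part that addresses the thundering herd: without it, every device that lost its connection at the same moment would retry at the same moments too.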
Detect ghost connections, where the device has disconnected but the server has not noticed. Send a heartbeat every 30 seconds. Treat a connection with no heartbeat for 2 minutes (four missed heartbeats) as disconnected. Clean up stale connections to free resources.
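The heartbeat rule can be sketched as a small server-side tracker that records when each device was last heard from and periodically sweeps out the ones past the timeout. The class name `HeartbeatTracker`, its methods, and the injectable `clock` are hypothetical names for illustration. A real server would also close the underlying socket for each device the sweep returns.

```python
import time
from typing import Callable, Dict, List

HEARTBEAT_INTERVAL_S = 30  # devices are expected to send a heartbeat this often
TIMEOUT_S = 120            # no heartbeat for this long means disconnected


class HeartbeatTracker:
    """Tracks the last heartbeat per device and finds ghost connections."""

    def __init__(
        self,
        timeout_s: float = TIMEOUT_S,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._timeout_s = timeout_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_seen: Dict[str, float] = {}

    def record(self, device_id: str) -> None:
        """Call whenever a heartbeat (or any message) arrives from a device."""
        self._last_seen[device_id] = self._clock()

    def sweep(self) -> List[str]:
        """Remove and return the devices whose last heartbeat is too old."""
        now = self._clock()
        stale = [
            device_id
            for device_id, seen in self._last_seen.items()
            if now - seen >= self._timeout_s
        ]
        for device_id in stale:
            del self._last_seen[device_id]
        return stale
```

Running `sweep()` on a timer, for example once per heartbeat interval, keeps the cleanup cost bounded and independent of message traffic.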