Load Balancing • Sticky Sessions
Production Implementation Patterns: Cookie-Based vs. IP-Based Affinity
There are two dominant mechanisms for implementing sticky sessions in production: cookie-based affinity and IP-based affinity. Cookie-based affinity is the standard for Layer 7 (L7) HTTP/HTTPS load balancers. The load balancer injects a cookie on the first response that encodes the selected backend identifier, an issue timestamp, an expiry time, and a cryptographic signature to prevent tampering. Azure Application Gateway and AWS Application Load Balancer both follow this pattern. The cookie travels with every subsequent request, allowing the load balancer to route deterministically without maintaining state in memory.
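The sketch below shows one way such a signed affinity cookie could be generated and validated. It is a minimal illustration, not the actual AWS ALB or Azure Application Gateway cookie format; the function names, the JSON payload layout, and the static SECRET_KEY are assumptions (a real deployment would pull the signing key from a rotated secret store).

```python
import base64
import hashlib
import hmac
import json
import time

# Illustrative only; in production this comes from a rotated key store.
SECRET_KEY = b"affinity-signing-key"

def make_affinity_cookie(backend_id: str, ttl_seconds: int = 1800) -> str:
    """Encode backend id, issue time, and expiry, then sign to prevent tampering."""
    now = int(time.time())
    payload = json.dumps({"backend": backend_id, "iat": now, "exp": now + ttl_seconds})
    body = base64.urlsafe_b64encode(payload.encode()).decode()
    sig = hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def resolve_affinity_cookie(cookie: str) -> str | None:
    """Return the pinned backend id if the signature is valid and not expired."""
    try:
        body, sig = cookie.rsplit(".", 1)
    except ValueError:
        return None
    expected = hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered cookie: fall back to normal load balancing
    claims = json.loads(base64.urlsafe_b64decode(body))
    if time.time() > claims["exp"]:
        return None  # expired affinity: pick a fresh backend
    return claims["backend"]
```

Because the backend identity and expiry live inside the signed cookie, the load balancer never has to store a client-to-backend table in memory; validation alone is enough to route the request.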
IP-based affinity is typical of Layer 4 (L4) network load balancers that operate below the HTTP layer. The load balancer hashes the client's source IP address (and optionally port) to select a backend, then remembers this mapping for a configured duration. Google Cloud network load balancers and AWS Network Load Balancers support this mode. The advantage is protocol independence: it works for TCP, UDP, and any IP traffic. The severe limitation is that many clients share the same IP address behind carrier-grade Network Address Translation (NAT) or corporate proxies, causing thousands of users to pin to one backend and creating massive hotspots.
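The core of IP-based selection is just a stable hash of the source address, as in the sketch below. The backend addresses are made up, and real L4 load balancers typically use consistent hashing or connection-tracking tables rather than a bare modulo, but the hotspot problem is identical: every client behind one NAT egress IP hashes to the same backend.

```python
import hashlib

def pick_backend(source_ip: str, backends: list[str]) -> str:
    """Hash the client source IP to a stable backend index (L4-style affinity)."""
    digest = hashlib.sha256(source_ip.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

backends = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
# Every user behind the same carrier-grade NAT egress IP lands on one backend.
print(pick_backend("203.0.113.7", backends))
```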
Cookie-based affinity is generally superior for web applications because it survives IP address changes (mobile devices switching between WiFi and cellular), provides per-user granularity instead of per-IP, and allows the load balancer to remain stateless by encoding routing information in the cookie itself. The security requirements are strict: cookies must be signed with a rotated secret key, marked HttpOnly and Secure to prevent JavaScript access and transmission over unencrypted channels, and regenerated on authentication events to prevent session fixation attacks.
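A minimal sketch of those attributes is shown below. The AFFINITY cookie name, the SameSite setting, and the reissue helper are illustrative assumptions rather than any vendor's API; the cookie value itself would be the signed payload from the earlier sketch.

```python
from typing import Callable

def affinity_set_cookie_header(cookie_value: str, max_age: int = 1800) -> str:
    """Build a Set-Cookie header: signed value, HttpOnly, Secure, bounded lifetime."""
    return (
        f"AFFINITY={cookie_value}; Max-Age={max_age}; Path=/; "
        "HttpOnly; Secure; SameSite=Lax"
    )

def reissue_on_authentication(backend_id: str, make_cookie: Callable[[str], str]) -> str:
    """Issue a brand-new affinity cookie at login so a pre-auth cookie cannot be fixed."""
    fresh_value = make_cookie(backend_id)  # new payload, new expiry, new signature
    return affinity_set_cookie_header(fresh_value)
```

Regenerating the value at login matters because an attacker who planted a pre-authentication cookie would otherwise ride it into the authenticated session.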
The failure mode both mechanisms share is that affinity breaks immediately when a backend fails its health checks. When a server becomes unhealthy, the load balancer must reassign its clients to a new backend, losing all session state unless the application has checkpointed it to a shared store. In practice, this means maintaining a hybrid model: critical state writes go to a centralized cache or database within 2 milliseconds, while hot read caches remain local to avoid the latency penalty on every request.
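The sketch below illustrates that hybrid model under stated assumptions: a SessionState wrapper that checkpoints critical writes to a shared store (a plain dict stands in for a Redis-style client here) while serving hot reads from a per-instance local cache. The class and its write-budget check are illustrative, not a specific product's API.

```python
import time

class SessionState:
    """Hybrid session state: critical writes go to a shared store, hot reads stay local."""

    def __init__(self, shared_store, write_budget_ms: float = 2.0):
        self.shared_store = shared_store          # stand-in for a Redis/Memcached client
        self.local_cache: dict[str, object] = {}  # per-instance hot cache
        self.write_budget_ms = write_budget_ms

    def write_critical(self, key: str, value: object) -> None:
        """Checkpoint critical state to the shared store so a failover loses nothing."""
        start = time.monotonic()
        self.shared_store[key] = value
        self.local_cache[key] = value
        elapsed_ms = (time.monotonic() - start) * 1000
        if elapsed_ms > self.write_budget_ms:
            print(f"warning: shared-store write took {elapsed_ms:.1f} ms, over budget")

    def read(self, key: str) -> object | None:
        """Serve hot reads locally; fall back to the shared store after a reassignment."""
        if key in self.local_cache:
            return self.local_cache[key]
        value = self.shared_store.get(key)
        if value is not None:
            self.local_cache[key] = value
        return value

# Usage, with a plain dict standing in for the shared store:
state = SessionState(shared_store={})
state.write_critical("user:42:cart", {"items": 3})
print(state.read("user:42:cart"))
```

When affinity breaks, the replacement backend starts with an empty local cache but can rebuild it from the shared store on first read, so only the latency advantage is lost, not the data.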
💡 Key Takeaways
• Cookie-based affinity provides per-user granularity and survives IP changes, making it the default for web applications on L7 load balancers like AWS ALB and Azure Application Gateway
• IP-based affinity works at L4 for any TCP/UDP protocol but suffers under carrier-grade NAT, where thousands of users share one IP, creating load hotspots with imbalance ratios of 3 to 5 times the mean
• Cookies must be cryptographically signed with rotated keys, marked HttpOnly and Secure, and regenerated on authentication to prevent fixation and hijacking attacks
• Both mechanisms break affinity immediately on backend health failures, requiring applications to checkpoint critical state to a shared store within a 2 millisecond budget to prevent data loss
• Affinity time-to-live (TTL) bounds should be 10 to 30 minutes to limit skew during scale events and deployment drains, with shorter windows before planned maintenance
📌 Examples
AWS Application Load Balancer cookie format: AWSALB contains a base64-encoded target ID, timestamp, and HMAC signature validated on each request
Microsoft Azure Front Door implements cookie-based affinity with configurable TTL; cross-region failover intentionally breaks affinity, favoring availability over state preservation
IP-based affinity on mobile carrier networks can pin 50,000 users behind one NAT gateway IP to a single backend, exhausting its CPU while other instances sit at 20 percent utilization