Consistent Hashing for Cache Locality and Stickiness
The Problem Consistent Hashing Solves
Consistent hashing solves a critical problem in distributed caching and stateful routing: how do you map keys to servers so that adding or removing a server remaps only a minimal fraction of keys? With naive modulo hashing (hash(key) mod N), growing from 10 to 11 servers remaps roughly 10/11 ≈ 91% of keys, causing massive cache misses and state migration. Consistent hashing remaps only about 1/(N+1) of keys when adding one server to N existing servers.
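The modulo-hashing failure is easy to demonstrate empirically. This is a minimal sketch (the `bucket` helper and key names are illustrative, not from any particular system) that counts how many keys change buckets when the server count grows from 10 to 11:

```python
import hashlib

def bucket(key: str, n: int) -> int:
    # Stable hash (MD5) so the result is reproducible across runs;
    # Python's built-in hash() is salted per process.
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return digest % n

keys = [f"key-{i}" for i in range(10_000)]
moved = sum(1 for k in keys if bucket(k, 10) != bucket(k, 11))
print(f"{moved / len(keys):.1%} of keys remapped going from 10 to 11 servers")
```

With 10,000 keys the observed fraction lands near the theoretical 10/11, i.e. around 90%.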
How the Hash Ring Works
The algorithm maps both servers and keys onto a circular hash space (typically 0 to 2^32 − 1). Each server hashes to a position on the ring. To route a key, hash it to a position and walk clockwise until you hit the first server. When you add a new server, only the keys that fall between its position and its predecessor on the ring are remapped: they move from the successor server to the new one. When you remove a server, its keys move to the next server clockwise. This locality property is what makes consistent hashing valuable: changes affect only neighboring keys, not the entire keyspace.
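The clockwise walk above can be sketched with a sorted list and binary search. This is an illustrative implementation, not a production one: `HashRing` and `h` are our names, and MD5 is used only as a convenient stable hash.

```python
import bisect
import hashlib

def h(s: str) -> int:
    # Map a string onto the 2^32 ring.
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

class HashRing:
    def __init__(self, servers):
        # Ring = positions sorted ascending; "clockwise" = increasing position.
        self._ring = sorted((h(s), s) for s in servers)

    def route(self, key: str) -> str:
        # First server at or after the key's position,
        # wrapping around to the start of the ring if none.
        i = bisect.bisect_left(self._ring, (h(key), ""))
        if i == len(self._ring):
            i = 0
        return self._ring[i][1]
```

The locality property is checkable: after adding a server, every key that changed owner now routes to the new server, and no key moved between the old servers.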
Virtual Nodes for Even Distribution
With only physical server positions, load distribution can be uneven. If servers hash to positions that are not uniformly spaced, some servers own larger arc segments and receive disproportionate traffic. Virtual nodes fix this: each physical server maps to 100-200 positions on the ring by hashing concatenations like "server1_0", "server1_1" through "server1_199". This smooths load across servers and bounds imbalance to roughly ±5-10%. More virtual nodes improve distribution but increase memory for the ring lookup table.
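Virtual nodes are a small change to the ring construction: hash each server name with a suffix, many times. A minimal sketch, assuming the "server_i" concatenation scheme described above (`VNodeRing` is our illustrative name):

```python
import bisect
import hashlib

def h(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

class VNodeRing:
    def __init__(self, servers, vnodes=150):
        # Each physical server contributes `vnodes` ring positions,
        # e.g. "server1_0", "server1_1", ... "server1_149".
        self._ring = sorted(
            (h(f"{s}_{i}"), s) for s in servers for i in range(vnodes)
        )

    def route(self, key: str) -> str:
        i = bisect.bisect_left(self._ring, (h(key), ""))
        return self._ring[i % len(self._ring)][1]
```

Measuring the per-server share of a large keyspace shows the smoothing effect: with 150 virtual nodes per server, each of three servers ends up near the ideal 1/3 share.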
Hotspot Vulnerability
The key production trade-off is hotspot vulnerability. If a single key (a celebrity profile, viral video, or popular API endpoint) dominates traffic, it is pinned to one backend regardless of load. Unlike dynamic load-balancing algorithms that detect overload and shed traffic, consistent hashing deterministically routes that key to the same server. IP-based consistent hashing has a related failure mode: thousands of users behind carrier-grade NAT (CGNAT) collapse to 2-5 public IP addresses. Example: 100,000 RPS from a mobile carrier maps to 2 IPs, overloading 2 backends with 50,000 RPS each while the others sit idle.
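The CGNAT failure mode follows directly from determinism. The sketch below uses a plain modulo hash on the source IP for brevity; a consistent-hash ring behaves identically for this point, since either way each IP maps to exactly one backend (the IPs shown are from the documentation range, and `route_by_ip` is our illustrative name):

```python
import hashlib

def route_by_ip(client_ip: str, backends: list) -> str:
    # Deterministic: the same source IP always lands on the same backend.
    digest = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return backends[digest % len(backends)]

backends = [f"backend-{i}" for i in range(10)]
# An entire mobile carrier behind CGNAT may present only two public IPs:
cgnat_ips = ["203.0.113.7", "203.0.113.8"]
used = {route_by_ip(ip, backends) for ip in cgnat_ips}
```

However many requests those two IPs carry, at most two of the ten backends ever see them.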
Mitigation Strategies
Bounded load hashing: each key maps to the top k servers on the ring; route to the least loaded among them. This preserves locality while avoiding hotspots.

Layer 7 cookie affinity: instead of hashing the client IP, hash a user ID or session cookie, distributing CGNAT users across backends.

Rendezvous hashing: a ring-free alternative; compute a hash for each (key, server) pair and pick the server with the highest hash. Simpler, with no virtual nodes needed, but O(N) per lookup versus O(log N) for a ring with binary search.
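Of the three strategies, rendezvous hashing (highest random weight) is the most compact to sketch. A minimal illustration, again using MD5 only as a stable hash:

```python
import hashlib

def rendezvous(key: str, servers: list) -> str:
    # Score every (key, server) pair; the highest score wins.
    # O(N) per lookup, but no ring or virtual nodes to maintain.
    def score(server: str) -> int:
        return int(hashlib.md5(f"{key}:{server}".encode()).hexdigest(), 16)
    return max(servers, key=score)
```

It has the same minimal-disruption property as the ring: removing a server only moves the keys that were on it, because the highest-scoring survivor for every other key is unchanged.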