
Consistent Hashing Variants: Ring vs Jump vs Rendezvous vs Multiprobe vs Maglev

Ring-based consistent hashing is flexible, but it is not the only approach. Jump consistent hash offers near-perfect balance with zero memory overhead: given a key and a node count N, it computes a deterministic bucket index in O(ln N) arithmetic operations with no data structures. The catch: it only yields integer bucket indices (no arbitrary node IDs), and removing nodes requires renumbering or an indirection layer. Google uses jump hash internally where append-only growth is acceptable.

Rendezvous hashing (Highest Random Weight) computes a score for each node given a key, then picks the node with the highest score. The naive implementation is O(N) per lookup, testing every node, but this is acceptable when N is small (tens of nodes) or when you can prefilter candidates. The advantage is natural replication: pick the top K nodes for K replicas, automatically diverse without ring-adjacency concerns. Weighted capacity is also straightforward: scale each node's score by a weight function.

Multiprobe consistent hashing hashes the key multiple times and picks the closest node across all probes. With approximately 21 probes, you achieve a peak-to-mean load ratio of roughly 1.05 with only O(1) memory per node (one ring entry per physical node). This is a sweet spot for systems with hundreds to thousands of nodes, where ring metadata becomes costly but O(N) rendezvous is too slow.

Maglev, used by Google Cloud Load Balancing, precomputes a lookup table (built from per-backend permutations of table slots) for O(1) per-packet consistent hashing at millions of packets per second. The tradeoff: table regeneration on membership change is expensive and caps the backend count, but lookups are extremely fast with excellent cache locality.

Choose based on your constraints: ring hashing for flexibility and operational simplicity in storage systems; jump hash for internal shard selection where you control IDs and can tolerate append-only growth; rendezvous for small-to-medium N with simple replication; multiprobe for large N with tight memory budgets; Maglev for ultra-low-latency load balancers where membership changes are infrequent relative to packet rates. Minimal sketches of each variant follow.
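First, jump consistent hash. The sketch below is a direct Python transcription of the algorithm as published by Lamport and Veach (2014); the 64-bit multiplier is the constant from the paper. Note how it uses nothing but loop variables: the "zero memory overhead" claim is literal.

```python
def jump_hash(key: int, num_buckets: int) -> int:
    # Maps a 64-bit integer key to a bucket in [0, num_buckets)
    # using no data structures; expected O(ln N) loop iterations.
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step (multiplier from the paper).
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        # Probabilistic jump to the next candidate bucket.
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b

# Example: route a 64-bit shard key to one of 10 buckets.
# Growing to 11 buckets moves only ~1/11 of keys; removing a
# bucket is what forces renumbering or an indirection layer.
print(jump_hash(123456789, 10))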
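Next, a sketch of rendezvous hashing with top-K selection. The blake2b hash and all names here are illustrative choices, not mandated by the technique; the logarithmic score is one standard weighted-rendezvous form, which makes key allocation proportional to node weight (simpler multiplicative weightings also exist but balance less precisely).

```python
import hashlib
import math

def _u01(key: str, node: str) -> float:
    # Hash the (key, node) pair to a uniform float in (0, 1).
    h = hashlib.blake2b(f"{key}/{node}".encode(), digest_size=8).digest()
    return (int.from_bytes(h, "big") + 0.5) / 2**64

def rendezvous_pick(key: str, nodes: list[str], k: int = 1,
                    weights: dict[str, float] | None = None) -> list[str]:
    # Score every node -- O(N) per lookup -- and keep the top k.
    # The top-1 node owns the key; the top-k set is a replica set
    # that stays diverse without any ring-adjacency tricks.
    def score(node: str) -> float:
        w = (weights or {}).get(node, 1.0)
        return -w / math.log(_u01(key, node))  # weighted rendezvous score
    return sorted(nodes, key=score, reverse=True)[:k]

# Example: primary plus one replica from a five-node pool.
print(rendezvous_pick("user:42", ["n1", "n2", "n3", "n4", "n5"], k=2))
```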
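For multiprobe, a minimal sketch assuming 64-bit hashes: each node gets exactly one ring point, and a lookup hashes the key 21 times, assigning it to the node nearest any probe in the clockwise direction. MultiProbeRing and _h64 are hypothetical names for illustration.

```python
import hashlib
from bisect import bisect_left

def _h64(s: str) -> int:
    d = hashlib.blake2b(s.encode(), digest_size=8).digest()
    return int.from_bytes(d, "big")

class MultiProbeRing:
    def __init__(self, nodes: list[str], probes: int = 21):
        # One ring point per physical node -- no vnodes.
        self.probes = probes
        self.points = sorted((_h64(n), n) for n in nodes)
        self.ring = [p for p, _ in self.points]

    def lookup(self, key: str) -> str:
        best_dist, best_node = None, None
        for i in range(self.probes):
            hk = _h64(f"{i}:{key}")  # probe i of the same key
            idx = bisect_left(self.ring, hk) % len(self.ring)
            # Clockwise distance to the next node point, with wraparound.
            dist = (self.ring[idx] - hk) % (1 << 64)
            if best_dist is None or dist < best_dist:
                best_dist, best_node = dist, self.points[idx][1]
        return best_node
```

The 21 probes trade a constant factor of lookup work for the roughly 1.05 peak-to-mean balance that ring hashing only reaches with hundreds of vnodes per node.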
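Finally, a sketch of Maglev's table population following the algorithm described in the Maglev paper: each backend derives an offset and a skip from two hashes of its name, visits slots offset + j·skip (mod M) in that order, and backends take turns claiming their next unclaimed slot until the table of prime size M is full. The hash function is an illustrative stand-in.

```python
import hashlib

def _h64(s: str) -> int:
    d = hashlib.blake2b(s.encode(), digest_size=8).digest()
    return int.from_bytes(d, "big")

def maglev_table(backends: list[str], m: int = 65537) -> list[str]:
    # m must be prime so every (offset, skip) pair walks a full
    # permutation of the table slots.
    if not backends:
        raise ValueError("need at least one backend")
    n = len(backends)
    offset = [_h64(f"offset:{b}") % m for b in backends]
    skip = [_h64(f"skip:{b}") % (m - 1) + 1 for b in backends]
    nxt = [0] * n            # next permutation index per backend
    table: list = [None] * m
    filled = 0
    while True:
        for i in range(n):
            # Walk backend i's permutation to its next unclaimed slot.
            c = (offset[i] + nxt[i] * skip[i]) % m
            while table[c] is not None:
                nxt[i] += 1
                c = (offset[i] + nxt[i] * skip[i]) % m
            table[c] = backends[i]
            nxt[i] += 1
            filled += 1
            if filled == m:
                # Per-packet routing is now one hash plus one array
                # index: table[_h64(five_tuple) % m]
                return table
```

The round-robin fill is what bounds load imbalance, and the full rebuild on membership change is the rebuild cost the prose above refers to.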
💡 Key Takeaways
Jump consistent hash: Zero memory, near-perfect balance, but limited to integer bucket IDs and append-only growth; used by Google internally with an indirection layer that maps buckets to nodes
Rendezvous hashing: O(N) per lookup is acceptable for small N (tens of nodes); naturally supports top-K replication without ring-adjacency concerns, plus weighted capacity via score scaling
Multiprobe with 21 probes: Achieves a peak-to-mean load ratio of approximately 1.05 with only one ring entry per node (versus thousands with vnodes), ideal for hundreds to thousands of nodes
Maglev table-based: Google Cloud Load Balancing uses a precomputed lookup table for O(1) per-packet routing at millions of packets per second, trading rebuild cost for speed
Ring hashing memory: 1000 nodes with 1000 vnodes each require 1 million entries at roughly 16 bytes per entry (node ID plus position), totaling around 16 MB per client
📌 Examples
Google jump hash: Internal shard selection for services where the bucket count grows monotonically; a layer of indirection maps jump hash buckets to physical servers
Azure Traffic Manager: Uses rendezvous-style scoring with geographic weights; for 20 global endpoints, O(N) scoring adds negligible latency versus DNS resolution time
Google Maglev: L4 load balancer with a 65,537-entry lookup table per virtual IP (VIP); handles millions of packets per second with sub-microsecond per-packet hashing
Netflix EVCache: Client-side ring hashing (Ketama) to select cache shards; a 200-node memcached pool with 100 vnodes per node means a binary search through 20,000 ring positions