
Distribution, Sharding, and Caching Strategies

At scale, proximity search must be distributed across multiple nodes and backed by caching to meet latency Service Level Objectives (SLOs). The core challenge is minimizing cross-shard fanout while keeping load balanced. Sharding by geospatial key (geohash prefix or H3 cell range) achieves excellent locality: queries route only to shards whose key ranges overlap the search area. For example, sharding Airbnb listings by geohash prefixes of length 4 means a search in San Francisco (geohashes starting with 9q8y) hits only the 9q8y shard group, avoiding fanout to New York or London shards.

Boundary queries are the hard case. A user standing on the border between two geohash regions needs candidates from both shards. The solution is to include immediate neighbor shards in the query or to use overlapping hierarchical keys. Uber dispatch queries the rider's H3 cell plus its k-ring neighbors; since H3 provides O(1) neighbor enumeration, the system typically routes to 2 to 4 shard groups even for boundary cases. The coordinator merges results using a bounded min heap, selecting the top k by distance in O(k log s) time, where s is the number of shards queried.

Caching is critical but tricky due to staleness. For static entities like restaurants or stores, caching cell-level candidate IDs with a Time To Live (TTL) of minutes or hours is safe and yields 80% to 95% hit rates for repeated map-pan queries. For dynamic entities like drivers, a 1 to 10 second TTL balances freshness and cache utility. A common optimization is to cache precomputed top k results for popular radii (250 m, 500 m, 1 km) and snap query centers to cell centroids to normalize cache keys. Airbnb applies this to high Queries Per Second (QPS) metropolitan viewport queries, keeping backend load manageable during peak traffic.

Hotspot mitigation requires cache partitioning and admission control. Events like concerts or sales drive localized QPS spikes in a handful of cells. Without partitioning, hot-cell entries evict useful entries from other regions, tanking global hit rates. Systems partition caches by metro area or top-level geohash prefix and use token bucket admission control to limit cache churn. When ingestion lags due to backpressure, systems degrade gracefully by slightly widening the slop or search radius while enforcing hard candidate caps (for example, 10,000 points) to protect tail latency.
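As a concrete illustration, here is a minimal sketch of prefix-based shard routing with neighbor inclusion. Everything in it is an assumption for the example: the `PREFIX_TO_SHARD` map stands in for cluster routing metadata, and the caller is assumed to already have the query cell and its neighbor cells from a geohash or H3 library.

```python
from typing import Dict, Iterable, Set

# Illustrative shard map: geohash prefixes of length 4 -> shard group.
# In a real deployment this would come from cluster metadata, not a literal dict.
PREFIX_LEN = 4
PREFIX_TO_SHARD: Dict[str, str] = {
    "9q8y": "shards-us-west",   # San Francisco
    "9q8z": "shards-us-west",
    "dr5r": "shards-us-east",   # Manhattan
    "gcpv": "shards-eu-west",   # London
}

def shards_for_query(query_cell: str, neighbor_cells: Iterable[str]) -> Set[str]:
    """Route a proximity query to the shard groups whose key prefix overlaps
    the query cell or any of its immediate neighbors.

    Including neighbors handles the boundary case where the user stands near
    the edge of a cell and candidates live in the adjacent cell.
    """
    shard_groups: Set[str] = set()
    for cell in [query_cell, *neighbor_cells]:
        group = PREFIX_TO_SHARD.get(cell[:PREFIX_LEN])
        if group is not None:
            shard_groups.add(group)
    return shard_groups

# Example (cell ids are illustrative): a rider whose neighbor cells straddle
# two prefixes still resolves to a single shard group, so fanout stays bounded.
print(shards_for_query("9q8yyk8y", ["9q8yyk8v", "9q8zn0c0"]))
# {'shards-us-west'}
```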
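The cache-key normalization can be sketched without a geospatial library by snapping the query center to a fixed-grid centroid and rounding the radius up to the nearest canonical bucket. In production the snap would use the H3 or geohash cell centroid rather than the toy grid below, and the TTL constants and names here are illustrative assumptions, not values from any particular system.

```python
from typing import Tuple

# Canonical radii from the text: precomputed top k is cached only for these.
CANONICAL_RADII_M = (250, 500, 1000)

# Illustrative TTLs: long for static entities, seconds for dynamic ones like drivers.
TTL_SECONDS = {"static": 3600, "dynamic": 5}

# Toy stand-in for a cell centroid: snap to a ~0.01 degree grid (~1 km at
# mid latitudes). A real system would snap to the centroid of the same
# geohash/H3 cell used for sharding.
GRID_DEG = 0.01

def snap_to_centroid(lat: float, lon: float) -> Tuple[float, float]:
    return (round(lat / GRID_DEG) * GRID_DEG, round(lon / GRID_DEG) * GRID_DEG)

def canonical_radius(radius_m: float) -> int:
    # Round up so the cached result set is a superset of what the query needs.
    for r in CANONICAL_RADII_M:
        if radius_m <= r:
            return r
    return CANONICAL_RADII_M[-1]

def cache_key(lat: float, lon: float, radius_m: float, entity_class: str) -> str:
    slat, slon = snap_to_centroid(lat, lon)
    return f"topk:{entity_class}:{slat:.2f}:{slon:.2f}:{canonical_radius(radius_m)}"

# Two nearby map pans normalize to the same key, so the second request is a hit.
print(cache_key(37.7712, -122.4194, 400, "static"))
print(cache_key(37.7738, -122.4189, 450, "static"))
# both: topk:static:37.77:-122.42:500
```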
💡 Key Takeaways
Shard by geospatial key (geohash prefix or H3 cell range) to route queries only to shards overlapping the search area, avoiding broadcast fanout to all nodes
Boundary queries require neighbor shard inclusion; Uber queries the rider's H3 cell plus its k-ring neighbors, typically hitting 2 to 4 shard groups thanks to O(1) neighbor enumeration
Cache static entities with TTLs of minutes to hours, achieving 80% to 95% hit rates; cache dynamic entities like drivers with 1 to 10 second TTLs to balance freshness and utility
Normalize cache keys by snapping query centers to cell centroids and caching precomputed top k results for popular radii (250 m, 500 m, 1 km) to maximize hit rates during map panning
Partition caches by metro area to prevent hot cells from concerts or sales evicting useful entries globally; use token bucket admission control to limit cache churn during QPS spikes (a minimal sketch follows this list)
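A minimal token bucket admission check, assuming one bucket per cache partition (for example, per metro or top-level geohash prefix); the rate and burst parameters are illustrative.

```python
import time

class TokenBucket:
    """Limits how many new entries a cache partition may admit per second,
    so a hot cell (concert, flash sale) cannot churn the whole cache."""

    def __init__(self, rate_per_s: float, burst: float):
        self.rate = rate_per_s        # tokens refilled per second
        self.capacity = burst         # maximum burst of admissions
        self.tokens = burst
        self.last = time.monotonic()

    def allow_admission(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False                  # skip caching this result; serve from backend

# One bucket per partition, keyed here by metro (illustrative).
admission = {"sf": TokenBucket(rate_per_s=100, burst=200)}
if admission["sf"].allow_admission():
    pass  # insert the freshly computed top k result into the SF partition
```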
📌 Examples
Airbnb shards 7 million listings by geohash prefix, ensuring that San Francisco searches (9q8 prefix) only query California shards, not European or Asian clusters, reducing cross-region network hops
Uber dispatch coordinators merge driver candidates from 2 to 4 H3 shard groups using a min heap sorted by distance; typical merge time is under 5 milliseconds even for boundary riders (see the merge sketch below)
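A sketch of that coordinator-side merge, assuming each shard returns its candidates already sorted by distance ascending; Python's heapq.merge keeps one entry per shard on a min heap, so selecting the top k costs O(k log s). The data and identifiers are made up for illustration.

```python
import heapq
from itertools import islice
from typing import Iterable, List, Tuple

Candidate = Tuple[float, str]  # (distance_meters, entity_id)

def merge_top_k(shard_results: List[Iterable[Candidate]], k: int) -> List[Candidate]:
    """Merge per-shard candidate lists (each sorted by distance ascending)
    and return the global top k. heapq.merge holds one entry per shard on a
    min heap, so each of the k pops costs O(log s)."""
    return list(islice(heapq.merge(*shard_results), k))

# Example: three shard groups answering a boundary query.
shard_a = [(120.0, "driver-17"), (480.0, "driver-03")]
shard_b = [(95.0, "driver-42"), (610.0, "driver-88")]
shard_c = [(300.0, "driver-29")]
print(merge_top_k([shard_a, shard_b, shard_c], k=3))
# [(95.0, 'driver-42'), (120.0, 'driver-17'), (300.0, 'driver-29')]
```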