Failure Modes and Edge Cases in Production
Missing Results at Boundaries
A point 100 meters away can sit in an adjacent cell, and a query that scans only the cell containing the search origin will miss it. This is the most common proximity search bug: users report that nearby restaurants they know exist are missing from the results.
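Solution: always query neighbor cells. For any point query, fetch the target cell plus all adjacent cells (eight neighbors on a square grid). This guarantees no misses as long as the search radius does not exceed one cell width; for a larger radius, use coarser cells or additional rings of neighbors. Still post-filter by actual distance, because the block of cells is not a circle and will contain candidates that lie beyond the radius.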
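The neighbor-cell query can be sketched as follows. This is a minimal illustration, not a production index: it assumes a toy uniform grid over planar coordinates in meters rather than a real scheme such as geohash, S2, or H3, and the names GridIndex, cell_of, and CELL_SIZE_M are hypothetical.

```python
import math
from collections import defaultdict

CELL_SIZE_M = 500.0  # assumed cell side length in meters (hypothetical value)


def cell_of(x, y):
    """Map a planar (x, y) position in meters to its integer grid cell."""
    return (math.floor(x / CELL_SIZE_M), math.floor(y / CELL_SIZE_M))


class GridIndex:
    """Toy uniform grid index used only to illustrate the boundary bug."""

    def __init__(self):
        self.cells = defaultdict(list)

    def insert(self, name, x, y):
        self.cells[cell_of(x, y)].append((name, x, y))

    def query_center_only(self, x, y, radius_m):
        """Buggy variant: scans only the cell containing the query point."""
        return self._filter(self.cells.get(cell_of(x, y), []), x, y, radius_m)

    def query_with_neighbors(self, x, y, radius_m):
        """Scan the 3x3 block around the query cell, then post-filter by distance."""
        if radius_m > CELL_SIZE_M:
            # One ring of neighbors only covers radii up to one cell width.
            raise ValueError("radius exceeds one cell width")
        cx, cy = cell_of(x, y)
        candidates = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                candidates.extend(self.cells.get((cx + dx, cy + dy), []))
        return self._filter(candidates, x, y, radius_m)

    @staticmethod
    def _filter(candidates, x, y, radius_m):
        # The 3x3 block is a square, so exact distance still has to be checked.
        return sorted(
            name for name, px, py in candidates
            if math.hypot(px - x, py - y) <= radius_m
        )


index = GridIndex()
index.insert("cafe", 550.0, 250.0)   # 100 m from the query point, in the next cell
index.insert("diner", 950.0, 950.0)  # inside the 3x3 block but about 860 m away

print(index.query_center_only(450.0, 250.0, 200.0))     # misses the cafe
print(index.query_with_neighbors(450.0, 250.0, 200.0))  # finds the cafe, drops the diner
```

The center-only query returns nothing because the cafe lives in the neighboring cell, while the neighbor query finds it and the distance filter removes the diner that the square block pulled in.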
Stale Index Data
Location data changes: restaurants open and close, drivers move. If index updates lag behind reality, queries return stale results. A user sees a restaurant that closed yesterday or a driver who is now 5 km away.
Mitigation: short TTL on cached results, frequent index refresh, and real-time validation of critical data. For ride-sharing, validate the driver's position before confirming the match, not just during search. Accept that search results are approximate, but confirmations must be exact.
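One way to separate approximate search from exact confirmation is sketched below. It is an illustration under stated assumptions: IndexedDriver, search, confirm, the 30-second max_age_s default, and the fetch_live_position callback are all hypothetical names and values, and the callback stands in for whatever real-time position source a system actually has.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


@dataclass
class IndexedDriver:
    driver_id: str
    lat: float
    lon: float
    indexed_at: float  # unix seconds when this index entry was written


def search(index, rider_lat, rider_lon, radius_m, now, max_age_s=30.0):
    """Approximate step: uses indexed positions and skips entries older than max_age_s."""
    return [
        d for d in index
        if now - d.indexed_at <= max_age_s
        and haversine_m(rider_lat, rider_lon, d.lat, d.lon) <= radius_m
    ]


def confirm(driver, rider_lat, rider_lon, radius_m, fetch_live_position):
    """Exact step: re-reads the driver's current position before committing the match."""
    live = fetch_live_position(driver.driver_id)  # hypothetical real-time lookup
    if live is None:
        return False  # driver went offline since the index was written
    lat, lon = live
    return haversine_m(rider_lat, rider_lon, lat, lon) <= radius_m
```

A caller would run search to build a shortlist and then call confirm on the chosen driver, falling through to the next candidate when confirm returns False.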
Query Timeout Under Load
Phase 1 returns 100,000 candidates instead of the expected 1,000. Phase 2 distance calculations then take 10 seconds instead of 10 milliseconds. The query times out and the user sees an error.
Causes: an unexpectedly dense area, a precision mismatch (for example, cells much larger than the search radius), or a missing index. Mitigations: cap the candidate count and terminate early, fall back to coarser results when the cap is hit, and monitor candidate counts so anomalies are detected before users complain.
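A candidate cap with early termination might look like the sketch below. It assumes Phase 1 hands over candidates as an iterable of (name, x, y) tuples in planar meters; bounded_search, SearchResult, and the default limits are hypothetical. When the cap is hit, the result is flagged as truncated so the caller can treat it as a coarser answer and record the count for monitoring.

```python
import itertools
import math
from dataclasses import dataclass


@dataclass
class SearchResult:
    matches: list            # names within the radius, nearest first
    candidates_scanned: int  # how many Phase 1 candidates Phase 2 examined
    truncated: bool          # True if the cap cut the scan short


def bounded_search(candidates, origin, radius_m, max_candidates=1000, limit=20):
    """Run Phase 2 distance checks on at most max_candidates Phase 1 candidates."""
    ox, oy = origin
    it = iter(candidates)
    matches = []
    scanned = 0
    for name, x, y in itertools.islice(it, max_candidates):
        scanned += 1
        d = math.hypot(x - ox, y - oy)
        if d <= radius_m:
            matches.append((d, name))
    # If anything is left after the cap, the result is partial (coarser).
    truncated = next(it, None) is not None
    matches.sort()
    return SearchResult([name for _, name in matches[:limit]], scanned, truncated)


# A dense area: Phase 1 yields 100,000 candidates, but only 1,000 are scanned.
dense = ((f"p{i}", float(i), 0.0) for i in range(100_000))
result = bounded_search(dense, (0.0, 0.0), 50.0)
print(result.candidates_scanned, result.truncated, result.matches[:3])
```

Exporting candidates_scanned and truncated as metrics is one way to spot dense areas or a precision mismatch before queries start timing out.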