What is Proximity Search and Why Two Phase Architecture?
Why Naive Approach Fails
Computing distance to every item and filtering is O(n). For 100 million locations, even at 1 microsecond per distance calculation, that is 100 seconds per query. Unacceptable for real-time applications. You need to eliminate most candidates without computing distance.
Indexes solve this by organizing data spatially. Instead of checking every point, check only points in relevant regions. The index tells you which regions might contain nearby points. Check only those.
Two Phase Architecture
Phase 1 - Coarse filtering: Use spatial index to find candidate regions. Retrieve all items in those regions. This is fast because indexes are optimized for region lookup. But it over-fetches: regions are rectangles or cells, not circles.
Phase 2 - Fine filtering: For each candidate, compute exact distance. Filter items outside the actual radius. Sort by distance if returning nearest neighbors. Apply business filters (ratings, availability). This phase is O(k) where k is candidate count, not O(n).
The architecture trades accuracy for speed in phase 1, then recovers accuracy in phase 2. Phase 1 reduces millions to thousands. Phase 2 reduces thousands to tens. Together they achieve sub-100ms latency.