Prefix Binning Pattern and Precision Selection

Prefix Binning Pattern
Store each point with its geohash and index the geohash column. To find points in an area, compute the geohash prefix that covers it, then query for all rows whose geohash starts with that prefix. A B-tree index turns the prefix match into a range scan, so the cost is a logarithmic seek plus the number of matching rows, not a scan of the whole table.
This is binning: grouping points into cells by their geohash prefix. A 6 character prefix bins points into cells about 1.2 km wide by 0.6 km tall. All points in a cell share the same prefix. Querying the prefix retrieves the entire bin in a single index range scan.
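The pattern above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under stated assumptions, not a production design: the table name `points`, the helpers `add_point` and `points_in_bin`, and the sample coordinates are all hypothetical. The prefix match is written as a half-open range so that a plain B-tree index can serve it.

```python
import sqlite3

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet

def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars, value, bits = [], 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid
        else:
            value <<= 1
            rng[1] = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # five bits per base-32 character
            chars.append(BASE32[value])
            value, bits = 0, 0
    return "".join(chars)

# Hypothetical schema: one row per point, geohash computed at write time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (name TEXT, lat REAL, lon REAL, geohash TEXT)")
conn.execute("CREATE INDEX idx_points_geohash ON points (geohash)")

def add_point(name, lat, lon):
    conn.execute("INSERT INTO points VALUES (?, ?, ?, ?)",
                 (name, lat, lon, geohash_encode(lat, lon)))

def points_in_bin(prefix):
    # "~" sorts after every geohash character, so this half-open range
    # matches exactly the strings that start with `prefix`.
    rows = conn.execute(
        "SELECT name FROM points WHERE geohash >= ? AND geohash < ? ORDER BY name",
        (prefix, prefix + "~"))
    return [r[0] for r in rows]

add_point("a", 57.64911, 10.40744)
add_point("b", 57.64920, 10.40750)  # a few metres from "a"
add_point("c", 0.0, 0.0)            # far away

bin_prefix = geohash_encode(57.64911, 10.40744, precision=6)
print(bin_prefix, points_in_bin(bin_prefix))
```

The range form is used instead of `LIKE 'prefix%'` because whether a `LIKE` pattern can use an index depends on the database and its collation settings, while a range comparison on an indexed text column is widely supported.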
Choosing Precision Level
Too coarse: cells contain too many points. A 4 character geohash (cells about 39 km by 20 km) can hold millions of points in a dense city. You retrieve them all and filter most away, wasting I/O and CPU.
Too fine: cells contain too few points. With 9 character geohashes (cells about 5 m across), covering a 1 km radius takes well over 100,000 cells. Each lookup is fast but the sum is slow, and network round trips add up.
Right precision: cells are about the size of your typical query radius or slightly larger. For 1 km queries, use 6 characters (cells about 1.2 km wide by 0.6 km tall). Most queries then touch only a handful of cells: the one containing the query point and its immediate neighbors. That is fine enough to keep filtering cheap and coarse enough to avoid excessive queries.
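One way to turn this rule of thumb into code is a small lookup over approximate cell widths. The sketch below is only a heuristic chosen for illustration: `pick_precision` is a hypothetical helper, the widths are the commonly cited approximate values at the equator (cells get narrower in kilometres toward the poles), and "closest width on a log scale" is an assumed stand-in for "about the size of the query radius".

```python
import math

# Approximate cell width in km at the equator, keyed by geohash length.
CELL_WIDTH_KM = {1: 5000, 2: 1250, 3: 156, 4: 39.1, 5: 4.89,
                 6: 1.22, 7: 0.153, 8: 0.0382, 9: 0.00477}

def pick_precision(radius_km):
    """Return the geohash length whose cell width is closest to the
    query radius, comparing on a log scale (an assumed heuristic)."""
    return min(CELL_WIDTH_KM,
               key=lambda p: abs(math.log(CELL_WIDTH_KM[p] / radius_km)))

for radius in (0.15, 1.0, 5.0):
    print(f"{radius} km radius -> {pick_precision(radius)} characters")
```

In practice the choice would also depend on point density and on how many neighboring cells a query is willing to fetch, so treat the result as a starting point to validate against real query logs.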
Multi-Precision Indexing
Store multiple precision levels per point. Index both 4 character and 7 character geohashes. Use coarse precision for wide area queries (find all restaurants in a city). Use fine precision for local queries (find restaurants within 500 meters).
The storage cost is minimal: a few extra bytes per row. The query benefit is significant: each query uses the optimal precision level. Avoid recalculating geohashes at query time by precomputing them at write time.
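A minimal sketch of storing two precision levels at write time, again using sqlite3. The schema, the column names `gh4` and `gh7`, the helper names, and the sample rows are hypothetical, and the geohash strings are illustrative values rather than real places. The sketch relies on the fact that a coarser geohash is simply a prefix of a finer one, so both columns can be derived from a single encoding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (name TEXT, geohash TEXT, gh4 TEXT, gh7 TEXT)")
conn.execute("CREATE INDEX idx_places_gh4 ON places (gh4)")
conn.execute("CREATE INDEX idx_places_gh7 ON places (gh7)")

def add_place(name, geohash):
    # Both precision levels are truncations of the full hash, computed
    # once at write time so queries never re-encode stored points.
    conn.execute("INSERT INTO places VALUES (?, ?, ?, ?)",
                 (name, geohash, geohash[:4], geohash[:7]))

def places_in_cell(column, cell):
    # Only the two known precision columns may be interpolated into SQL.
    if column not in ("gh4", "gh7"):
        raise ValueError(f"unknown precision column: {column}")
    rows = conn.execute(
        f"SELECT name FROM places WHERE {column} = ? ORDER BY name", (cell,))
    return [r[0] for r in rows]

# Illustrative geohash strings (hypothetical data).
add_place("cafe", "u4pruydqqvj")
add_place("bakery", "u4pruyd11")
add_place("diner", "u4pr0000000")
add_place("elsewhere", "s0000000000")

print(places_in_cell("gh4", "u4pr"))     # wide-area lookup
print(places_in_cell("gh7", "u4pruyd"))  # local lookup
```

Equality on a dedicated column also makes it easy to group or count by cell. Whether the extra columns are worth it over a prefix range scan on one long geohash depends on the workload.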
✅ Best Practice: Match geohash precision to your dominant query radius. If 80% of queries are for a 1 km radius, use 6 character geohashes. Store additional precision levels only if you have genuinely different query patterns.

💡 Key Takeaways

✓ Prefix binning groups points by geohash prefix; B-tree index enables fast lookup

✓ Too coarse precision retrieves too many points; too fine requires too many queries

✓ Match precision to query radius: 6 characters for 1 km, 7 for 150 m

✓ Multi-precision indexing supports different query sizes without recalculation

✓ Precompute geohashes at write time; store multiple precision levels if needed

📌 Interview Tips

1. Explain the precision trade-off: a 4 character geohash for a 1 km query returns millions of points to filter; a 9 character geohash needs well over 100,000 cell lookups

2. Recommend matching precision to the dominant use case: ride sharing with a 5 km pickup radius uses 5 character geohashes (cells about 4.9 km across)

3. For variable radius queries, store both coarse and fine geohashes to avoid query-time calculation overhead
