What is Geohashing and How Does it Work?

Definition
Geohashing encodes a latitude/longitude pair into a single short string by recursively subdividing the world into smaller rectangles. Each additional character narrows the area. Nearby locations usually share a common prefix, so many spatial queries reduce to simple string operations.
How Encoding Works
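Start with the entire world: latitude -90 to 90, longitude -180 to 180. Split the longitude range in half: if the point is in the right (eastern) half, write 1; if it is in the left (western) half, write 0. Keep the half that contains the point, then do the same for latitude: the top (northern) half is 1, the bottom (southern) half is 0. Keep alternating, so the bits interleave: longitude, latitude, longitude, latitude.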
After enough divisions, group the bits five at a time and map each group to a base32 character. Geohash uses its own base32 alphabet: the digits 0-9 plus the lowercase letters except a, i, l, and o. A 6-character geohash like "9q8yyk" represents roughly a 1.2 km by 600 m rectangle. More characters mean smaller rectangles and higher precision.
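The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `geohash_encode` and its parameters are chosen here for the example.

```python
# Geohash base32 alphabet: digits plus lowercase letters, skipping a, i, l, o.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, length: int = 6) -> str:
    """Encode a point by alternately halving the longitude and latitude ranges."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # accumulates the current 5-bit group
    bit_count = 0
    use_lon = True    # the first bit always comes from longitude

    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:            # right (eastern) half -> 1
                bits = (bits << 1) | 1
                lon_lo = mid
            else:                     # left (western) half -> 0
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:            # top (northern) half -> 1
                bits = (bits << 1) | 1
                lat_lo = mid
            else:                     # bottom (southern) half -> 0
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:            # every 5 bits become one base32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)
```

Because each extra character only appends bits, a longer geohash for a point always starts with the shorter geohash for the same point.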
Why Geohashing Matters
Standard databases index single columns efficiently. Two-dimensional queries on separate lat/lon columns require scanning or complex composite indexes. Geohashing converts 2D to 1D: a spatial query becomes a string prefix query, which a standard B-tree index can serve without modification.
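Prefix matching finds nearby points. All locations in the cell "9q8yy" share that prefix, so a query for "9q8yy%" returns every point in that cell. With a B-tree index this is an O(log n) seek followed by a scan of only the matching rows, not an O(n) scan of the whole table. For millions of points, this difference can determine whether queries take milliseconds or minutes. The converse does not always hold: two points on opposite sides of a cell boundary can be close together yet share only a short prefix, which is why queries usually check neighboring cells as well.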
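To illustrate why a prefix query is cheap on sorted data, here is a small sketch that uses a sorted Python list and binary search as a stand-in for a B-tree index. The function name `prefix_scan` and the sample geohashes are made up for the example.

```python
from bisect import bisect_left


def prefix_scan(sorted_hashes: list[str], prefix: str) -> list[str]:
    """Return every geohash in a sorted list that starts with `prefix`."""
    if not prefix:
        return list(sorted_hashes)
    # First entry that is >= the prefix.
    lo = bisect_left(sorted_hashes, prefix)
    # Smallest string that sorts after everything with this prefix:
    # bump the last character by one code point ("9q8yy" -> "9q8yz").
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect_left(sorted_hashes, upper)
    return sorted_hashes[lo:hi]


# Hypothetical sample data, kept sorted the way an index would keep it.
index = sorted(["9q8yyk", "9q8yym", "9q8yz0", "9q9p1d", "dr5reg", "9q8yy2"])
matches = prefix_scan(index, "9q8yy")
```

The two binary searches cost O(log n), and the slice touches only the matching entries. In SQL the same idea is often written as a range predicate such as `geohash >= '9q8yy' AND geohash < '9q8yz'`, which an ordinary B-tree index can serve.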
Precision Levels
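Each additional character adds 5 bits, so a cell shrinks by a factor of 32 in area: one dimension shrinks by 8 and the other by 4, alternating from character to character. Approximate cell sizes at the equator: 4 characters: 39 km by 19 km. 5 characters: 4.9 km by 4.9 km. 6 characters: 1.2 km by 600 m. 8 characters: 38 m by 19 m. 12 characters: about 3.7 cm by 1.9 cm. Cell width also narrows away from the equator. Choose based on query radius: use the coarsest precision that still filters effectively.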
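These sizes follow directly from the bit counts. The sketch below reproduces them, assuming roughly 111.32 km per degree at the equator (an approximation) and a hypothetical helper name `cell_size_m`.

```python
import math

METERS_PER_DEGREE = 111_320.0  # assumption: approximate length of one degree at the equator


def cell_size_m(length: int, lat: float = 0.0) -> tuple[float, float]:
    """Approximate (width, height) in meters of a geohash cell of the given length."""
    total_bits = 5 * length
    lon_bits = (total_bits + 1) // 2   # longitude goes first, so it gets the extra bit
    lat_bits = total_bits // 2
    lon_deg = 360.0 / (1 << lon_bits)  # degrees of longitude spanned by one cell
    lat_deg = 180.0 / (1 << lat_bits)  # degrees of latitude spanned by one cell
    width = lon_deg * METERS_PER_DEGREE * math.cos(math.radians(lat))
    height = lat_deg * METERS_PER_DEGREE
    return width, height


for n in (4, 5, 6, 8, 12):
    w, h = cell_size_m(n)
    print(f"{n:>2} chars: {w:.3f} m wide x {h:.3f} m tall")
```

Passing a non-zero `lat` shows how cells get narrower toward the poles, which matters when choosing a precision for high-latitude data.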
💡 Key Insight: Geohashing trades perfect accuracy for indexability. It converts expensive 2D range queries into cheap 1D prefix queries. The approximation (rectangular cells, points near cell boundaries) is acceptable because a post-filter on exact distance removes the false positives.
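A post-filter can be as simple as an exact distance check over the candidates returned by the prefix query. The sketch below uses the haversine formula with a mean Earth radius of 6,371 km (an approximation); the names `haversine_m` and `post_filter` are illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumption: mean Earth radius, spherical model


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def post_filter(center, candidates, radius_m):
    """Keep only candidates whose exact distance to center is within radius_m."""
    lat0, lon0 = center
    return [
        (lat, lon)
        for lat, lon in candidates
        if haversine_m(lat0, lon0, lat, lon) <= radius_m
    ]
```

The prefix query keeps the candidate set small, so running this check on every candidate stays cheap.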

💡 Key Takeaways

✓ Geohash encodes lat/lon into a string; nearby points usually share prefixes

✓ Converts 2D spatial queries to 1D string prefix queries for B-tree indexing

✓ 6 characters: 1.2 km by 600 m cells; 8 characters: 38 m by 19 m

✓ Each additional character shrinks the cell by a factor of 32 in area (8 in one dimension, 4 in the other)

✓ Post-filter results for exact distance because cells are rectangular, not circular

📌 Interview Tips

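1. Explain encoding: interleave longitude and latitude bits, convert to base32. Each character adds 5 bits of precision

2. When asked why geohashing matters, emphasize the B-tree advantage: spatial queries become O(log n) prefix lookups

3. Calculate the precision needed: a cell plus its 8 neighbors is only guaranteed to cover a radius up to the cell's smaller dimension. For 1 km radius queries, 5-character geohashes (about 4.9 km cells) work with neighbor expansion; 6-character cells (1.2 km by 600 m) need more than one ring of neighbors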
