What is Geohashing and How Does it Work?
Geohashing is a spatial indexing technique that encodes a latitude/longitude pair into a single compact string by interleaving the bits of the two values. The interleaving traces a Z order (Morton) curve over a hierarchical grid: each bit halves the current cell, alternating between longitude and latitude, and every 5 bits are emitted as one base 32 character. Each character in the resulting string therefore selects one of 32 sub cells of the previous level, creating progressively smaller rectangles.
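The encoding can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard geohash alphabet and the usual convention that the first bit refines longitude; `geohash_encode` is a name chosen here, not a library function.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet (no a, i, l, o)

def geohash_encode(lat: float, lon: float, precision: int = 7) -> str:
    """Encode a coordinate as a geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0        # accumulates the 5 bits of the current character
    bit_count = 0
    use_lon = True  # bits alternate: longitude first, then latitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid  # keep the eastern half
            else:
                bits <<= 1
                lon_hi = mid  # keep the western half
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid  # keep the northern half
            else:
                bits <<= 1
                lat_hi = mid  # keep the southern half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base 32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)

print(geohash_encode(57.64911, 10.40744, 5))  # u4pru
```

Because each character only refines the cell chosen by the previous ones, a shorter geohash of the same point is always a prefix of a longer one.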
The core power of geohashing comes from prefix locality: spatially close locations tend to share longer string prefixes. For example, two coffee shops 100 meters apart might share the first 7 characters of their geohash, while shops in different cities share only 3 or 4 characters. This property enables fast proximity searches using simple range scans in ordered key value stores, rather than complex geometric operations. The converse does not always hold: two points on opposite sides of a cell boundary can be very close yet share only a short prefix, which is why proximity queries also scan the neighboring cells.
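As a rough sketch of a prefix range scan, the snippet below uses a sorted Python list as a stand-in for an ordered key value store. The geohash strings are hypothetical sample keys, and `prefix_scan` and `shared_prefix_len` are illustrative helper names.

```python
from bisect import bisect_left

def prefix_scan(sorted_keys: list[str], prefix: str) -> list[str]:
    """Return every key that starts with `prefix` using two binary searches."""
    lo = bisect_left(sorted_keys, prefix)
    # "~" sorts after every character of the geohash alphabet,
    # so prefix + "~" is an upper bound for all keys under the prefix.
    hi = bisect_left(sorted_keys, prefix + "~")
    return sorted_keys[lo:hi]

def shared_prefix_len(a: str, b: str) -> int:
    """Count the leading characters two geohashes have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Hypothetical index contents: 7 character geohashes kept in sorted order.
keys = sorted(["9q8yyk8", "u4pruyd", "u4pruyk", "u4prv0b", "u4xs1p2"])

print(prefix_scan(keys, "u4pru"))               # ['u4pruyd', 'u4pruyk']
print(shared_prefix_len("u4pruyd", "u4pruyk"))  # 6
```

The scan touches only the contiguous slice of keys under the prefix, which is what makes the lookup cheap in a store that keeps keys ordered.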
Precision is discrete and global: cell size is fixed by the number of characters, not chosen freely. A 1 character geohash covers a cell roughly 5,000 km across, 5 characters about 4.9 km, 7 characters about 153 meters, and 9 characters about 4.8 meters. However, these cells are rectangles in latitude/longitude space and are not equal area. Near the equator, a 6 character cell measures approximately 1.22 km east to west by 0.61 km north to south; toward the poles the east to west extent shrinks in proportion to the cosine of the latitude because the meridians converge.
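The cell sizes quoted above follow from how the bits are split between the two axes. The sketch below reproduces them, assuming a spherical Earth with about 111.32 km per degree; that constant and the name `cell_size_km` are assumptions made for illustration.

```python
import math

KM_PER_DEGREE = 111.32  # assumption: approximate length of one degree on a spherical Earth

def cell_size_km(precision: int, latitude_deg: float = 0.0) -> tuple[float, float]:
    """Approximate (width, height) in km of a geohash cell at a given latitude."""
    total_bits = 5 * precision
    lon_bits = (total_bits + 1) // 2  # longitude gets the extra bit when the total is odd
    lat_bits = total_bits // 2
    width_deg = 360.0 / (1 << lon_bits)
    height_deg = 180.0 / (1 << lat_bits)
    # East to west distance per degree shrinks with the cosine of the latitude.
    width_km = width_deg * KM_PER_DEGREE * math.cos(math.radians(latitude_deg))
    height_km = height_deg * KM_PER_DEGREE
    return width_km, height_km

for p in (1, 5, 6, 7, 9):
    w, h = cell_size_km(p)
    print(f"precision {p}: {w:.4g} km x {h:.4g} km at the equator")

# The same 6 character cell at 60 degrees latitude is about half as wide.
print(cell_size_km(6, latitude_deg=60.0))
```

Even precisions split their bits evenly, which is why 6 character cells are twice as wide as they are tall, while odd precisions give longitude one extra bit and come out roughly square at the equator.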
The typical storage format uses either base 32 strings or more commonly a compact 52 to 64 bit integer representation, consuming only 8 bytes versus 16 bytes for storing two double precision floats. This compactness combined with ordering properties makes geohashing ideal for high throughput distributed systems that need to index millions to billions of location points.
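A minimal sketch of the integer form, assuming 52 interleaved bits (26 per axis) with longitude first. `geohash_int` and `cell_key_range` are illustrative names, not functions from a particular library.

```python
import struct

def geohash_int(lat: float, lon: float, bits: int = 52) -> int:
    """Interleave `bits` longitude/latitude bits into one integer (longitude first)."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    value = 0
    for i in range(bits):
        if i % 2 == 0:  # even positions refine longitude
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            if bit:
                lon_lo = mid
            else:
                lon_hi = mid
        else:           # odd positions refine latitude
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            if bit:
                lat_lo = mid
            else:
                lat_hi = mid
        value = (value << 1) | int(bit)
    return value

def cell_key_range(cell: int, cell_bits: int, key_bits: int = 52) -> tuple[int, int]:
    """Inclusive range of full-width keys that fall inside a coarser cell."""
    shift = key_bits - cell_bits
    return cell << shift, ((cell + 1) << shift) - 1

lat, lon = 57.64911, 10.40744
key = geohash_int(lat, lon)             # 52 bit key for the point
cell = geohash_int(lat, lon, bits=25)   # same point at 25 bits (5 characters' worth)
lo, hi = cell_key_range(cell, 25)

print(lo <= key <= hi)                    # the point's key lies inside its cell's range
print(len(struct.pack(">Q", key)))        # bytes for the integer key
print(len(struct.pack(">dd", lat, lon)))  # bytes for two doubles
```

Integer keys keep the prefix property: a coarser cell corresponds to the high-order bits, so a prefix scan becomes a numeric range scan between `lo` and `hi`.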
💡 Key Takeaways
• Encodes latitude/longitude into a hierarchical base 32 string using bit interleaving, where each additional character refines the location by selecting 1 of 32 sub cells
• Prefix locality enables fast range scans: nearby points tend to share longer prefixes, so scanning the contiguous key range under a common prefix finds nearby objects without complex geometry
• Storage is compact: 8 bytes for a 64 bit integer geohash versus 16 bytes for two double precision floats, halving the footprint of stored coordinates in large indexes
• Cells are rectangles, not equal area regions: 6 character cells near the equator are roughly 1.22 km wide by 0.61 km tall, and they narrow in the longitude dimension toward the poles due to meridian convergence
• Precision steps are discrete and global: you choose from fixed cell sizes (5,000 km, 4.9 km, 153 m, 4.8 m, etc.) rather than arbitrary radius values
📌 Examples
Production key value pattern: Store each user's location as a 64 bit geohash integer in a secondary index keyed alongside the user ID. For a 'users within 150 m' query, compute the center geohash at precision 7 (153 m cells), generate its 8 neighbors, and scan the 9 resulting key ranges, one per cell. With 100 items per cell, the scan returns ~900 candidates, which a distance post filter reduces to ~300 actual results within the radius. A 3 by 3 block of 153 m cells spans only about 460 m, so a 1 km radius needs a coarser precision or a larger block of cells. This pattern can achieve single digit millisecond p50 latency at tens of thousands of queries per second per node.
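The neighbor scan and post filter can be sketched as follows. This is an illustration under stated assumptions, not a production implementation: a dict keyed by geohash cell stands in for the ordered store, the radius is kept near one cell size so the 3 by 3 block is guaranteed to cover it, and every function name is hypothetical.

```python
import math

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def encode_with_bounds(lat, lon, precision):
    """Return (geohash, [lat_lo, lat_hi], [lon_lo, lon_hi]) for the containing cell."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits = [], 0
    for i in range(precision * 5):
        rng, val = (lon_rng, lon) if i % 2 == 0 else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid
        else:
            bits <<= 1
            rng[1] = mid
        if i % 5 == 4:  # flush one base 32 character every 5 bits
            chars.append(BASE32[bits])
            bits = 0
    return "".join(chars), lat_rng, lon_rng

def cells_to_scan(lat, lon, precision):
    """Center cell plus its neighbors, found by re-encoding shifted cell centers."""
    _, (lat_lo, lat_hi), (lon_lo, lon_hi) = encode_with_bounds(lat, lon, precision)
    dlat, dlon = lat_hi - lat_lo, lon_hi - lon_lo
    clat, clon = (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2
    cells = set()
    for i in (-1, 0, 1):
        for j in (-1, 0, 1):
            nlat = clat + i * dlat
            if not -90.0 < nlat < 90.0:
                continue  # no cells beyond the poles
            nlon = (clon + j * dlon + 180.0) % 360.0 - 180.0  # wrap across the antimeridian
            cells.add(encode_with_bounds(nlat, nlon, precision)[0])
    return sorted(cells)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great circle distance in meters on a sphere of radius 6,371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371000.0 * math.asin(math.sqrt(a))

def nearby(index, lat, lon, radius_m, precision=7):
    """Gather candidates from the scanned cells, then filter by exact distance."""
    candidates = []
    for cell in cells_to_scan(lat, lon, precision):
        candidates.extend(index.get(cell, []))
    return [pid for pid, plat, plon in candidates
            if haversine_m(lat, lon, plat, plon) <= radius_m]

# Hypothetical data: "b" is about 100 m north of the center, "c" about 1.1 km north.
center = (57.64911, 10.40744)
index = {}
for pid, plat, plon in [("a", 57.64911, 10.40744), ("b", 57.65001, 10.40744), ("c", 57.65911, 10.40744)]:
    index.setdefault(encode_with_bounds(plat, plon, 7)[0], []).append((pid, plat, plon))

print(sorted(nearby(index, *center, radius_m=150.0)))
```

The post filter matters because the scanned block is a rectangle while the query is a circle; the corner regions contribute candidates that lie outside the radius.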
Firebase GeoFire pattern: A mobile app writes each user's location as a 7 character geohash. A proximity query for a 1 km radius expands to 8 to 12 geohash prefix ranges, and the client subscribes to realtime updates on those ranges. The typical 10 to 50 percent overfetch is the price of simple range subscription logic.
Aggregation example: A heatmap of 100 million delivery locations uses 5 character precision (4.9 km cells) for a city level view. In theory there are at most 32^5, about 33.6 million, buckets, but only 50,000 to 500,000 are actually populated globally, so the aggregation state fits easily in memory.
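As a small sketch of this kind of aggregation, assume the points are already stored as geohash strings of at least 5 characters. Bucketing then reduces to truncating each string and counting; the sample geohashes and the name `heatmap_buckets` are hypothetical.

```python
from collections import Counter

def heatmap_buckets(geohashes, precision=5):
    """Count points per coarse cell by truncating each geohash to `precision` characters."""
    return Counter(gh[:precision] for gh in geohashes if len(gh) >= precision)

# Hypothetical stored locations as 7 character geohashes.
points = ["u4pruyd", "u4pruyk", "u4prv0b", "9q8yyk8"]
buckets = heatmap_buckets(points)

print(buckets.most_common())
print(32 ** 5)  # upper bound on the number of distinct 5 character buckets
```

Only cells that actually contain points get an entry, which is why the populated bucket count stays far below the theoretical maximum.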