Geospatial & Location ServicesMap Matching & RoutingHard⏱️ ~3 min

Implementing HMM Map Matching with Spatial Indexes

The core implementation revolves around efficiently finding candidate road edges for each GPS point and computing probabilities across the resulting lattice. Store the road network as a directed multigraph with nodes for intersections and edges for road segments, each edge annotated with length, directionality, speed limit, and optionally historical travel time distributions. Build a spatial index over edges using an R-tree keyed by edge bounding boxes, enabling fast range queries to find candidates within a search radius. For each incoming GPS point, query the R-tree with a search radius tuned to the environment: 20 to 50 meters in suburban areas, up to 100 meters in dense urban cores. Limit results to K nearest neighbors, typically 5 to 10 candidates, to cap lattice complexity. Compute the perpendicular distance from the point to each candidate edge and the projection point on that edge. The emission probability is modeled as a Gaussian over this distance, scaled by the reported GPS accuracy: probability proportional to exp of negative squared distance over twice the variance. Optionally incorporate bearing difference between the GPS heading and edge direction, and speed consistency. Transition probabilities compare the shortest path distance on the road graph between consecutive projection points with the straight line haversine separation divided by the time delta. If the graph distance closely matches the expected travel distance, the transition is likely. Large detours or routes requiring illegal turns get penalized. Run Viterbi dynamic programming over the lattice: for each candidate at time t, compute the maximum probability path from time 0 considering all candidates at time t minus 1 and their transition probabilities. Traceback from the highest probability final state to reconstruct the matched sequence. Use a sticky bias to prefer staying on the current edge when evidence is weak, reducing spurious jumps.
💡 Key Takeaways
R-tree spatial index over edge bounding boxes enables fast candidate queries with search radius 20 to 100 meters depending on urban density, returning K nearest neighbors typically 5 to 10
Emission probability models as Gaussian over perpendicular distance scaled by GPS accuracy: exp of negative squared distance over 2 times variance, optionally weighted by bearing and speed consistency
Transition probability compares shortest path distance on road graph between projections with haversine straight line separation divided by time delta, penalizing large detours and illegal turns
Viterbi dynamic programming computes max probability path: for each candidate at time t, consider all candidates at t minus 1 and their transitions, traceback from highest probability final state
Sticky bias prefers staying on current edge when evidence is weak, reducing spurious jumps between parallel roads that plague systems without temporal consistency
Missing segments between matched projections require Dijkstra shortest path routing to interpolate, with traversal times from historical distributions indexed by H3 cells for fast spatial lookup
📌 Examples
Search radius of 50 meters in suburban area returns 5 candidate edges per point, generating 5 to the power of 10 equals 9.7 million possible paths for 10 points, pruned by Viterbi
Emission probability for perpendicular distance 15 meters with GPS accuracy 20 meters yields probability proportional to exp of negative 15 squared over 2 times 20 squared equals 0.68
Transition between projections 500 meters apart with haversine 450 meters over 30 seconds yields expected speed 15 meters per second, matching highway speed limit increases probability
Fast Map Matching implementation processes 45,000 points per second on single server using precomputed spatial indexes and optimized Viterbi lattice pruning
← Back to Map Matching & Routing Overview
Implementing HMM Map Matching with Spatial Indexes | Map Matching & Routing - System Overflow