HNSW: Graph Based Proximity Search
WHAT IS HNSW
Hierarchical Navigable Small World (HNSW) organizes vectors into a multi-layer graph structure. Each vector is a node; edges connect nearby vectors. The hierarchy resembles a skip list: top layers have sparse long-range connections for fast navigation, bottom layers have dense local connections for precision.
Search starts at a fixed entry point in the highest layer. From there, greedily walk toward the query by always moving to the neighbor closest to the query. When no closer neighbor exists, drop to the next layer and repeat. The bottom layer search finds the final candidates.
HOW THE HIERARCHY WORKS
Layer 0 (bottom) contains all vectors with dense connections—typically 32-64 edges per node. Higher layers contain exponentially fewer nodes, each with long-range connections spanning the embedding space.
Example: 1M vectors. Layer 0 has 1M nodes. Layer 1 has ~100K. Layer 2 has ~10K. Layer 3 has ~1K. Search in top layers crosses large distances quickly (100 hops might traverse the whole space). Final search in layer 0 refines to exact local neighborhood.
The probability of a node appearing in layer L is p^L where p is typically 1/M (M is the connectivity parameter). M=16 means ~6% of nodes appear in layer 1, ~0.4% in layer 2.
KEY PARAMETERS
M (max connections per node): Higher M = better recall but more memory and slower construction. Typical values: 16-64. Each doubling of M roughly doubles index size.
efConstruction: How many candidates to consider when building the graph. Higher = better graph quality but slower indexing. Range: 100-500. Use higher values for one-time offline builds.
efSearch: How many candidates to explore at query time. Higher = better recall but higher latency. Start at 64, tune up until recall target is met. This is your primary latency-recall knob.
PERFORMANCE CHARACTERISTICS
Memory: ~4 bytes per dimension + M × 4 bytes per link × layers. For 768-dim vectors with M=16, expect ~4KB per vector total. 1M vectors = 4GB index.
Latency: 1-5ms for top-10 at recall 0.95 on 1M vectors. Scales as O(log N) with dataset size.