
HNSW: Graph-Based Proximity Search

WHAT IS HNSW

Hierarchical Navigable Small World (HNSW) organizes vectors into a multi-layer graph structure. Each vector is a node; edges connect nearby vectors. The hierarchy resembles a skip list: top layers have sparse long-range connections for fast navigation, bottom layers have dense local connections for precision.

Search starts at a fixed entry point in the highest layer. From there, greedily walk toward the query by always moving to the neighbor closest to the query. When no closer neighbor exists, drop to the next layer and repeat. The bottom layer search finds the final candidates.
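The greedy descent described above can be sketched in a few lines of pure Python. This is a toy illustration, not a real HNSW implementation: the points, the two hand-built layers, and the 1-D distance are all made up for clarity.

```python
# Toy greedy descent over a hand-built two-layer graph of 1-D points.
# Node id == index into `points`. Hypothetical data for illustration only.
points = [0, 10, 20, 30, 40, 50, 60, 70]
layer1 = {0: [4], 4: [0, 7], 7: [4]}   # sparse layer: long-range links
layer0 = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}

def greedy(layer, start, query):
    """Move to the neighbor closest to query until no neighbor is closer."""
    cur = start
    while True:
        nxt = min(layer.get(cur, []),
                  key=lambda n: abs(points[n] - query), default=cur)
        if abs(points[nxt] - query) >= abs(points[cur] - query):
            return cur      # local minimum: drop to the next layer
        cur = nxt

entry = greedy(layer1, start=0, query=43)    # coarse hop in the top layer
result = greedy(layer0, start=entry, query=43)  # refine in the bottom layer
```

For query 43, the top-layer walk jumps from node 0 straight to node 4 (value 40), and the bottom-layer walk confirms it as the nearest neighbor.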

HOW THE HIERARCHY WORKS

Layer 0 (bottom) contains all vectors with dense connections—typically 32-64 edges per node. Higher layers contain exponentially fewer nodes, each with long-range connections spanning the embedding space.

Example: 1M vectors. Layer 0 has 1M nodes. Layer 1 has ~100K. Layer 2 has ~10K. Layer 3 has ~1K (a round decay factor of ~10 per layer for illustration). Search in the top layers crosses large distances quickly, since each hop in a sparse layer can span a large fraction of the space. The final search in layer 0 refines the result to the exact local neighborhood.

The probability of a node appearing in layer L is p^L where p is typically 1/M (M is the connectivity parameter). M=16 means ~6% of nodes appear in layer 1, ~0.4% in layer 2.
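The geometric decay above is easy to verify numerically. The sketch below computes the per-layer fractions for M=16 and also samples node levels the way HNSW typically does, via floor(-ln(U) · mL) with mL = 1/ln(M), which yields the same p^L decay:

```python
import math
import random

M = 16
p = 1.0 / M   # promotion probability per layer

# Expected fraction of nodes present at layer L is p**L.
frac_l1 = p ** 1   # 6.25%  of nodes reach layer 1
frac_l2 = p ** 2   # ~0.39% of nodes reach layer 2

# HNSW assigns each node's top layer as floor(-ln(U) * mL), mL = 1/ln(M).
random.seed(0)
mL = 1.0 / math.log(M)
levels = [math.floor(-math.log(random.random()) * mL) for _ in range(100_000)]
sampled_l1 = sum(lvl >= 1 for lvl in levels) / len(levels)  # ~1/16
```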

KEY PARAMETERS

M (max connections per node): Higher M = better recall but more memory and slower construction. Typical values: 16-64. Each doubling of M roughly doubles the graph's link memory (the raw vector storage is unchanged).

efConstruction: How many candidates to consider when building the graph. Higher = better graph quality but slower indexing. Range: 100-500. Use higher values for one-time offline builds.

efSearch: How many candidates to explore at query time. Higher = better recall but higher latency. Start at 64, tune up until recall target is met. This is your primary latency-recall knob.
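The effect of efSearch can be demonstrated without any library. The sketch below builds a crude stand-in for a layer-0 graph (bidirectional links to each point's M nearest neighbors, as HNSW links are bidirectional) and runs the standard best-first search that keeps at most ef candidates. All data here is synthetic; this is a minimal sketch of the search loop, not production code.

```python
import heapq
import random

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(500)]

def dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

# Stand-in for a layer-0 graph: bidirectional links to M nearest neighbors.
M = 8
nbrs = [set() for _ in pts]
for i, p in enumerate(pts):
    for j in sorted(range(len(pts)), key=lambda t: dist(p, pts[t]))[1:M + 1]:
        nbrs[i].add(j)
        nbrs[j].add(i)

def search(q, ef, k=10, entry=0):
    """Best-first search keeping at most ef candidates (the efSearch knob)."""
    visited = {entry}
    d0 = dist(q, pts[entry])
    frontier = [(d0, entry)]   # min-heap: closest unexpanded node first
    best = [(-d0, entry)]      # max-heap (negated): the ef best seen so far
    while frontier:
        d, u = heapq.heappop(frontier)
        if len(best) >= ef and d > -best[0][0]:
            break              # closest frontier node is worse than all kept
        for v in nbrs[u]:
            if v not in visited:
                visited.add(v)
                dv = dist(q, pts[v])
                if len(best) < ef or dv < -best[0][0]:
                    heapq.heappush(frontier, (dv, v))
                    heapq.heappush(best, (-dv, v))
                    if len(best) > ef:
                        heapq.heappop(best)
    return [v for _, v in sorted((-d, v) for d, v in best)[:k]]

def recall(ef, queries, k=10):
    hits = 0
    for q in queries:
        truth = set(sorted(range(len(pts)), key=lambda t: dist(q, pts[t]))[:k])
        hits += len(truth & set(search(q, ef, k)))
    return hits / (k * len(queries))

queries = [(random.random(), random.random()) for _ in range(20)]
low, high = recall(ef=4, queries=queries), recall(ef=64, queries=queries)
```

With ef=4 the search can return at most 4 of the true top-10, so recall is capped at 0.4; raising ef to 64 recovers most of the true neighbors at the cost of visiting more nodes, which is exactly the recall-latency trade the efSearch parameter controls.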

PERFORMANCE CHARACTERISTICS

Memory: ~4 bytes per dimension (float32) plus ~4 bytes per link, with roughly 2M links at layer 0 and ~M per higher layer a node appears in. For 768-dim vectors with M=16, expect ~3.5-4KB per vector including overhead. 1M vectors ≈ 4GB index.
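The arithmetic behind those figures is a back-of-envelope estimate; the gap between the raw sum and the quoted ~4KB is allocator and metadata overhead:

```python
# Back-of-envelope memory estimate for 768-dim float32 vectors, M=16.
dim, M = 768, 16
vector_bytes = 4 * dim            # 3072 bytes of raw float32 data
# ~2*M link slots at layer 0 plus ~M on upper layers, 4 bytes per link id
link_bytes = 4 * (2 * M + M)      # 192 bytes
per_vector = vector_bytes + link_bytes
total_gb = per_vector * 1_000_000 / 1e9   # ~3.3 GB raw, ~4 GB with overhead
```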

Latency: 1-5ms for top-10 at recall 0.95 on 1M vectors. Scales as O(log N) with dataset size.

✅ Best Practice: HNSW excels when you have RAM to spare and need low latency with high recall. It is the default choice for in-memory vector search. Trade-off: index build is slow (hours for 100M vectors) and updates are expensive.
💡 Key Takeaways
HNSW uses multi-layer graph: sparse long-range connections at top, dense local connections at bottom
Search is greedy descent: start at top layer, walk toward query, drop layers until bottom
Key parameters: M (connections), efConstruction (build quality), efSearch (recall-latency knob)
Memory: ~4KB per 768-dim vector with M=16. Latency: 1-5ms for recall 0.95 on 1M vectors
📌 Interview Tips
1. Explain the layer hierarchy with concrete numbers: 1M vectors, layers 0-3, node distribution.
2. Describe the efSearch parameter as the primary tuning knob for the recall vs. latency tradeoff.