Approximate Nearest Neighbor: HNSW vs IVF-PQ
WHY APPROXIMATE SEARCH
Exact search compares query against every vector. At 1B vectors, 1μs per comparison = 1000 seconds. ANN algorithms build index structures eliminating most comparisons. HNSW and IVF-PQ are the two dominant approaches with different trade-offs.
HNSW: GRAPH-BASED SEARCH
HNSW builds a multi-layer graph. Top layers have few nodes with long-range connections; bottom has all nodes with local connections. Search navigates top to bottom. Query: 1-10ms at billion scale. Memory: full vectors (1-4KB each). Best for: low latency, high recall.
IVF-PQ: QUANTIZED SEARCH
IVF clusters vectors, searches relevant clusters only. PQ compresses vectors from 1KB to 64 bytes. Combined: search subset of compressed vectors. Query: 5-50ms. Memory: 16-64x less than HNSW. Best for: memory efficiency, acceptable latency trade-off.
CHOOSING BETWEEN THEM
HNSW: latency critical (<5ms), memory available, recall >98%. IVF-PQ: memory constrained, 20-50ms acceptable, 90-95% recall OK. Hybrid: HNSW for hot data, IVF-PQ for cold.