Approximate Nearest Neighbor Search: Trading Exactness for Scale
THE PROBLEM WITH EXACT SEARCH
Each item is represented as a vector (a list of numbers, typically 64 to 256 dimensions). To find similar items, compute the distance between the query vector and every item vector. With 100 million items and 128 dimensions, that is 12.8 billion floating point operations per query. At 10 GFLOPS per core, one query takes 1.3 seconds. Users expect results in 50 milliseconds. Exact search is fundamentally too slow at scale.
HOW ANN WORKS
ANN algorithms build index structures during an offline phase. At query time, they navigate these structures to quickly find candidate regions likely to contain nearest neighbors. Instead of checking 100 million items, they check perhaps 10,000. The index structure trades preprocessing time and memory for query speed.
RECALL AS THE QUALITY METRIC
ANN quality is measured by recall at k: what fraction of the true k nearest neighbors does the algorithm find? Recall of 0.95 means 95 of 100 true neighbors are returned. Most applications tolerate 95% recall for 100x speed improvement.