ScaNN: Learning Based Quantization for Maximum Inner Product Search
WHAT MAKES SCANN DIFFERENT
ScaNN (Scalable Nearest Neighbors) optimizes specifically for Maximum Inner Product Search (MIPS), the similarity metric used by most embedding models. Whereas FAISS's standard indexes train codebooks with generic k-means to minimize reconstruction error, ScaNN learns data-specific partitioning and quantization tuned to preserve inner products.
The key insight: for MIPS, errors in different directions have different costs. ScaNN learns an anisotropic (direction-dependent) loss function that penalizes errors based on their impact on inner product scores, not just Euclidean distance.
ANISOTROPIC VECTOR QUANTIZATION
Standard PQ minimizes reconstruction error uniformly in all directions. But for inner products, error along the query direction matters more than error perpendicular to it. ScaNN weights quantization error by how much it affects inner product with likely queries.
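A minimal numpy sketch of this idea: decompose the quantization residual into components parallel and orthogonal to the datapoint, and up-weight the parallel part, since that is the component that shifts inner products with queries that score this point highly. The weight `eta` and the function name are illustrative, not ScaNN's actual parameterization.

```python
import numpy as np

def anisotropic_loss(x, x_quantized, eta=4.0):
    """Direction-weighted quantization loss (illustrative sketch).

    The residual component parallel to x is penalized eta times more
    than the orthogonal component, because parallel error directly
    perturbs inner products with queries aligned to x.
    """
    r = x - x_quantized                # quantization residual
    r_par = (r @ x) / (x @ x) * x     # projection of r onto x
    r_orth = r - r_par                # remainder, orthogonal to x
    return eta * (r_par @ r_par) + (r_orth @ r_orth)

x = np.array([1.0, 0.0])
# Same Euclidean error magnitude (0.1), different directions:
err_parallel = anisotropic_loss(x, np.array([0.9, 0.0]))  # error along x
err_orth = anisotropic_loss(x, np.array([1.0, 0.1]))      # error perpendicular to x
```

Under plain Euclidean loss both reconstructions are equally bad; under the anisotropic loss the parallel error costs more, which is exactly the behavior the section describes.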
Training proceeds by sampling the query distribution, estimating how each quantization error shifts inner product scores, and weighting codebook training accordingly. The result: 10-30% higher recall at the same memory footprint compared to PQ trained purely for reconstruction error.
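One place this weighting enters training is the assignment step: each vector is assigned to the centroid that minimizes the direction-weighted loss rather than plain squared Euclidean distance. A hedged, self-contained sketch; the centroids, `eta`, and function name are all hypothetical.

```python
import numpy as np

def assign_anisotropic(x, centroids, eta=4.0):
    """Assign x to the centroid minimizing a direction-weighted loss
    (illustrative sketch, not ScaNN's actual training code)."""
    best, best_loss = 0, np.inf
    for i, c in enumerate(centroids):
        r = x - c
        r_par = (r @ x) / (x @ x) * x   # residual along x
        r_orth = r - r_par              # residual perpendicular to x
        loss = eta * (r_par @ r_par) + (r_orth @ r_orth)
        if loss < best_loss:
            best, best_loss = i, loss
    return best

x = np.array([1.0, 0.0])
# Both centroids are Euclidean distance 0.1 from x, but the second
# one's error is orthogonal to x, so the anisotropic loss prefers it.
chosen = assign_anisotropic(x, np.array([[0.9, 0.0], [1.0, 0.1]]))
```

A standard k-means assignment would treat the two centroids as tied; the anisotropic assignment breaks the tie in favor of the one that preserves inner products with queries aligned to x.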
ASYMMETRIC DISTANCE COMPUTATION
ScaNN uses asymmetric distance computation: query vectors are kept at full precision and only database vectors are quantized. At query time, the inner product between each full-precision query subvector and every codebook entry is computed once, and each database code is then scored by table lookup.
Why asymmetric? Query computation happens once per query, database code storage happens once per vector. Keeping queries unquantized adds negligible query-time cost but significantly improves accuracy compared to quantizing both sides.
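The lookup-table mechanics behind asymmetric scoring can be sketched in numpy. All dimensions, names, and the random codebooks below are illustrative; a real system would train the codebooks offline.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_sub, k = 8, 4, 16            # dim, subspaces, codewords per subspace
sub = d // n_sub                  # dims per subspace
codebooks = rng.normal(size=(n_sub, k, sub))    # stand-in for trained codebooks
codes = rng.integers(0, k, size=(1000, n_sub))  # quantized database (one code per subspace)

def score_all(query):
    """Inner product between a full-precision query and every quantized
    database vector: build one lookup table per subspace, then score
    each database vector with n_sub table lookups."""
    q_sub = query.reshape(n_sub, sub)
    # tables[m, j] = <query subvector m, codeword j of subspace m>
    tables = np.einsum('ms,mks->mk', q_sub, codebooks)
    # Sum the looked-up partial inner products across subspaces.
    return tables[np.arange(n_sub), codes].sum(axis=1)

query = rng.normal(size=d)
scores = score_all(query)
top10 = np.argsort(-scores)[:10]
```

The asymmetry is visible in the code: the query never touches a codebook index, so the only approximation error comes from the database side.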
WHEN TO USE SCANN VS FAISS
Use ScaNN when: Your similarity metric is inner product or cosine (which converts to inner product after normalization). Your query distribution is stable enough to train anisotropic quantization. You need maximum recall per byte of memory.
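The cosine-to-inner-product reduction mentioned above is easy to verify directly: after L2-normalizing both vectors, their inner product equals their cosine similarity, which is why ScaNN's MIPS optimizations also apply to cosine workloads.

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([1.0, 2.0])

cosine = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# After normalization, a plain inner product recovers the cosine.
a_n = a / np.linalg.norm(a)
b_n = b / np.linalg.norm(b)
assert np.isclose(a_n @ b_n, cosine)
```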
Use FAISS when: You need L2 distance. Your system is already built on the FAISS ecosystem. You need features ScaNN lacks (GPU acceleration, certain index types).
Benchmarks show ScaNN achieving 10-30% higher recall than FAISS IVF-PQ at equivalent memory usage on MIPS workloads. The gap narrows for L2 distance, where ScaNN's MIPS-specific optimizations do not apply.