KNN early termination in Manticore Search
5 days ago
- #Manticore Search
- #vector search
- #HNSW algorithm
- Vector search converts documents and queries into numerical embeddings and finds the closest matches.
- Manticore Search uses HNSW for fast vector search but faces inefficiencies as the algorithm continues exploring after results converge.
- Early termination detects convergence by monitoring discovery rates and stops the search early to save computational effort.
- The algorithm uses an adaptive threshold based on quantiles to decide when to stop, ensuring minimal precision loss (2-4%).
- Savings increase with larger k values, reducing distance computations by up to 80% at k=10000.
- Under concurrent load, latency improvements nearly double due to reduced memory and cache pressure.
- Early termination is enabled by default but can be disabled for maximum precision or small k values.
- It works with quantized vectors and oversampling, enhancing efficiency in expanded searches.
- The feature complements other KNN optimizations like prefiltering and rescoring without interference.