Faster KNN search in Manticore: 2-pass HNSW, batched distances, and AVX-512
21 hours ago
- #AVX-512 Performance
- #KNN Search Optimization
- #HNSW Algorithm
- Manticore Search 27.1.5 introduces three optimizations to its HNSW-based KNN search for faster throughput: 2-pass neighbor processing with prefetching, batched distance computation, and compile-time distance function specialization.
- The 2-pass neighbor processing splits neighbor visits into two phases: collecting unvisited neighbors with prefetching in Pass 1 and computing distances in Pass 2, improving cache efficiency and enabling batching.
- Batched distance computation processes candidates in pairs, reducing redundant query vector loads, and is supported for inner product, L2, and binary-quantized variants.
- AVX-512 support provides new distance functions that process 16 floats per iteration, with automatic CPU detection and library loading, though it may show slight regressions at low k values.
- Benchmarks on a dbpedia-openai-1M-1536-angular dataset show algorithmic improvements alone yield up to 24% higher throughput at high k on a single thread, with combined AVX-512 and algorithmic changes reaching up to 29% improvement.
- Gains scale with k as distance computation dominates workload; they also benefit from high-dimensional vectors, oversampling, and work with existing features like early termination and filtering without API changes or index rebuilds.