Elasticsearch's BBQ vs. TurboQuant: 10–40× faster on CPU and lower ranking noise

11 hours ago
  • #quantization
  • #vector-search
  • #performance
  • Elasticsearch provides extensive developer tools including vector search and REST APIs.
  • Elasticsearch's Optimized Scalar Quantization (OSQ) outperforms TurboQuant in CPU vector search throughput, ranking accuracy, and storage efficiency.
  • Scalar quantization compresses embedding vectors to small integers to reduce storage and speed up scoring.
  • OSQ uses uniform grid quantization with features like centering and anisotropic loss to improve accuracy.
  • TurboQuant applies a Hadamard rotation and then quantizes to non-uniform centroids. This targets optimal mean squared error (MSE) but makes scoring more expensive.
  • In the reported tests, OSQ's symmetric kernels run 10–40× faster than TurboQuant's, with the largest gap on an Apple M2 Max.
  • OSQ's block-diagonal preconditioner delivers the benefits of a Hadamard rotation without the overhead of padding vectors to a power-of-two dimension.
  • On dot-product accuracy, OSQ does best when the angle between the vectors is small and when the data is shifted away from the origin, because centering removes that shift before quantization.
  • TurboQuant's throughput is limited by data-dependent gather operations, whereas OSQ scores vectors with plain integer arithmetic.
  • For CPU-based search, OSQ is superior in throughput, ranking accuracy, and storage efficiency.
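
The ideas in the bullets above (centering, a uniform integer grid, and scoring with integer arithmetic) can be sketched in a few lines of plain Python. This is a simplified illustration, not Elasticsearch's OSQ implementation: it omits the anisotropic loss and the block-diagonal preconditioner, it takes each vector's min and max as the grid bounds, and every name in it (`mean_vector`, `quantize`, `centered_dot_estimate`, the sample `docs`) is hypothetical.

```python
from typing import List, Tuple

# (integer codes, lower bound of the grid, grid step size)
Quantized = Tuple[List[int], float, float]


def mean_vector(vectors: List[List[float]]) -> List[float]:
    """Per-dimension mean, used as the centering point."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def quantize(vec: List[float], center: List[float], bits: int = 4) -> Quantized:
    """Center the vector, then snap each component to a uniform grid."""
    centered = [x - c for x, c in zip(vec, center)]
    lo, hi = min(centered), max(centered)
    levels = (1 << bits) - 1                      # e.g. 15 for 4 bits
    step = (hi - lo) / levels if hi > lo else 1.0
    # Clamp to guard against floating-point overshoot at the top of the range.
    codes = [min(levels, max(0, round((x - lo) / step))) for x in centered]
    return codes, lo, step


def centered_dot_estimate(qa: Quantized, qb: Quantized) -> float:
    """Estimate the dot product of two centered vectors from their codes.

    Each component is approximated as lo + step * code, so the product
    expands into one integer dot product plus per-vector corrections.
    """
    codes_a, lo_a, step_a = qa
    codes_b, lo_b, step_b = qb
    dim = len(codes_a)
    int_dot = sum(a * b for a, b in zip(codes_a, codes_b))  # integer-only loop
    sum_a, sum_b = sum(codes_a), sum(codes_b)               # can be stored at index time
    return (dim * lo_a * lo_b
            + lo_a * step_b * sum_b
            + lo_b * step_a * sum_a
            + step_a * step_b * int_dot)


# Made-up sample data whose mean is far from the origin.
docs = [
    [10.2, -3.1, 0.5, 7.0],
    [9.8, -2.9, 0.1, 6.6],
    [10.5, -3.4, 0.9, 7.3],
]
center = mean_vector(docs)
qa = quantize(docs[0], center)
qb = quantize(docs[1], center)
estimate = centered_dot_estimate(qa, qb)
```

In this sketch the only per-dimension work at scoring time is the integer multiply-add inside `int_dot`; the remaining terms depend on per-vector scalars that could be precomputed. A scheme with non-uniform centroids would instead have to look up each code in a table before multiplying, which is the kind of data-dependent access the summary identifies as TurboQuant's bottleneck.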