C++26 Shipped a SIMD Library Nobody Asked For
5 hours ago
- #Performance Optimization
- #SIMD
- #C++ Programming
- C++26 includes a portable SIMD library, std::simd, whose goal is to let developers write SIMD code once and compile it for multiple architectures.
- The library suffers from critical deficiencies: 10x slower compilation, slower runtime than scalar loops, and no way to express many real SIMD operations.
- It defaults to a 128-bit vector width regardless of the hardware, so it loses performance to scalar loops that the compiler auto-vectorizes.
- Alternatives like Google Highway and SIMDe have emerged, offering better portability, runtime dispatch, and broader architecture support.
- Because std::simd is a template library rather than a language feature, the abstraction hinders compiler optimization, which affects algebraic simplifications and cross-lane operations.
- Key SIMD features such as shuffles, width-specific arithmetic, and alignment support are missing from std::simd, which covers only trivial element-wise operations.
- Language-level fixes, such as improved integer promotion and alignment in the type system, are needed, but std::simd does not address them.
- ISPC provides a language-level SIMD solution with better code generation, but it requires a separate compiler and build process.
- Overall, std::simd is criticized as an outdated solution that doesn't meet the needs of performance-critical SIMD programming.
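
The language-level points in the list above (integer promotion, and alignment living outside the type system) can be illustrated without std::simd at all. The following is a minimal sketch in plain standard C++, not code from the original article; the names `promoted_sum`, `add_wrap_u8`, `add_u8`, and `aligned_buf` are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Integer promotion: both operands are uint8_t, yet the sum has type int.
static_assert(std::is_same<decltype(std::uint8_t{} + std::uint8_t{}), int>::value,
              "uint8_t + uint8_t is promoted to int");

// Hypothetical helper: returns the promoted sum as an int (no 8-bit wraparound).
inline int promoted_sum(std::uint8_t a, std::uint8_t b) {
    return a + b;
}

// Hypothetical helper: 8-bit wrapping add. The narrowing back to uint8_t
// has to be spelled out with an explicit cast.
inline std::uint8_t add_wrap_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}

// Hypothetical helper: element-wise wrapping add over n bytes, written as a
// plain scalar loop of the kind a compiler may choose to auto-vectorize.
inline void add_u8(const std::uint8_t* a, const std::uint8_t* b,
                   std::uint8_t* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = add_wrap_u8(a[i], b[i]);
    }
}

// Alignment: alignas applies to this object, but a pointer to its elements is
// still a plain float*, so the alignment is not visible in the pointer type.
alignas(32) float aligned_buf[8] = {};
static_assert(std::is_same<decltype(&aligned_buf[0]), float*>::value,
              "pointer type does not record the 32-byte alignment");
```

With these definitions, `promoted_sum(200, 100)` evaluates to 300, while `add_wrap_u8(200, 100)` evaluates to 44, which is the 8-bit wraparound a byte-wide SIMD lane would produce.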