The state of SIMD in Rust in 2025
17 days ago
- #Rust
- #Performance
- #SIMD
- SIMD (Single Instruction, Multiple Data) lets a CPU apply one instruction to several data elements at once, which can speed up data-parallel code.
- Different architectures have their own SIMD extensions: ARM has NEON, WebAssembly has 128-bit packed SIMD, and x86 has SSE2, AVX, AVX2, and AVX-512.
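- Not every x86 CPU supports every SIMD extension (SSE2 is the x86-64 baseline; AVX, AVX2, and AVX-512 are not guaranteed), so code that uses the newer ones needs either function multiversioning with runtime detection or compilation flags that target a specific CPU.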
- Four approaches to SIMD in Rust: automatic vectorization, fancy iterators, portable SIMD abstractions, and raw intrinsics.
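- Automatic vectorization is compiler-driven and needs no code changes, but it is limited, especially for floating-point operations, where the compiler generally may not reorder arithmetic such as a sum reduction.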
- Fancy iterators (e.g., `faster` crate) aim to parallelize iterators for SIMD but have seen limited success.
- Portable SIMD abstractions (e.g., `std::simd`, `wide`, `pulp`, `macerator`) offer cross-platform support but vary in maturity and features; `std::simd`, for example, is still nightly-only.
- Raw intrinsics provide low-level control but require platform-specific implementations and manual multiversioning.
- Choosing the right approach depends on use case: autovectorization for simplicity, intrinsics for specific hardware, and portable abstractions for broader compatibility.
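
To make the raw-intrinsics and multiversioning points above concrete, here is a minimal sketch of manual multiversioning on stable Rust. It is an illustration under stated assumptions, not code from the article: the function names `sum`, `sum_avx`, and `sum_scalar` are hypothetical, and the example assumes an `f32` sum is the operation of interest. The AVX path is compiled only on x86-64 and selected at runtime; every other case falls back to a plain iterator sum.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Scalar fallback (hypothetical name). Float addition is not associative,
/// so the compiler typically will not vectorize this reduction on its own.
fn sum_scalar(xs: &[f32]) -> f32 {
    xs.iter().sum()
}

/// AVX version (hypothetical name): adds eight lanes per instruction.
/// Safety: the caller must ensure the running CPU supports AVX.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn sum_avx(xs: &[f32]) -> f32 {
    unsafe {
        let mut acc = _mm256_setzero_ps();
        let chunks = xs.chunks_exact(8);
        let tail = chunks.remainder();
        for c in chunks {
            // Unaligned load of 8 consecutive f32 values.
            let v = _mm256_loadu_ps(c.as_ptr());
            acc = _mm256_add_ps(acc, v);
        }
        // Spill the 8 partial sums and reduce them, then add the leftovers.
        let mut lanes = [0.0f32; 8];
        _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
        lanes.iter().sum::<f32>() + tail.iter().sum::<f32>()
    }
}

/// Dispatcher: picks the best available implementation at runtime.
pub fn sum(xs: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            // Sound because the feature check above succeeded.
            return unsafe { sum_avx(xs) };
        }
    }
    sum_scalar(xs)
}
```

Calling `sum(&data)` works on any target. Because the SIMD path adds the elements in a different order than the scalar path, the two can differ in the last bits for general floating-point input, which is the same reordering the compiler is not allowed to introduce by itself. A portable abstraction would hide both the `cfg` and the dispatch, at the cost of depending on that abstraction's maturity.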