Sep 0.11.0 – 9.5 GB/s CSV Parsing Using ARM NEON SIMD on Apple M1
a year ago
- #Performance
- #ARM
- #SIMD
- Sep 0.11.0 was released with a new ARM NEON SIMD parser optimized for ARM64 CPUs like Apple M1 and Microsoft Cobalt 100.
- Parsing throughput reaches 9.5 GB/s on Apple M1 (up from ~7 GB/s) and ~6 GB/s on Cobalt 100 (up from ~4 GB/s).
- Sep is now ~14x faster than CsvHelper and 6.4x faster than Sylvan.Data.Csv on Apple M1.
- The new parser uses ARM NEON SIMD (exposed as AdvSimd in .NET), relying on saturating narrowing conversion and an optimized move mask operation.
- Benchmarks were run on GitHub runners, which are virtual machines, so the results show higher variance.
- The blog post details the technical approach, including disassembling ARM64 code using NativeAOT and dumpbin on Windows.
- Key optimizations include processing 64 chars (1024 bits, as .NET chars are 16 bits wide) per iteration and using Geoff Langdale's efficient bulk move mask technique for ARM NEON.
- Links to additional resources on ARM SIMD optimizations and techniques are provided for further reading.
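
To make the saturating narrowing and bulk move mask steps more concrete, here is a minimal scalar sketch in Python. It is not Sep's C# code and it uses no real SIMD; it only models the data flow under stated assumptions: 64 UTF-16 code units are narrowed to 64 bytes with unsigned saturation (roughly what NEON's `vqmovn_u16` does), compared against a separator, and the four resulting 16-byte comparison vectors are folded into a single 64-bit mask with three rounds of pairwise adds (modeling `vpaddq_u8`), as in Geoff Langdale's published sequence. All function names and the `;` separator are hypothetical and chosen for illustration.

```python
# Scalar model of a NEON-style CSV separator scan. Illustration only:
# the names below are hypothetical and this is not Sep's implementation.

SEPARATOR = ord(";")  # assumed separator for this example

# Each byte lane keeps exactly one bit, so the pairwise adds never carry.
BIT_MASK = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80] * 2


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def saturating_narrow(units):
    """Model unsigned saturating narrowing: 16-bit values above 255 clamp to 255."""
    return [min(u, 0xFF) for u in units]


def compare_equal(lanes, value):
    """Model a byte-wise compare: 0xFF where the lane equals value, else 0x00."""
    return [0xFF if b == value else 0x00 for b in lanes]


def pairwise_add(a, b):
    """Model a 16-byte pairwise add: sums of adjacent pairs of a, then of b."""
    joined = a + b
    return [(joined[i] + joined[i + 1]) & 0xFF for i in range(0, 32, 2)]


def bulk_move_mask(v0, v1, v2, v3):
    """Fold four 16-byte compare results into one 64-bit mask (bit i = lane i)."""
    m = [[x & bit for x, bit in zip(v, BIT_MASK)] for v in (v0, v1, v2, v3)]
    sum0 = pairwise_add(m[0], m[1])   # 2 lanes per byte
    sum1 = pairwise_add(m[2], m[3])
    sum0 = pairwise_add(sum0, sum1)   # 4 lanes per byte
    sum0 = pairwise_add(sum0, sum0)   # 8 lanes per byte; low 8 bytes hold the mask
    return int.from_bytes(bytes(sum0[:8]), "little")


def separator_mask(text):
    """Return a 64-bit mask with bit i set where code unit i is the separator."""
    units = utf16_units(text)
    if len(units) != 64:
        raise ValueError("this sketch handles exactly 64 UTF-16 code units")
    narrowed = saturating_narrow(units)
    vectors = [compare_equal(narrowed[i:i + 16], SEPARATOR) for i in range(0, 64, 16)]
    return bulk_move_mask(*vectors)
```

Saturation matters here because a plain truncation would keep only the low byte: a character such as U+013B has the low byte 0x3B, the code for `;`, and would be reported as a separator. Clamping to 0xFF avoids that false match as long as 0xFF itself is not a character being searched for. For example, `separator_mask("a;b" + "x" * 61)` sets only bit 1 of the mask.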