Pass-by-Value Overhead
6 months ago
- #benchmarking
- #assembly
- #performance
- Passing structs by value vs. reference is a common dilemma for perfectionists.
- Benchmarking reveals that passing structs by value has overhead proportional to their size.
- Small structs (up to 256 bytes) are cheap to pass by value because the copy is done with unrolled SIMD register moves.
- Structs larger than 256 bytes are copied with `rep movs`, which is slower than unrolled vectorized moves.
- Performance of `rep movs` is periodic, with spikes at specific struct sizes (e.g., 4064-4080 bytes).
- Throughput drops with size: 16-byte structs can be passed by value about 730 million times per second, while 2048-byte structs manage only about 26 million times per second.
- AMD Zen CPUs have performance bugs with `rep movs` at certain struct sizes.
- Recommendation: Avoid passing structs of the problematic sizes (4046-4080, 8161-8176 bytes) by value on the AMD Ryzen 9 3900X.
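
The kind of measurement summarized above can be approximated with a small C harness. This is only a minimal sketch, not the original post's benchmark: the `DEFINE_BENCH` macro, the `S16`/`S2048` types, and the `take_*`/`bench_*` functions are hypothetical names chosen for illustration, and the `noinline` attribute assumes GCC or Clang. Each call goes through a `volatile` function pointer so the compiler has to materialize the by-value copy at the call site.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Keep the callee out of line so the struct is really copied per call.
   Assumption: GCC/Clang attribute syntax; other compilers get no hint. */
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Volatile sink so the results of the calls cannot be discarded. */
static volatile unsigned sink;

/* Generates, for a struct type T of N bytes:
     take_T(T)    - a callee that receives the struct by value
     bench_T(n)   - times n by-value calls, returns calls per second
                    (0.0 if the run was too short for clock() to resolve) */
#define DEFINE_BENCH(T, N)                                            \
    typedef struct { unsigned char bytes[N]; } T;                     \
    NOINLINE unsigned take_##T(T s) {                                 \
        return (unsigned)s.bytes[0] + s.bytes[(N) - 1];               \
    }                                                                 \
    double bench_##T(long iters) {                                    \
        unsigned (*volatile fn)(T) = take_##T;                        \
        T s;                                                          \
        clock_t t0, t1;                                               \
        double secs;                                                  \
        long i;                                                       \
        assert(iters > 0);                                            \
        memset(&s, 1, sizeof s);                                      \
        t0 = clock();                                                 \
        for (i = 0; i < iters; i++) {                                 \
            s.bytes[0] = (unsigned char)i; /* vary the input */       \
            sink += fn(s);                 /* by-value call */        \
        }                                                             \
        t1 = clock();                                                 \
        secs = (double)(t1 - t0) / CLOCKS_PER_SEC;                    \
        return secs > 0.0 ? (double)iters / secs : 0.0;               \
    }

/* Two of the sizes mentioned in the list above. */
DEFINE_BENCH(S16, 16)
DEFINE_BENCH(S2048, 2048)
```

Calling `bench_S16(100000000L)` and `bench_S2048(10000000L)` from a `main` and printing the results gives calls-per-second figures for the two sizes; adding further `DEFINE_BENCH` lines (for example near 4080 or 8176 bytes) would probe the sizes flagged as problematic. The absolute numbers depend heavily on the CPU, compiler, and optimization flags, and `clock()` measures CPU time at coarse resolution, so long runs are needed for stable results.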