Hasty Briefs (beta)

Why is calling my asm function from Rust slower than calling it from C?

4 months ago
  • #profiling
  • #Rust
  • #performance
  • Identified one assembly function, cdef_filter4_pri_edged_8bpc_neon, that ran 30% slower when called from the Rust implementation than from the C baseline.
  • Traced the slowdown to slower data loads in the Rust version, caused by the Rust caller storing extra data on the stack.
  • Found that the root cause was the compiler's inability to optimize a Rust abstraction across function pointers.
  • Fixed it by making the WithOffset struct FFI-safe and restructuring how data is passed across the FFI boundary, which brought the Rust version to within 5% of the C version.
  • Used samply for profiling and cargo asm to inspect the generated assembly, both to diagnose the problem and to verify the improvement.
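
The summary does not show the actual code, so the following is only a minimal sketch of the general idea of an FFI-safe struct passed by value through a function pointer. Everything here is an assumption for illustration: the fields of `WithOffset`, the `FilterFn` signature, and the `sum_bytes` function (a plain Rust stand-in for an assembly routine) are hypothetical and are not taken from the original project.

```rust
/// Hypothetical FFI-safe layout: `#[repr(C)]` gives the struct a defined
/// C-compatible layout, so it can be passed by value across an
/// `extern "C"` boundary. The real `WithOffset` fields are not given
/// in the summary; these two are assumed.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct WithOffset {
    pub ptr: *const u8,
    pub offset: usize,
}

/// Hypothetical function-pointer type, standing in for the kind of
/// pointer used to dispatch to an assembly routine at runtime.
pub type FilterFn = unsafe extern "C" fn(src: WithOffset, len: usize) -> u32;

/// Stand-in for an assembly routine: sums `len` bytes starting at
/// `src.ptr + src.offset`.
///
/// Safety: the caller must guarantee that the range is readable.
pub unsafe extern "C" fn sum_bytes(src: WithOffset, len: usize) -> u32 {
    let mut total = 0u32;
    for i in 0..len {
        // Read each byte through the raw pointer.
        total += u32::from(unsafe { *src.ptr.add(src.offset + i) });
    }
    total
}

/// Safe wrapper: checks bounds, builds the struct, and calls through
/// the function pointer. Because the call is indirect, the compiler
/// cannot inline the callee, so the argument layout is what it is
/// at the call site.
pub fn call_filter(f: FilterFn, buf: &[u8], offset: usize, len: usize) -> u32 {
    assert!(offset <= buf.len() && len <= buf.len() - offset);
    let src = WithOffset { ptr: buf.as_ptr(), offset };
    // Safety: the bounds were checked above.
    unsafe { f(src, len) }
}
```

In this sketch, `call_filter(sum_bytes, &buf, 1, 3)` sums three bytes of `buf` starting at index 1. The design point it illustrates is that a `#[repr(C)]` struct can cross the boundary directly, rather than being unpacked into separate values before an indirect call.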