Hasty Briefs (beta)

Tracking down a 25% Regression on LLVM RISC-V

3 days ago
  • #Performance Optimization
  • #RISC-V
  • #LLVM
  • An LLVM commit improved isKnownExactCastIntToFP, which let InstCombine fold fpext(sitofp x to float) to double into uitofp x to double, but it inadvertently broke a downstream narrowing optimization in visitFPTrunc.
  • This caused a ~24% performance regression on RISC-V targets because fdiv.d (33-cycle latency) was emitted instead of fdiv.s (19-cycle latency).
  • The fix extended getMinimumFPType with range analysis so that it recognizes that fptrunc(uitofp x to double) to float can be reduced to uitofp x to float, restoring the narrowing optimization.
  • Analysis traced the regression to a benchmark loop in which LLVM emitted fdiv.d while GCC emitted fdiv.s, so the LLVM build spent more cycles in that loop.
  • The issue was traced to a specific InstCombine commit that changed the behavior of isKnownExactCastIntToFP, which prevented visitFPTrunc from narrowing the double operation to float.
  • A patch was submitted and merged. It modifies canBeCastedExactlyIntToFP and getMinimumFPType to handle integer-to-FP casts combined with fptrunc, so the operations are emitted in float again and the lost performance is recovered.
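
The exactness question behind these bullets can be sketched in a few lines of Python. This is a simplified illustration, not LLVM's implementation: the helper names (`significant_bits`, `cast_is_exact`, `range_is_exact`, `to_float32`) are hypothetical, and the sketch assumes IEEE 754 binary32 and binary64 with 24 and 53 bits of significand precision and ignores exponent overflow. The idea is that an integer converts to a floating-point type without rounding when its significant bits fit in the significand, and that a known upper bound on the value (the kind of fact range analysis provides) is enough to prove this for every possible input.

```python
import struct

FLOAT_PRECISION = 24   # significand bits of IEEE 754 binary32 (implicit bit included)
DOUBLE_PRECISION = 53  # significand bits of IEEE 754 binary64


def significant_bits(n: int) -> int:
    """Bits needed to hold |n| once trailing zero bits are dropped."""
    n = abs(n)
    if n == 0:
        return 0
    trailing_zeros = (n & -n).bit_length() - 1
    return (n >> trailing_zeros).bit_length()


def cast_is_exact(n: int, precision: int) -> bool:
    """True if this one integer converts without rounding (exponent range ignored)."""
    return significant_bits(n) <= precision


def range_is_exact(max_unsigned: int, precision: int) -> bool:
    """Conservative check: every value in [0, max_unsigned] converts exactly."""
    return max_unsigned.bit_length() <= precision


def to_float32(x: float) -> float:
    """Round a Python float (binary64) to binary32 and widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# A 16-bit unsigned value always fits in a float significand, so the
# conversion can target float directly instead of going through double.
print(range_is_exact(0xFFFF, FLOAT_PRECISION))       # True
# A full 32-bit unsigned value does not fit in float, but fits in double.
print(range_is_exact(0xFFFFFFFF, FLOAT_PRECISION))   # False
print(range_is_exact(0xFFFFFFFF, DOUBLE_PRECISION))  # True
# 2**24 + 1 needs 25 significant bits, so binary32 has to round it.
print(to_float32(float(2**24 + 1)) == float(2**24 + 1))  # False
```

When the range check succeeds for float, converting the integer to double and then truncating to float gives the same value as converting it to float directly, which is the property the narrowing in the bullets above relies on.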