Even Faster Asin() Was Staring Right at Me
10 hours ago
- #benchmarking
- #optimization
- #performance
- The author revisited their previous article on performance optimization, this time focusing on making the Cg asin() approximation even faster.
- The polynomial evaluation in the original implementation was rewritten to use Estrin's scheme, which shortens the dependency chain so that independent multiply-adds can execute in parallel (instruction-level parallelism) on modern CPUs.
- Benchmarking was conducted across multiple environments (Intel i7, AMD Ryzen 9, Apple M4) with different compilers (GCC, Clang, MSVC).
- Results showed significant speedups on older Intel chips, minimal impact on AMD, and slight improvements on Apple M4 with Clang.
- A real-world test with a ray tracer showed a 3% speedup on Intel, while the Apple M4 saw negligible changes.
- The author emphasized the importance of benchmarking and of compiler optimizations, and advised against using lookup tables (LUTs) because of their higher error rates.
- The article concluded with a reminder that the method is still an approximation, and encouraged collaboration and reevaluation in search of better solutions.
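
To make the Estrin's scheme point concrete, here is a minimal sketch and not the author's actual code. It assumes the cubic coefficients of the Cg reference asin() as commonly quoted (they match Abramowitz and Stegun 4.4.45); the summary above does not list them. The function names `asin_horner` and `asin_estrin` are hypothetical. The Horner version forms one serial chain of three multiply-adds, while the Estrin version computes two independent halves and `x*x` before combining them.

```cpp
#include <cmath>

// Assumed coefficients of the cubic in the Cg reference asin()
// (Abramowitz & Stegun 4.4.45); not taken from the summary above.
constexpr double kA0 = 1.5707288;
constexpr double kA1 = -0.2121144;
constexpr double kA2 = 0.0742610;
constexpr double kA3 = -0.0187293;
constexpr double kHalfPi = 1.5707963267948966;

// Horner form: each step needs the previous result, so the three
// multiply-adds run strictly one after another.
double asin_horner(double x) {
    const double ax = std::fabs(x);
    double p = kA3;
    p = p * ax + kA2;
    p = p * ax + kA1;
    p = p * ax + kA0;
    const double r = kHalfPi - std::sqrt(1.0 - ax) * p;
    return std::copysign(r, x);  // asin is odd: restore the sign of x
}

// Estrin form: lo, hi, and x2 do not depend on each other, so a
// superscalar CPU can compute them at the same time before the
// final combine.
double asin_estrin(double x) {
    const double ax = std::fabs(x);
    const double x2 = ax * ax;
    const double lo = kA0 + kA1 * ax;
    const double hi = kA2 + kA3 * ax;
    const double p = lo + x2 * hi;
    const double r = kHalfPi - std::sqrt(1.0 - ax) * p;
    return std::copysign(r, x);
}
```

Both functions evaluate the same polynomial, so they should agree up to floating-point rounding; whether the Estrin form is actually faster depends on the CPU and compiler, which is why the benchmarks summarized above matter.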