Even Faster Asin() Was Staring Right at Me

2 months ago

The author revisited their previous article on performance optimization, specifically focusing on improving the Cg asin() approximation.
The polynomial evaluation in the original implementation was rewritten to use Estrin's scheme, which shortens the dependency chain so that modern CPUs can execute the independent multiply-adds in parallel.
Benchmarking was conducted across multiple environments (Intel i7, AMD Ryzen 9, Apple M4) with different compilers (GCC, Clang, MSVC).
Results showed significant speedups on older Intel chips, minimal impact on AMD, and slight improvements on Apple M4 with Clang.
A real-world test with a ray tracer showed a 3% speedup on Intel, while the Apple M4 saw negligible changes.
The author emphasized the importance of benchmarking and of compiler optimizations, and advised against lookup tables (LUTs) because of their higher approximation error.
The article concluded with a reminder that the method is only an approximation, and encouraged collaboration and reevaluation in search of better solutions.
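
To make the Estrin's scheme rewrite concrete, here is a minimal C++ sketch contrasting it with Horner's rule. It is an illustration under stated assumptions, not the author's code: the function names are hypothetical, and the coefficients are those of Abramowitz and Stegun formula 4.4.45, which the Cg asin() reference approximation is based on; the article's exact constants may differ.

```cpp
#include <cmath>

// Assumed coefficients (Abramowitz & Stegun 4.4.45):
// asin(x) ~= pi/2 - sqrt(1 - x) * (a0 + a1*x + a2*x^2 + a3*x^3), for 0 <= x <= 1.
constexpr float kA0 = 1.5707288f;
constexpr float kA1 = -0.2121144f;
constexpr float kA2 = 0.0742610f;
constexpr float kA3 = -0.0187293f;
constexpr float kHalfPi = 1.57079632679f;

// Horner's rule: every multiply-add needs the result of the previous one,
// so the three steps form a single serial dependency chain.
inline float asin_horner(float x) {
    const float ax = std::fabs(x);
    float p = kA3;
    p = p * ax + kA2;
    p = p * ax + kA1;
    p = p * ax + kA0;
    const float r = kHalfPi - std::sqrt(1.0f - ax) * p;
    return std::copysign(r, x);  // asin is odd: restore the sign of x
}

// Estrin's scheme: split the cubic into two linear halves that do not
// depend on each other, then combine them with x^2.
inline float asin_estrin(float x) {
    const float ax = std::fabs(x);
    const float x2 = ax * ax;          // independent of lo and hi
    const float lo = kA0 + kA1 * ax;   // independent of hi and x2
    const float hi = kA2 + kA3 * ax;   // independent of lo and x2
    const float p = lo + hi * x2;      // single combining step
    const float r = kHalfPi - std::sqrt(1.0f - ax) * p;
    return std::copysign(r, x);
}
```

Both functions evaluate the same cubic, so their results should agree to within floating-point rounding. The only difference is the shape of the dependency graph: `x2`, `lo`, and `hi` can be issued together on a superscalar core, whereas the Horner version must wait for each step. Whether that translates into a measurable speedup depends on the CPU and compiler, as the benchmark results above indicate.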
