The Cost of a Closure in C: The Rest
4 months ago
- #C
- #closures
- #performance
- The article revisits performance benchmarks for C and C++ closure implementations, adding new categories for 'Plain C' functions.
- New benchmarking categories include 'Normal Functions', 'Normal Functions (Rosetta Code)', 'Normal Functions (Static)', and 'Normal Functions (Thread Local)', each with distinct performance characteristics.
- Indirect loads (e.g., through pointers) are significantly slower than direct variable access.
- Lambdas (and proposed 'Capture Functions') offer the best performance when type information is preserved, but type erasure in C imposes a performance cost.
- Using 'static' or 'thread_local' variables for context passing incurs performance penalties, with 'thread_local' being notably worse.
- GNU Nested Functions perform poorly, raising questions about their viability as a candidate for ISO C standardization.
- Key takeaways include the superiority of type-preserving closures, the drawbacks of type erasure, and the performance costs of 'static' and 'thread_local'.
- The experimental setup involved 150 repetitions of benchmarks on a MacBook Pro M1, testing 13 categories of C/C++ code.
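
To make the context-passing options in the summary concrete, here is a minimal sketch of three ways a plain C callback can receive its context: an explicit type-erased `void *` argument, a `static` variable, and a `_Thread_local` variable. All names (`sort_explicit`, `sort_static`, `sort_thread_local`, `sort_with_ctx`, and the comparators) are hypothetical and are not taken from the article's benchmark code. The hand-written insertion sort is only an assumption used to avoid the non-portable `qsort_r`. The sketch shows the shape of each approach and measures nothing.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical illustration only; not the article's benchmark code. */

/* Order two ints ascending, or descending when the flag is nonzero. */
static int compare_ints(int x, int y, int descending) {
    int r = (x > y) - (x < y);
    return descending ? -r : r;
}

/* 1. Explicit, type-erased context: the callback takes a void *ctx. */
typedef int (*cmp_ctx_fn)(const void *a, const void *b, void *ctx);

/* Small insertion sort standing in for a context-aware sort routine. */
static void sort_with_ctx(int *data, size_t n, cmp_ctx_fn cmp, void *ctx) {
    for (size_t i = 1; i < n; ++i) {
        int key = data[i];
        size_t j = i;
        while (j > 0 && cmp(&data[j - 1], &key, ctx) > 0) {
            data[j] = data[j - 1];
            --j;
        }
        data[j] = key;
    }
}

static int cmp_explicit(const void *a, const void *b, void *ctx) {
    /* The flag is reached through an erased pointer: an indirect load. */
    int descending = *(const int *)ctx;
    return compare_ints(*(const int *)a, *(const int *)b, descending);
}

void sort_explicit(int *data, size_t n, int descending) {
    sort_with_ctx(data, n, cmp_explicit, &descending);
}

/* 2. Context in a static variable, read by an ordinary qsort comparator. */
static int g_descending;

static int cmp_static(const void *a, const void *b) {
    return compare_ints(*(const int *)a, *(const int *)b, g_descending);
}

void sort_static(int *data, size_t n, int descending) {
    g_descending = descending; /* shared state: not reentrant or thread-safe */
    qsort(data, n, sizeof *data, cmp_static);
}

/* 3. Context in a thread-local variable (C11 _Thread_local). */
static _Thread_local int tl_descending;

static int cmp_thread_local(const void *a, const void *b) {
    return compare_ints(*(const int *)a, *(const int *)b, tl_descending);
}

void sort_thread_local(int *data, size_t n, int descending) {
    tl_descending = descending; /* one copy per thread */
    qsort(data, n, sizeof *data, cmp_thread_local);
}
```

Calling `sort_explicit(values, count, 1)` sorts `values` in descending order, and the other two wrappers take the same arguments. Only the first variant keeps the context out of global state. The other two trade that for compatibility with the standard `qsort` signature, which is the trade-off the `static` and `thread_local` categories in the summary are about.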