Python 3.15's interpreter for Windows should be 15% faster
4 months ago
- #Python
- #Performance Optimization
- #Compiler
- Partial retraction of an earlier apology for Python's tail-calling results, prompted by measured performance improvements on macOS AArch64 (Xcode Clang) and Windows x86-64 (MSVC).
- The tail-calling interpreter for CPython shows a 5% speedup on pyperformance for AArch64 macOS and roughly 15% on Windows with an experimental MSVC version.
- Comparison of interpreter writing methods: switch-case, computed gotos (labels-as-values), and tail-call threaded interpreters, highlighting the benefits of tail-call optimization with modern compilers.
- Introduction of __attribute__((musttail)) in Clang, which guarantees that a marked call is compiled as a tail call; splitting the interpreter into small tail-called functions improves performance by resetting compiler heuristics to sane levels.
- Performance improvements observed in CPython 3.15 with the tail-calling interpreter, especially on Windows x86-64, with speedups ranging from 15% to 40%.
- Explanation of how tail calling helps in inlining simple functions within the interpreter loop, leading to better performance.
- Instructions for building CPython from source with tail-call interpreter support using Visual Studio 2026.
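
To make the comparison of dispatch styles concrete, here is a minimal sketch in C of two of the three approaches mentioned above: a switch-case loop and a tail-call threaded interpreter. The bytecode (OP_PUSH, OP_ADD, OP_MUL, OP_HALT), the function names, and the MUSTTAIL and DISPATCH macros are all hypothetical and chosen for illustration; this is not CPython's actual code. The computed-goto variant is left out because it relies on the GNU labels-as-values extension. Where the compiler does not report support for the musttail attribute, the macro falls back to an ordinary call, which still computes the same result but no longer guarantees constant stack use.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy bytecode (not CPython's). OP_PUSH is followed by one
   immediate byte; every other instruction is a single opcode byte. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Style 1: switch-case dispatch inside one large loop. */
static int64_t run_switch(const uint8_t *pc) {
    int64_t stack[16];
    int64_t *sp = stack;              /* points one past the top of stack */
    for (;;) {
        switch (*pc++) {
        case OP_PUSH: *sp++ = *pc++;          break;
        case OP_ADD:  sp--; sp[-1] += sp[0];  break;
        case OP_MUL:  sp--; sp[-1] *= sp[0];  break;
        case OP_HALT: return sp[-1];
        default:      return -1;      /* unknown opcode */
        }
    }
}

/* Style 2: one small function per opcode, chained with tail calls.
   Use musttail only when the compiler reports support for it. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL                    /* fallback: plain call, stack may grow */
#endif

typedef int64_t (*handler_fn)(const uint8_t *pc, int64_t *sp);

static int64_t op_push(const uint8_t *pc, int64_t *sp);
static int64_t op_add(const uint8_t *pc, int64_t *sp);
static int64_t op_mul(const uint8_t *pc, int64_t *sp);
static int64_t op_halt(const uint8_t *pc, int64_t *sp);

/* Handler table indexed by opcode. This toy trusts the bytecode and does
   not bounds-check the opcode. */
static const handler_fn handlers[] = {
    [OP_PUSH] = op_push, [OP_ADD] = op_add,
    [OP_MUL]  = op_mul,  [OP_HALT] = op_halt,
};

/* Jump to the handler of the instruction that pc now points at. */
#define DISPATCH() MUSTTAIL return handlers[*pc](pc, sp)

static int64_t op_push(const uint8_t *pc, int64_t *sp) {
    *sp++ = pc[1];                    /* push the immediate operand */
    pc += 2;
    DISPATCH();
}

static int64_t op_add(const uint8_t *pc, int64_t *sp) {
    sp--; sp[-1] += sp[0];
    pc += 1;
    DISPATCH();
}

static int64_t op_mul(const uint8_t *pc, int64_t *sp) {
    sp--; sp[-1] *= sp[0];
    pc += 1;
    DISPATCH();
}

static int64_t op_halt(const uint8_t *pc, int64_t *sp) {
    (void)pc;
    return sp[-1];                    /* result is the top of stack */
}

/* Entry point: the value stack lives in this frame, and the first
   dispatch is an ordinary (non-tail) call so the stack stays alive. */
static int64_t run_tail(const uint8_t *pc) {
    int64_t stack[16];
    return handlers[*pc](pc, stack);
}
```

Running both entry points on the program PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT yields 20 from each, since it evaluates (2 + 3) * 4. The point of the second style is that every handler is a small, separate function, so the compiler optimizes each one in isolation instead of reasoning about one very large loop.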