NumKong: 2'000 Mixed Precision Kernels for All
6 hours ago
- #Numerical Computing
- #Open Source
- #SIMD
- NumKong is a large open-source project with over 2,000 SIMD kernels for mixed precision numerics across 7 languages.
- The project includes support for various hardware extensions like RISC-V Vector Extensions, Intel AMX, and Arm SME Tiles.
- NumKong offers high-performance implementations for geospatial calculations (Haversine & Vincenty) and mesh alignment (Kabsch & Umeyama).
- It supports a wide range of numeric types, from BFloat16 and Float16 down to Float6 and Int4/UInt4.
- NumKong provides a WebAssembly SIMD backend for AI sandboxes, edge computing, and browsers.
- The project emphasizes numerical stability and precision, with benchmarks showing competitive performance against NumPy + OpenBLAS and PyTorch + MKL.
- NumKong was designed for USearch but is released for general use, with bindings for 14+ programming languages.
- The article discusses the challenges and state of RISC-V, Intel AMX, and Arm SME in 2026.
- NumKong includes advanced kernel design patterns, such as compensated summation and table lookups for efficient conversions.
- The project avoids hidden allocations and threads, focusing on explicit memory management and parallelism.
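
To make the compensated summation mentioned above concrete, here is a minimal sketch in plain Python. It uses Neumaier's variant of Kahan summation, chosen purely for illustration. This summary does not show NumKong's actual kernels, so the function names `naive_sum` and `neumaier_sum` are hypothetical and none of this is the library's API.

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation; low-order bits are lost on every add."""
    total = 0.0
    for v in values:
        total += v
    return total


def neumaier_sum(values):
    """Compensated summation (Neumaier's variant of Kahan's algorithm).

    A second accumulator collects the rounding error of each addition,
    and that error is added back once at the end.
    """
    total = 0.0
    compensation = 0.0
    for v in values:
        t = total + v
        if abs(total) >= abs(v):
            # Low-order bits of v were lost in the addition; recover them.
            compensation += (total - t) + v
        else:
            # Low-order bits of total were lost instead.
            compensation += (v - t) + total
        total = t
    return total + compensation


# Illustrative input: the large terms cancel, leaving only the two small ones.
values = [1.0, 1e100, 1.0, -1e100]

print(naive_sum(values))     # the two 1.0 terms are absorbed by 1e100
print(neumaier_sum(values))  # the compensation term preserves them
print(math.fsum(values))     # standard-library exactly rounded reference
```

The same idea carries over to low-precision types: the narrower the accumulator, the sooner small terms are absorbed, which is presumably why a mixed-precision library would care about it. A production SIMD kernel would keep one compensation term per vector lane rather than a single scalar.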