Ultrafast machine learning on FPGAs via Kolmogorov-Arnold Networks

3 hours ago

#Machine Learning
#FPGA
#KAN

Master's thesis explores hardware architectures for ultrafast inference and online learning using Kolmogorov-Arnold Networks (KANs) on FPGAs.
KANs replace learnable weights and fixed activation functions in MLPs with learnable univariate activation functions, offering potential improvements in scaling and parameter efficiency.
Fixed-point quantization is used to encode real numbers as bitstrings, enabling neural networks to be implemented as digital logic on FPGAs with minimal approximation error.
KANs are naturally suited for lookup-table neural networks (LUT-NNs) due to their univariate activations, avoiding exponential scaling issues of multivariate LUTs and enabling efficient pruning.
For inference, KAN activations are stored as LUTs on FPGAs, achieving a 2700x speedup over prior implementations and surpassing state-of-the-art FPGA accelerators in latency and resource usage.
Online learning on FPGAs is enabled by storing B-spline basis functions in LUTs and updating coefficients in real-time, leveraging locality and boundedness for stable, sub-microsecond gradient updates.
B-spline locality ensures that only a small subset of basis functions is active for any input, so hardware logic scales with the polynomial order rather than the grid size; a finer grid improves expressivity without added logic.
Stable fixed-point training in KANs is achieved because activations and gradients are bounded within coefficient ranges, reducing quantization error and enhancing learning stability compared to MLPs.
Implementation demonstrates KAN-based online learners can handle over 100,000 parameters with sub-microsecond latency, showing better hardware scaling and convergence on benchmarks like function approximation and quantum control.
Conclusion highlights that KAN properties, such as activations that map directly to LUTs and the locality and boundedness of B-splines, are highly advantageous for custom hardware accelerators, enabling nanosecond-latency inference and efficient real-time learning.
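
The points above on fixed-point encoding, B-spline locality, bounded updates and LUT-based inference can be illustrated with a small software sketch. It is not the thesis's hardware design: the bit widths, the function names, the plain LMS update rule and the power-of-two learning rate are all assumptions chosen for illustration. The sketch models a single KAN activation whose B-spline basis values sit in a small table, updates only the active coefficients per sample, and then tabulates the learned activation for lookup-only inference.

```python
import math

# All widths below are illustrative assumptions, not values from the thesis.
DEGREE = 3       # B-spline polynomial degree
G_BITS = 3       # 2**G_BITS grid intervals
T_BITS = 5       # 2**T_BITS quantized offsets inside one interval
FRAC_BITS = 10   # fractional bits of the fixed-point format
LR_SHIFT = 1     # learning rate 2**-LR_SHIFT, applied as a bit shift
ONE = 1 << FRAC_BITS
C_MAX = 4 * ONE  # coefficients are clamped to [-4.0, 4.0]
N_ADDR = 1 << (G_BITS + T_BITS)


def active_basis(t, degree=DEGREE):
    """Return the degree+1 nonzero uniform B-spline values at offset t in [0, 1)."""
    n = [1.0]
    for j in range(1, degree + 1):
        saved, nxt = 0.0, []
        for r in range(j):
            temp = n[r] / j
            nxt.append(saved + (r + 1 - t) * temp)
            saved = (t + j - r - 1) * temp
        nxt.append(saved)
        n = nxt
    return n


# Basis LUT: its size depends on T_BITS and DEGREE only, never on the grid size.
BASIS_LUT = [[round(b * ONE) for b in active_basis(i / (1 << T_BITS))]
             for i in range(1 << T_BITS)]


def new_coeffs():
    """One fixed-point coefficient per basis function, initialised to zero."""
    return [0] * ((1 << G_BITS) + DEGREE)


def split(addr):
    """High bits pick the grid interval, low bits pick the offset inside it."""
    return addr >> T_BITS, addr & ((1 << T_BITS) - 1)


def forward(coeffs, addr):
    """Evaluate the activation using only the DEGREE+1 active coefficients."""
    j, t = split(addr)
    acc = sum(coeffs[j + r] * BASIS_LUT[t][r] for r in range(DEGREE + 1))
    return acc >> FRAC_BITS


def update(coeffs, addr, target):
    """One LMS step (an assumed rule); touches DEGREE+1 coefficients and clamps them."""
    j, t = split(addr)
    err = target - forward(coeffs, addr)
    for r in range(DEGREE + 1):
        step = (err * BASIS_LUT[t][r]) >> (FRAC_BITS + LR_SHIFT)
        coeffs[j + r] = max(-C_MAX, min(C_MAX, coeffs[j + r] + step))
    return err


def sine_target(addr):
    """Hypothetical training target: one sine period, in fixed point."""
    return round(math.sin(2 * math.pi * addr / N_ADDR) * ONE)


def mean_abs_error(coeffs, target_fn):
    """Mean absolute error over every input address, in fixed-point units."""
    return sum(abs(target_fn(a) - forward(coeffs, a)) for a in range(N_ADDR)) / N_ADDR


def train(coeffs, target_fn, epochs=50):
    """Stream every input address through update() for a number of epochs."""
    for _ in range(epochs):
        for a in range(N_ADDR):
            update(coeffs, a, target_fn(a))
    return coeffs


def bake_lut(coeffs):
    """Tabulate the learned activation so inference is a single lookup."""
    return [forward(coeffs, a) for a in range(N_ADDR)]


if __name__ == "__main__":
    c = new_coeffs()
    before = mean_abs_error(c, sine_target)
    train(c, sine_target)
    print(before / ONE, mean_abs_error(c, sine_target) / ONE)
```

In this sketch, raising G_BITS lengthens only the coefficient list; BASIS_LUT and the loops in forward() and update() stay the same size, which mirrors the claim that logic scales with polynomial order rather than grid size. Because the basis values are non-negative and sum to roughly ONE, the output stays within the clamped coefficient range up to rounding.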
