Compiling a Neural Net to C for a 1,744× speedup
- #game-of-life
- #neural-networks
- #optimization
- A neural network (NN) with logic gates as activation functions was trained to learn Conway's Game of Life update rule, a function of each cell's 3×3 neighborhood.
- The NN was compiled to C, resulting in a 1,744× speedup in inference time.
- The project involved extracting and optimizing a learned logic circuit from the NN, reducing it to a 300-line single-threaded C program.
- Differentiable Logic Cellular Automata (DLCA) combines neural networks with cellular automata, replacing traditional kernel functions with learned NN models.
- Training the NN required careful initialization, including biasing gate weights towards pass-through gates to ensure gradient flow.
- The final C implementation uses bit-parallelism to process 64 cells simultaneously, significantly boosting performance.
- Benchmarks showed the C implementation achieving 24,400 fps compared to the Python/JAX implementation's 14 fps.
- Future directions include optimizing the circuit further with SIMD or GPU compute shaders and exploring larger circuits like fluid simulation.
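The initialization point above (biasing gate weights toward pass-through gates) can be sketched as follows. This is a minimal illustration of a differentiable logic gate, assuming the common relaxation where each gate holds a logit per two-input boolean op, a softmax turns logits into a probability distribution, and the gate outputs the expected value of soft (real-valued) versions of the ops; the function names, the subset of ops shown, and the bias value are illustrative, not the article's exact code.

```python
import numpy as np

# Soft relaxations of a few of the 16 two-input boolean ops.
# On {0,1} inputs these agree with the hard gates; in between they are smooth.
SOFT_OPS = [
    lambda a, b: a * b,              # AND
    lambda a, b: a + b - a * b,      # OR
    lambda a, b: a + b - 2 * a * b,  # XOR
    lambda a, b: a,                  # pass-through A
    lambda a, b: b,                  # pass-through B
    lambda a, b: 1.0 - a,            # NOT A
]
PASS_A = 3  # index of the pass-through gate

def init_weights(rng, bias=3.0):
    """Small random logits, with the pass-through logit boosted so a fresh
    gate mostly forwards its input and gradients can flow through depth."""
    w = 0.1 * rng.standard_normal(len(SOFT_OPS))
    w[PASS_A] += bias
    return w

def soft_gate(w, a, b):
    """Training-time gate: probability-weighted mixture of the soft ops."""
    p = np.exp(w - w.max())
    p /= p.sum()
    return sum(pi * op(a, b) for pi, op in zip(p, SOFT_OPS))

def hard_gate(w, a, b):
    """Inference-time gate: collapse to the single most probable op."""
    op = SOFT_OPS[int(np.argmax(w))]
    return op(a, b)
```

A freshly initialized gate behaves almost like pass-through (`soft_gate(w, 1.0, 0.0)` is close to 1), which is the property the initialization is meant to guarantee; after training, `hard_gate` is what gets extracted into the C circuit.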
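The bit-parallelism point can be illustrated with a sketch (in Python for brevity; the article's version is C). Once every gate is hardened to a concrete boolean op, the circuit is pure bitwise logic, so packing one bit per cell into a 64-bit word lets a single machine operation evaluate that gate for 64 cells at once. The three-input circuit below is hypothetical, chosen only to show the packing trick, and is not the learned Game of Life circuit.

```python
MASK = (1 << 64) - 1  # keep Python ints 64 bits wide, like C's uint64_t

def circuit(a, b, c):
    """Hypothetical hardened circuit: out = (a XOR b) AND NOT c.
    On packed words, this evaluates 64 independent cells with 3 bitwise ops."""
    return ((a ^ b) & ~c) & MASK

def pack(bits):
    """Pack 64 cell values (0/1) into one word; bit i holds cell i."""
    word = 0
    for i, bit in enumerate(bits):
        word |= (bit & 1) << i
    return word

def unpack(word):
    """Recover the 64 per-cell bits from a packed word."""
    return [(word >> i) & 1 for i in range(64)]
```

The payoff is that the per-gate cost is independent of the 64 lanes: the whole grid advances in `gates × words` bitwise operations instead of `gates × cells`, which is where most of the 1,744× speedup over per-cell evaluation comes from.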