Advice to Tenstorrent
a year ago
- #Tenstorrent
- #AI Hardware
- #Programming
- Tenstorrent's advantage lies in greater programmability compared to GPUs.
- Hardware architecture influences model design; lack of exposed programmability ensures failure.
- Avoid piling up abstraction layers; focus on a streamlined runtime, compiler, and frontend.
- The runtime should expose the hardware directly and stay application-agnostic; it should have no notion of specific operations such as ELU.
- Start with the driver and runtime, ensuring they handle compilation, dispatch, and queuing efficiently.
- For the compiler, prioritize memory placement, operation scheduling, and kernel fusion; like the runtime, it should not special-case individual operations such as ELU.
- In the frontend, an operation like ELU should reach performance parity with one like ReLU before it is implemented as a separate operation.
- A simple, effective ELU implementation can be derived from ReLU operations.
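
The last point can be illustrated with a minimal sketch. It assumes the standard ELU definition (x for x > 0, alpha * (exp(x) - 1) otherwise) and uses plain Python scalars; the function names `relu`, `elu`, and `elu_reference` are illustrative and are not taken from any Tenstorrent API.

```python
import math


def relu(x: float) -> float:
    """ReLU primitive: max(0, x)."""
    return max(0.0, x)


def elu(x: float, alpha: float = 1.0) -> float:
    """ELU composed only from relu, exp, and arithmetic.

    For x > 0:  relu(x) = x and relu(1 - exp(x)) = 0, so the result is x.
    For x <= 0: relu(x) = 0 and relu(1 - exp(x)) = 1 - exp(x),
                so the result is alpha * (exp(x) - 1).
    Scalar sketch only: math.exp raises OverflowError for very large x.
    """
    return relu(x) - alpha * relu(1.0 - math.exp(x))


def elu_reference(x: float, alpha: float = 1.0) -> float:
    """Piecewise textbook definition, used here only for comparison."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


if __name__ == "__main__":
    for v in (-3.0, -1.0, 0.0, 0.5, 2.0):
        print(v, elu(v), elu_reference(v))
```

Because the composite form uses only primitives the lower layers already handle, neither the runtime nor the compiler needs an ELU-specific code path; whether it runs as fast as a plain ReLU then depends on how well the compiler fuses those primitives.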