A minimal tensor processing unit (TPU), inspired by Google's TPU
- #hardware-design
- #open-source
- #TPU
- A minimal tensor processing unit (TPU), built from scratch and inspired by Google's TPU v1 and v2.
- Function: Each processing element performs a multiply-accumulate (MAC) operation every clock cycle.
- Data Flow: Weight-stationary systolic design: inputs flow horizontally, partial sums flow vertically, and weights stay fixed in each processing element.
- Architecture: A grid of processing elements (starting at 2x2), with input matrices rotated (skewed) so the right operands arrive at each element on the right cycle.
- Modules: Bias addition, Leaky ReLU, MSE loss, Leaky ReLU derivative, dual-port memory.
- Instruction Set: 94-bit wide ISA controlling data transfer and TPU interaction.
- Setup: Requires cocotb, iverilog, and gtkwave for development and testing.
- Adding a new module involves creating SystemVerilog (SV) source and test files and updating the Makefile.
- Future steps include compiler development and scaling the TPU to larger dimensions.
- The project is open-source and aimed at helping beginners break into hardware design.
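The MAC/data-flow bullets above can be sketched in software. This is a minimal Python model (not the project's actual RTL) of a 2x2 weight-stationary array: weights stay fixed, each row of the input matrix is fed in skewed ("rotated") by one cycle, and every processing element does one MAC per step.

```python
N = 2  # grid dimension; the project starts at 2x2

def systolic_matmul(a, w):
    """Compute a @ w the way an NxN weight-stationary array would.

    a, w: NxN matrices as lists of lists. Row i of `a` is delayed by
    i cycles (the input skew), so each element pair meets exactly once.
    """
    acc = [[0] * N for _ in range(N)]   # partial sums accumulated per PE column
    for t in range(2 * N - 1):          # cycles needed to drain the skewed input
        for i in range(N):              # PE row
            k = t - i                   # element of input row i arriving this cycle
            if 0 <= k < N:
                for j in range(N):      # each column MACs with its fixed weight
                    acc[i][j] += a[i][k] * w[k][j]
    return acc
```

Because each (i, k) pair arrives at exactly one cycle (t = i + k), the accumulated result equals the ordinary matrix product.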
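The support modules listed above implement standard neural-network math. A minimal sketch of that math follows; the leak slope of 0.01 is an assumption here (a common default), and the actual hardware constants may differ.

```python
ALPHA = 0.01  # assumed Leaky ReLU slope; the RTL may use a different constant

def leaky_relu(x):
    # Pass positive values through; scale negative values by the leak slope.
    return x if x > 0 else ALPHA * x

def leaky_relu_deriv(x):
    # Derivative used during backpropagation.
    return 1.0 if x > 0 else ALPHA

def mse_loss(pred, target):
    # Mean of squared differences over all elements.
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
```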