Llama2.c64: a port of llama2.c to the Commodore C64
a year ago
- #Commodore64
- #AI
- #Llama2
- Llama2.c64 is a port of llama2.c to the Commodore C64; it requires a RAM Expansion Unit (REU) with at least 2MB.
- It runs the 260K-parameter tinystories model, simulating the storytelling ability of a three-year-old.
- Setup involves enabling REU, setting size to 2MB, and loading weights.reu.
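The REU setup above can be reproduced from the command line when running under the VICE emulator. This is a sketch only: the flag names are taken from the VICE manual and may vary between versions (check `x64sc -help`), and the `.prg` file name is an assumption, not from the source.

```shell
# Hypothetical VICE launch: enable a 2MB REU and preload weights.reu.
# Verify flag names against your VICE version; the program name is assumed.
x64sc -reu -reusize 2048 -reuimage weights.reu llama2c64.prg
```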
- Commands include 'make build', 'make test', and 'make clean' for building and testing.
- Exomizer is optional for compressing the program for real hardware use.
- Advantages include low power consumption, on-premise inference, and data safety.
- Limitations include slow performance and inability to handle models larger than 8MB.
- Model preprocessing is done with generate-model-files.py, producing tokenizer.bin, config.bin, and weights.reu.
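For context on what generate-model-files.py has to emit: a llama2.c checkpoint begins with its Config struct serialized as seven little-endian int32 fields. A minimal sketch of reading such a header, assuming config.bin follows the same layout (that layout is llama2.c's; whether this port's config.bin matches it exactly is an assumption):

```python
import struct

# llama2.c's Config struct: seven little-endian int32 values, in order.
# Assuming config.bin uses this exact layout.
FIELDS = ("dim", "hidden_dim", "n_layers", "n_heads",
          "n_kv_heads", "vocab_size", "seq_len")

def read_config(raw: bytes) -> dict:
    """Parse the 28-byte header into a field-name -> value dict."""
    values = struct.unpack("<7i", raw[:28])
    return dict(zip(FIELDS, values))
```

The example values below are the published hyperparameters of the stories260K model (dim 64, 5 layers, 8 heads, 512-token vocabulary).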
- The model runs deterministically with temperature=0.0 and supports top-p sampling.
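The two sampling modes mentioned above can be sketched as follows: at temperature 0.0 decoding degenerates to a deterministic argmax, and top-p (nucleus) sampling draws from the smallest set of tokens whose probability mass reaches p. A Python sketch of the general technique, not the port's actual C routine:

```python
import math
import random

def sample(logits, temperature=1.0, top_p=0.9, rng=random):
    """Pick a token index from raw logits."""
    # temperature == 0.0: deterministic greedy decoding (argmax).
    if temperature == 0.0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax with temperature (max-subtracted for numerical stability).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-p: keep the smallest high-probability prefix reaching mass top_p.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Sample within the kept set, renormalized to its own mass.
    r = rng.random() * mass
    cdf = 0.0
    for i in kept:
        cdf += probs[i]
        if r <= cdf:
            return i
    return kept[-1]
```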
- Output tokens appear roughly every 8 minutes; the first token emitted is the start-of-sequence marker.
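At roughly one token every 8 minutes, even a short story takes a long time; a back-of-envelope calculation (the 256-token story length is an assumed example, not from the source):

```python
MINUTES_PER_TOKEN = 8          # approximate rate reported above
tokens = 256                   # assumed length of a short story
total_minutes = tokens * MINUTES_PER_TOKEN
print(total_minutes / 60)      # about 34 hours
```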
- Optimization headroom is limited: most of the runtime is spent in the matrix multiplication routines.
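The hot spot is the matrix-vector product at the heart of every transformer layer. A Python sketch of the same W (d×n) · x operation that dominates llama2.c's runtime (the actual port does this in C over REU-resident weights):

```python
def matmul(xout, x, w, n, d):
    # xout[i] = dot(W[i, :], x); W is stored row-major as a flat list.
    # This O(d*n) inner loop is where nearly all inference time goes.
    for i in range(d):
        acc = 0.0
        for j in range(n):
            acc += w[i * n + j] * x[j]
        xout[i] = acc
```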
- The program is not suitable for chat but can generate short stories.