Hasty Briefs
Llama2.c64: a port of llama2.c to the Commodore C64

a year ago
  • #Commodore64
  • #AI
  • #Llama2
  • Llama2.c64 is a port of llama2.c to the Commodore C64, requiring a RAM Expansion Unit (REU) of at least 2MB.
  • It runs the 260K tinystories model, simulating a 3-year-old's storytelling ability.
  • Setup involves enabling the REU, setting its size to 2MB, and loading weights.reu into it.
  • Commands include 'make build', 'make test', and 'make clean' for building and testing.
  • Exomizer is optional for compressing the program for real hardware use.
  • Advantages include low power consumption, on-premise inference, and data safety.
  • Limitations include slow performance and inability to handle models larger than 8MB.
  • Model preprocessing is done with generate-model-files.py, producing tokenizer.bin, config.bin, and weights.reu.
  • The model runs deterministically with temperature=0.0 and supports top-p sampling.
  • Output tokens appear approximately every 8 minutes, with the first token being a begin-of-sequence marker rather than generated text.
  • Optimization headroom is limited; most inference time is spent in the matrix multiplication functions.
  • The program is not suitable for chat but can generate short stories.
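To make the sampling bullet concrete: llama2.c-style samplers implement top-p (nucleus) sampling by sorting tokens by probability, keeping the smallest prefix whose cumulative mass exceeds p, and drawing from that renormalized prefix. The sketch below follows that scheme; the names (`sample_topp`, `ProbIndex`) are illustrative and not taken from the llama2.c64 source.

```c
#include <stdlib.h>

typedef struct { int index; float prob; } ProbIndex;

/* sort helper: descending by probability */
static int cmp_desc(const void *a, const void *b) {
    float pa = ((const ProbIndex *)a)->prob;
    float pb = ((const ProbIndex *)b)->prob;
    return (pa < pb) - (pa > pb);
}

/* probs: distribution over n tokens; topp: nucleus mass threshold;
   coin: a random number in [0,1) supplied by the caller, which keeps
   the function deterministic and easy to test. */
int sample_topp(const float *probs, int n, float topp, float coin) {
    ProbIndex *sorted = malloc(n * sizeof(ProbIndex));
    for (int i = 0; i < n; i++) {
        sorted[i].index = i;
        sorted[i].prob = probs[i];
    }
    qsort(sorted, n, sizeof(ProbIndex), cmp_desc);

    /* truncate to the nucleus: smallest prefix with cumulative prob > topp */
    float cum = 0.0f;
    int last = n - 1;
    for (int i = 0; i < n; i++) {
        cum += sorted[i].prob;
        if (cum > topp) { last = i; break; }
    }

    /* sample within the truncated prefix, renormalized by cum */
    float r = coin * cum, cdf = 0.0f;
    int pick = sorted[last].index;
    for (int i = 0; i <= last; i++) {
        cdf += sorted[i].prob;
        if (r < cdf) { pick = sorted[i].index; break; }
    }
    free(sorted);
    return pick;
}
```

With temperature=0.0 the sampler never reaches this path at all: the most probable token is taken directly, which is why the output is deterministic.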
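The matrix multiplication that dominates the per-token time is, in llama2.c, a dense W(d,n) times x(n) product repeated for every layer. A minimal sketch of that hot loop is below, assuming the C64 port keeps the same shape; on a 6502, which has no hardware multiply (let alone floating point), every `w * x` term here is expensive, which is why tokens take minutes rather than milliseconds.

```c
/* out[i] = sum_j w[i*n + j] * x[j], for i in 0..d-1 */
void matmul(float *out, const float *x, const float *w, int n, int d) {
    for (int i = 0; i < d; i++) {
        float val = 0.0f;
        for (int j = 0; j < n; j++) {
            val += w[i * n + j] * x[j];
        }
        out[i] = val;
    }
}
```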