microgpt.py

3 months ago

microgpt.py is a minimal implementation of a GPT model in pure Python with no dependencies.
It covers tokenization, the model architecture (embeddings, attention, and MLP blocks), and training with the Adam optimizer.
The model is trained on a dataset of names and can generate new names after training.
The implementation is educational, showing the core concepts of GPT in under 200 lines of code.
Community contributions include ports to other languages (Rust, 65816 Assembly) and interactive visualizers.
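
To make the attention component mentioned above more concrete, here is a minimal sketch of single-head causal self-attention in dependency-free Python. It is not taken from microgpt.py: the function names `softmax` and `causal_attention` are hypothetical, and the sketch assumes the query, key, and value vectors are already given, leaving out the learned projections, multiple heads, and gradient tracking a trainable model would need.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(queries, keys, values):
    """Single-head causal self-attention over lists of vectors.

    Hypothetical helper for illustration: position t attends only to
    positions 0..t, so later tokens never influence earlier ones.
    """
    d = len(queries[0])
    outputs = []
    for t, q in enumerate(queries):
        # Scaled dot-product scores against the current and earlier keys only.
        scores = [
            sum(qi * ki for qi, ki in zip(q, keys[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Weighted sum of the visible value vectors.
        out = [
            sum(w * values[s][j] for s, w in enumerate(weights))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs

# Toy input: three positions with 2-dimensional vectors (made-up values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = causal_attention(x, x, x)
```

The first position can only see itself, so its output equals its own value vector; each later position returns a weighted average of the value vectors up to and including its own.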
