Talkie: a 13B vintage language model from 1930
- #historical text training
- #vintage language model
- #AI generalization
- Introducing talkie-1930-13b, a 13-billion-parameter vintage language model trained exclusively on pre-1931 text to simulate historical perspectives.
- Vintage LMs enable contamination-free studies of AI generalization, prediction of events after the training cutoff, and generation of ideas absent from the training data.
- They also make it possible to study how training-data diversity shapes model behavior, by comparison with modern web-trained models.
- Talkie underperforms its modern counterpart on some benchmarks but shows promise in language understanding and numeracy.
- Challenges include ensuring no post-1930 data leakage, improving OCR quality, and creating era-appropriate post-training data.
- Future plans include scaling talkie to GPT-3 and GPT-3.5 levels and developing multilingual corpora.
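One of the challenges listed above, ensuring no post-1930 data leaks into the corpus, can be approached with a cheap first-pass heuristic: flag any document that mentions vocabulary coined after 1930. The sketch below is a hypothetical illustration, not the project's actual pipeline; the term list and function names are invented for this example, and a real filter would combine a much larger lexicon with date extraction and model-based classification.

```python
# Hypothetical first-pass leakage filter: reject documents containing
# terms that did not exist before 1931. Term list is illustrative only.
ANACHRONISTIC_TERMS = {
    "nylon", "jet engine", "transistor",
    "world war ii", "internet", "television network",
}

def mentions_post_1930_terms(text: str) -> bool:
    """Return True if the text contains any known post-1930 term."""
    lowered = text.lower()
    return any(term in lowered for term in ANACHRONISTIC_TERMS)

def filter_corpus(docs):
    """Keep only documents that pass the anachronism check."""
    return [d for d in docs if not mentions_post_1930_terms(d)]

docs = [
    "The aeroplane crossed the Atlantic in 1927 to great acclaim.",
    "Engineers debated the transistor's role in radio receivers.",
]
print(filter_corpus(docs))  # keeps only the 1927 aeroplane document
```

A substring check like this is intentionally aggressive: false positives (discarding clean pre-1931 text) are cheaper here than false negatives, since a single leaked modern document undermines the contamination-free evaluation the post describes.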