Chess Llama – Training a tiny Llama model to play chess
9 months ago
- #AI
- #Chess
- #Machine Learning
- Chess Llama is a tiny Llama model trained to play chess, inspired by Chess GPT.
- The model is based on the Llama 3 architecture and trained on 3 million games from the Lichess Elite database (2019-2023).
- Games are represented in UCI notation, whereas Chess GPT uses PGN notation.
- Chess Llama has a vocabulary of 1974 tokens, each representing a single move in UCI notation.
- Training details: 5 epochs, batch size 16, 18 hours on an Nvidia L4 GPU via Google Cloud's Vertex AI.
- Model performance: an estimated Elo rating of 1350 to 1400, with 99.1% of generated moves being legal.
- Chess Llama outperforms Stockfish at level 0 but lags behind higher-level Stockfish configurations.
- Interactive demo available via Transformers.js, with adjustable sampling for skill level.
- Future work may involve analyzing how the model tracks board state.
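
To make the one-token-per-move vocabulary concrete, here is a minimal sketch of how such a vocabulary could be built. It is not the author's tokenizer. It assumes the vocabulary is every UCI move string that is geometrically possible on an empty board (queen-like moves, knight moves, and pawn promotions), and the names `enumerate_uci_moves`, `build_vocab`, `encode`, `decode`, and the special tokens are all hypothetical. This enumeration gives 1968 moves, slightly fewer than the 1974 tokens reported above; the remainder is presumably made up of special tokens, which are not specified here.

```python
from itertools import product

FILES = "abcdefgh"
RANKS = "12345678"

# Hypothetical special tokens; the real model's special tokens are not known here.
SPECIAL_TOKENS = ["<pad>", "<s>", "</s>"]


def square(file_idx, rank_idx):
    """Convert 0-based file/rank indices to a square name such as 'e4'."""
    return FILES[file_idx] + RANKS[rank_idx]


def enumerate_uci_moves():
    """Return every UCI move string that some piece could make on an empty board."""
    moves = set()
    for f1, r1, f2, r2 in product(range(8), repeat=4):
        df, dr = abs(f2 - f1), abs(r2 - r1)
        if df == 0 and dr == 0:
            continue  # a move must change squares
        queen_like = df == 0 or dr == 0 or df == dr
        knight_like = {df, dr} == {1, 2}
        if queen_like or knight_like:
            moves.add(square(f1, r1) + square(f2, r2))
        # Promotions: a pawn steps from rank 7 to 8 (White) or rank 2 to 1 (Black),
        # straight ahead or one file sideways when capturing.
        if df <= 1 and (r1, r2) in ((6, 7), (1, 0)):
            for piece in "qrbn":
                moves.add(square(f1, r1) + square(f2, r2) + piece)
    return sorted(moves)


def build_vocab():
    """Map each token (special tokens first, then moves) to an integer id."""
    id_to_token = SPECIAL_TOKENS + enumerate_uci_moves()
    token_to_id = {token: i for i, token in enumerate(id_to_token)}
    return token_to_id, id_to_token


def encode(game, token_to_id):
    """Turn a space-separated UCI game into token ids, one id per move."""
    return [token_to_id[move] for move in game.split()]


def decode(ids, id_to_token):
    """Turn token ids back into a space-separated UCI game."""
    return " ".join(id_to_token[i] for i in ids)


if __name__ == "__main__":
    token_to_id, id_to_token = build_vocab()
    ids = encode("e2e4 e7e5 g1f3", token_to_id)
    print(len(enumerate_uci_moves()), ids, decode(ids, id_to_token))
```

Because every move is a single token, a game of N moves becomes a sequence of exactly N tokens, and castling needs no special handling since UCI writes it as a two-square king move such as `e1g1`.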