Chess Llama – Training a tiny Llama model to play chess

10 months ago

Chess Llama is a tiny Llama model trained to play chess, inspired by Chess GPT.
The model is based on the Llama 3 architecture and trained on 3 million games from the Lichess Elite database (2019-2023).
Games are represented in UCI notation rather than the PGN notation used by Chess GPT.
Chess Llama has a vocabulary of 1974 tokens, each representing a single move in UCI notation.
Training details: 5 epochs, batch size 16, 18 hours on an Nvidia L4 GPU via Google Cloud's Vertex AI.
Model performance: Elo rating between 1350 and 1400, with 99.1% of its moves legal.
Chess Llama outperforms Stockfish at level 0 but lags behind higher-level Stockfish configurations.
Interactive demo available via Transformers.js, with adjustable sampling for skill level.
Future work may involve analyzing how the model tracks board state.
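
To make the move-level vocabulary concrete, here is a minimal Python sketch that enumerates every UCI move string that is geometrically possible on a chessboard and assigns each one an integer id. It is an illustration under stated assumptions, not the project's actual tokenizer: the function names are hypothetical, and the enumeration yields 1968 moves. The summary reports 1974 tokens; the difference may be special tokens, but that is not stated here.

```python
FILES = "abcdefgh"


def square(file_idx, rank_idx):
    """Return the algebraic name of a square, e.g. (4, 1) -> 'e2'."""
    return FILES[file_idx] + str(rank_idx + 1)


def enumerate_uci_moves():
    """Enumerate all geometrically possible UCI move strings (hypothetical helper)."""
    moves = set()
    for f in range(8):
        for r in range(8):
            for tf in range(8):
                for tr in range(8):
                    df, dr = tf - f, tr - r
                    if df == 0 and dr == 0:
                        continue
                    straight = df == 0 or dr == 0
                    diagonal = abs(df) == abs(dr)
                    knight = {abs(df), abs(dr)} == {1, 2}
                    # Queen-like and knight moves cover every non-promotion
                    # move, including castling written as a king move (e1g1).
                    if straight or diagonal or knight:
                        moves.add(square(f, r) + square(tf, tr))
    # Promotions carry a piece suffix in UCI: white from rank 7 to 8,
    # black from rank 2 to 1, moving straight or capturing diagonally.
    for from_rank, to_rank in ((6, 7), (1, 0)):
        for f in range(8):
            for tf in (f - 1, f, f + 1):
                if 0 <= tf < 8:
                    for piece in "qrbn":
                        moves.add(square(f, from_rank) + square(tf, to_rank) + piece)
    return sorted(moves)


def encode_game(game, vocab):
    """Map a space-separated UCI game string to a list of token ids."""
    return [vocab[move] for move in game.split()]


MOVES = enumerate_uci_moves()
VOCAB = {move: idx for idx, move in enumerate(MOVES)}  # assumed id assignment

ids = encode_game("e2e4 e7e5 g1f3", VOCAB)
print(len(MOVES), ids)
```

With one token per move, a game of N plies becomes a sequence of exactly N tokens, which keeps sequences short compared with character-level or PGN-text tokenization.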
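
The adjustable sampling mentioned for the demo can be illustrated with temperature sampling, a common way to trade playing strength for variety. This is a sketch under assumptions: the summary does not say which sampling parameter the demo exposes, the `sample_move` function is hypothetical, and the scores below are made-up values rather than model output.

```python
import math
import random


def sample_move(logits, temperature=1.0, rng=random):
    """Pick a move from a dict of UCI move -> raw score.

    A temperature of 0 always returns the highest-scoring move; higher
    temperatures flatten the distribution and make weaker moves more likely.
    """
    if temperature <= 0:
        return max(logits, key=logits.get)
    moves = list(logits)
    scaled = [logits[m] / temperature for m in moves]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(moves, weights=weights, k=1)[0]


# Made-up scores for three candidate moves (illustrative only).
example_logits = {"e2e4": 2.0, "d2d4": 1.5, "a2a3": -1.0}

strongest = sample_move(example_logits, temperature=0)
varied = sample_move(example_logits, temperature=1.5, rng=random.Random(0))
print(strongest, varied)
```

Because a small share of the model's moves are illegal, a practical interface might also filter the scores down to legal moves before sampling; that step is omitted here.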
