Chess Llama – Training a tiny Llama model to play chess

9 months ago
  • #AI
  • #Chess
  • #Machine Learning
  • Chess Llama is a tiny Llama model trained to play chess, inspired by Chess GPT.
  • The model is based on the Llama 3 architecture and trained on 3 million games from the Lichess Elite database (2019-2023).
  • Games are represented in UCI notation rather than the PGN notation used by Chess GPT.
  • Chess Llama has a vocabulary of 1974 tokens, each representing a single move in UCI notation.
  • Training details: 5 epochs, batch size 16, 18 hours on an Nvidia L4 GPU via Google Cloud's Vertex AI.
  • Model performance: an estimated Elo rating between 1350 and 1400, with 99.1% of its moves being legal.
  • Chess Llama outperforms Stockfish at level 0 but lags behind higher-level Stockfish configurations.
  • Interactive demo available via Transformers.js, with adjustable sampling for skill level.
  • Future work may involve analyzing how the model tracks board state.
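
To make the one-token-per-move idea concrete, here is a minimal sketch of how a UCI move vocabulary could be built. It is not the author's tokenizer: the function names (`all_uci_moves`, `encode`, `decode`) are hypothetical, and the enumeration rule (every queen-like or knight-like from/to pair, plus promotion suffixes) is an assumption. It yields 1968 move strings; the gap to the 1974 tokens in the summary may be special tokens, but the summary does not say.

```python
from itertools import product

FILES = "abcdefgh"
RANKS = "12345678"


def square(f, r):
    """Convert 0-based file/rank indices to a square name such as 'e4'."""
    return FILES[f] + RANKS[r]


def all_uci_moves():
    """Enumerate every geometrically possible move string in UCI notation.

    Assumption: a move is kept if some piece could make it on an empty
    board, i.e. it is queen-like (rank, file, diagonal) or knight-like.
    """
    moves = []
    for f, r, tf, tr in product(range(8), repeat=4):
        df, dr = abs(tf - f), abs(tr - r)
        if (df, dr) == (0, 0):
            continue  # a move must change squares
        queen_like = df == 0 or dr == 0 or df == dr
        knight_like = {df, dr} == {1, 2}
        if not (queen_like or knight_like):
            continue
        base = square(f, r) + square(tf, tr)
        moves.append(base)
        # Promotions: a pawn step from rank 7 to 8 (white) or 2 to 1 (black),
        # straight or one file sideways, with a piece suffix such as 'e7e8q'.
        if df <= 1 and (r, tr) in {(6, 7), (1, 0)}:
            moves.extend(base + piece for piece in "qrbn")
    return moves


MOVES = all_uci_moves()
VOCAB = {move: idx for idx, move in enumerate(MOVES)}


def encode(game):
    """Map a space-separated UCI game string to a list of token ids."""
    return [VOCAB[move] for move in game.split()]


def decode(ids):
    """Map token ids back to a space-separated UCI game string."""
    return " ".join(MOVES[i] for i in ids)


if __name__ == "__main__":
    ids = encode("e2e4 e7e5 g1f3")
    print(len(MOVES), ids, decode(ids))
```

With a vocabulary like this, each move of a game becomes exactly one token, so a 40-move game is an 80-token sequence before any special tokens are added.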