Tiny worlds: A minimal implementation of DeepMind's Genie world model

4 days ago
  • #deep-learning
  • #autoregressive
  • #world-models
  • TinyWorlds is a minimal autoregressive world model based on Google DeepMind's Genie architecture.
  • It is meant to help readers understand how scalable world models work by implementing an autoregressive, unsupervised approach in minimal form.
  • Installation involves cloning the repository, installing dependencies, and setting up environment variables.
  • Training requires downloading datasets and running a training script with a configuration file.
  • Inference involves pulling pretrained checkpoints and running an inference script.
  • A world model maps the current state of an environment to its next state, in effect compressing observations into predictive rules (the "laws" of that environment).
  • TinyWorlds uses discrete tokens for easier dynamics prediction and consists of three modules: Video Tokenizer, Action Tokenizer, and Dynamics Model.
  • Space-Time Transformer (STT) is used for video processing with spatial and temporal attention layers.
  • Variational Autoencoders (VAEs) are used for quantization and tokenization.
  • The Action Tokenizer infers the action taken between consecutive frames without any prior action data.
  • The Dynamics Model predicts future frames from past tokens and actions.
  • Data is processed into .h5 files; available datasets include PicoDoom, Pong, Zelda, and more.
  • Supports PyTorch features such as torch.compile, DistributedDataParallel (DDP), automatic mixed precision (AMP), and TF32 for accelerated training and inference.
  • Future improvements include Mixture of Experts, new optimizers, and scaling to more GPUs.
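
To make the "discrete tokens" bullet concrete, here is a minimal sketch of nearest-codebook quantization, the lookup used in VQ-VAE-style tokenizers. It is an illustration only: the `quantize` function and the tiny 2-D codebook are hypothetical, and the quantizer TinyWorlds actually uses may differ.

```python
def quantize(vectors, codebook):
    """Map each continuous vector to the index of its nearest codebook entry."""
    tokens = []
    for v in vectors:
        # Squared Euclidean distance from v to every codebook entry
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook]
        tokens.append(dists.index(min(dists)))
    return tokens


# Hypothetical 4-entry codebook over 2-D latents (made-up values)
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
latents = [(0.1, -0.2), (0.9, 0.8), (0.2, 0.7)]

tokens = quantize(latents, codebook)
print(tokens)  # [0, 3, 2]
```

Once frames are reduced to integer ids like these, predicting the next frame becomes a classification problem over a finite vocabulary rather than a regression over raw pixels.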
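
The Space-Time Transformer bullet can be sketched as a single block that alternates spatial and temporal attention. This is a rough sketch assuming PyTorch; the class name `SpaceTimeBlock`, the tensor layout, the causal temporal mask, and the omission of a feed-forward layer are choices made for this example, not details taken from the TinyWorlds code.

```python
import torch
import torch.nn as nn


class SpaceTimeBlock(nn.Module):
    """One block of factorized attention: within-frame, then across frames."""

    def __init__(self, dim, heads):
        super().__init__()
        self.spatial = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.temporal = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm_s = nn.LayerNorm(dim)
        self.norm_t = nn.LayerNorm(dim)

    def forward(self, x):
        # x: (batch, time, patches, dim)
        b, t, p, d = x.shape

        # Spatial attention: patches of the same frame attend to each other
        s = self.norm_s(x).reshape(b * t, p, d)
        s, _ = self.spatial(s, s, s, need_weights=False)
        x = x + s.reshape(b, t, p, d)

        # Temporal attention: each patch position attends over its own history
        h = self.norm_t(x).permute(0, 2, 1, 3).reshape(b * p, t, d)
        # Boolean mask: True marks positions that may NOT be attended (future frames)
        causal = torch.triu(
            torch.ones(t, t, dtype=torch.bool, device=x.device), diagonal=1
        )
        h, _ = self.temporal(h, h, h, attn_mask=causal, need_weights=False)
        x = x + h.reshape(b, p, t, d).permute(0, 2, 1, 3)
        return x


block = SpaceTimeBlock(dim=8, heads=2)
video = torch.randn(2, 3, 4, 8)  # 2 clips, 3 frames, 4 patches, 8 channels
out = block(video)
```

Factorizing attention this way keeps each attention call small: spatial attention sees only the patches of one frame, and temporal attention sees only the frames of one patch position, instead of every token attending to every other token in the clip.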
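
How the Action Tokenizer and Dynamics Model fit together can be shown with a toy stand-in. In the sketch below a "frame" is a one-hot list marking an object's position, and both functions are hand-written rules rather than learned networks; all names (`infer_action`, `predict_next`, `rollout`) are hypothetical and only mirror the interfaces described in the bullets above.

```python
def infer_action(prev_tokens, next_tokens):
    """Toy 'action tokenizer': label the change between two frames with a discrete id."""
    shift = next_tokens.index(1) - prev_tokens.index(1)
    return {-1: 0, 0: 1, 1: 2}[shift]  # 0 = left, 1 = stay, 2 = right


def predict_next(tokens, action):
    """Toy 'dynamics model': next frame tokens from current tokens and an action id."""
    pos = tokens.index(1) + (action - 1)
    pos = max(0, min(len(tokens) - 1, pos))  # clamp at the edges of the world
    out = [0] * len(tokens)
    out[pos] = 1
    return out


def rollout(start_tokens, actions):
    """Autoregressive generation: each predicted frame is fed back in as input."""
    frames = [start_tokens]
    for action in actions:
        frames.append(predict_next(frames[-1], action))
    return frames


# Actions are recovered from unlabeled frame pairs, then reused to drive a rollout
recovered = infer_action([0, 1, 0, 0], [0, 0, 1, 0])
frames = rollout([0, 1, 0, 0], [recovered, 2, 0])
print(recovered)  # 2
print(frames)     # [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
```

In the real model both functions would be trained networks operating on video tokens; the point here is only the data flow, where actions are inferred from pairs of frames and then condition the prediction of the next frame.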
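
The acceleration features listed above are standard PyTorch switches. The following sketch shows one way they are commonly enabled; the helper names (`enable_tf32`, `prepare`, `train_step`) and the choice of bfloat16 for autocast are assumptions made for this example and do not reflect TinyWorlds' actual training script or configuration keys.

```python
import torch
import torch.distributed as dist
import torch.nn as nn


def enable_tf32():
    # Allow TF32 matmuls and convolutions on GPUs that support it (ignored on CPU)
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.allow_tf32 = True


def prepare(model, use_compile=False):
    # DDP only applies when a process group has already been initialized (e.g. by torchrun)
    if dist.is_available() and dist.is_initialized():
        model = nn.parallel.DistributedDataParallel(model)
    if use_compile:
        model = torch.compile(model)  # compiles lazily on the first forward pass
    return model


def train_step(model, optimizer, x, y, device_type="cpu"):
    optimizer.zero_grad()
    # AMP: run the forward pass in bfloat16 where autocast considers it safe
    with torch.autocast(device_type=device_type, dtype=torch.bfloat16):
        pred = model(x)
    # Compute the loss in float32 outside the autocast region
    loss = nn.functional.mse_loss(pred.float(), y)
    loss.backward()
    optimizer.step()
    return loss.item()


enable_tf32()
model = prepare(nn.Linear(4, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss = train_step(model, optimizer, torch.randn(8, 4), torch.randn(8, 1))
```

With float16 instead of bfloat16, a gradient scaler would normally be added to guard against underflow; bfloat16 is used here to keep the sketch short.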