Hasty Briefs (beta)

GT: [experimental] multiplexing tensor framework

5 months ago
  • #gpu
  • #machine-learning
  • #distributed-computing
  • GT is an experimental multiplexing tensor framework for distributed GPU computing.
  • It rejects the clunky lock-step paradigm used in ML research, embracing dynamic scheduling and asynchronous execution.
  • GT consists of three components: clients (users), dispatcher (coordinator), and workers (one per GPU).
  • Clients emit pure functional instructions, which the dispatcher rewrites to be GPU-aware and sends to workers.
  • Workers process instructions asynchronously, optionally JIT-compiling them.
  • Instruction streams are annotated with signals for sharding and hot paths for JIT hints.
  • YAML configs supplement the annotations for sharding and compilation; the annotations are hints, so a runtime can safely ignore them.
  • GT automatically spins up an asynchronous dispatching server and GPU worker in the background.
  • Features include high-performance transport (ZeroMQ), autograd support, PyTorch-compatible API, and signal-based sharding.
  • Additional features: real-time monitoring, instruction logging, AI-assisted development, and comprehensive documentation.
  • Installation is via pip, and usage includes auto-server mode, tensor operations, autograd, and signal-based sharding.
  • Examples demonstrate basic tensor operations, signal-based sharding, compilation directives, debug utilities, and visualization.
  • GT is designed for simplicity, readability, and collaboration with AI coding assistants.
  • Contributions are welcome, with detailed guidelines provided.
  • Licensed under MIT.
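
The summary says YAML configs supplement the in-stream annotations for sharding and compilation. A config of that kind might look like the fragment below; every key name here is an illustrative guess, not GT's documented schema.

```yaml
# Hypothetical GT config fragment -- field names are illustrative only.
sharding:
  strategy: signal          # honor signal annotations in the instruction stream
  default_axis: 0           # fallback shard axis when no signal is present
compilation:
  jit: auto                 # let workers JIT-compile annotated hot paths
workers:
  per_gpu: 1                # the summary states one worker per GPU
```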
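
The client → dispatcher → worker pipeline described above can be sketched conceptually. The following is a minimal stand-in using Python threads and in-process queues, not GT's actual implementation (GT uses ZeroMQ transport and real GPU workers); all names and the routing rule are illustrative.

```python
# Conceptual sketch of a client -> dispatcher -> worker pipeline.
# NOT GT's implementation; names, routing, and "execution" are illustrative.
import queue
import threading

def dispatcher(inbox, worker_inboxes):
    """Rewrite client instructions to be worker-aware and route them."""
    while True:
        instr = inbox.get()
        if instr is None:  # shutdown signal: propagate to all workers
            for w in worker_inboxes:
                w.put(None)
            return
        # Trivial "GPU-aware rewrite": tag the instruction with a target worker.
        target = instr["tensor_id"] % len(worker_inboxes)
        worker_inboxes[target].put({**instr, "worker": target})

def worker(inbox, results):
    """Process instructions asynchronously; a real worker runs them on a GPU."""
    while True:
        instr = inbox.get()
        if instr is None:
            return
        # Pretend to execute the instruction: double the payload.
        results.put((instr["tensor_id"], instr["payload"] * 2))

inbox = queue.Queue()
worker_inboxes = [queue.Queue(), queue.Queue()]  # "one worker per GPU"
results = queue.Queue()

threads = [threading.Thread(target=dispatcher, args=(inbox, worker_inboxes))]
threads += [threading.Thread(target=worker, args=(w, results)) for w in worker_inboxes]
for t in threads:
    t.start()

# The "client" emits pure functional instructions.
for i in range(4):
    inbox.put({"tensor_id": i, "payload": i})
inbox.put(None)
for t in threads:
    t.join()

out = sorted(results.queue)
print(out)  # [(0, 0), (1, 2), (2, 4), (3, 6)]
```

Because clients only enqueue instructions and never block on workers, this shape avoids the lock-step pattern the summary criticizes: workers drain their queues at their own pace.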
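
The autograd support in the feature list follows the same reverse-mode idea as PyTorch. A toy scalar version (purely illustrative, not GT's code or API) shows the mechanism: operations record a backward function, and `backward()` replays them in reverse topological order.

```python
# Toy reverse-mode autograd on scalars, illustrating the idea behind
# GT's autograd support. NOT GT's actual API.
class Value:
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents
        self.backward_fn = None

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward_fn():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out.backward_fn = backward_fn
        return out

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward_fn():
            self.grad += out.grad
            other.grad += out.grad
        out.backward_fn = backward_fn
        return out

    def backward(self):
        # Topologically order the graph, then propagate gradients backward.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v.parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            if v.backward_fn:
                v.backward_fn()

x = Value(3.0)
y = Value(4.0)
z = x * y + x  # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # 5.0 3.0
```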