GT: [experimental] multiplexing tensor framework
- #gpu
- #machine-learning
- #distributed-computing
- GT is an experimental multiplexing tensor framework for distributed GPU computing.
- It rejects the clunky lock-step paradigm used in ML research, embracing dynamic scheduling and asynchronous execution.
- GT consists of three components: clients (users), dispatcher (coordinator), and workers (one per GPU).
- Clients emit pure functional instructions, which the dispatcher rewrites to be GPU-aware and sends to workers.
- Workers asynchronously process instructions, optionally JIT compiling.
- Instruction streams carry annotations: sharding signals and hot-path markers that serve as JIT hints.
- YAML configs supplement these annotations for sharding and compilation; annotations are optional hints that the runtime can safely ignore.
- GT automatically spins up an asynchronous dispatching server and GPU worker in the background.
- Features include high-performance transport (ZeroMQ), autograd support, PyTorch-compatible API, and signal-based sharding.
- Additional features: real-time monitoring, instruction logging, AI-assisted development, and comprehensive documentation.
- Installation is via pip, and usage includes auto-server mode, tensor operations, autograd, and signal-based sharding.
- Examples demonstrate basic tensor operations, signal-based sharding, compilation directives, debug utilities, and visualization.
- GT is designed for simplicity, readability, and collaboration with AI coding assistants.
- Contributions are welcome, with detailed guidelines provided.
- Licensed under MIT.
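The client → dispatcher → worker pipeline described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not GT's actual wire protocol: the `Instruction` record, the `annotations` dict, and the round-robin "GPU-aware rewrite" are all hypothetical names invented for the sketch.

```python
import queue
import threading
from dataclasses import dataclass, field

# Hypothetical instruction record: pure functional (no in-place ops),
# carrying optional annotations the dispatcher may rewrite or ignore.
@dataclass
class Instruction:
    op: str                      # e.g. "add", "mul"
    args: tuple
    annotations: dict = field(default_factory=dict)

def dispatcher(instrs, n_workers):
    """Rewrite instructions to be GPU-aware (here: just tag a worker id)
    and route them to per-worker queues, round-robin."""
    queues = [queue.Queue() for _ in range(n_workers)]
    for i, ins in enumerate(instrs):
        ins.annotations["device"] = i % n_workers  # toy "GPU-aware" rewrite
        queues[i % n_workers].put(ins)
    for q in queues:
        q.put(None)  # sentinel: instruction stream is done
    return queues

def worker(q, results):
    """Asynchronously drain the queue and execute instructions."""
    ops = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
    while (ins := q.get()) is not None:
        results.append((ins.annotations["device"], ops[ins.op](*ins.args)))

instrs = [Instruction("add", (1, 2)), Instruction("mul", (3, 4))]
queues = dispatcher(instrs, n_workers=2)
results = []
threads = [threading.Thread(target=worker, args=(q, results)) for q in queues]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # [(0, 3), (1, 12)]
```

In GT itself the transport between these roles is ZeroMQ rather than in-process queues, and workers may additionally JIT-compile the instructions they receive.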
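Signal-based sharding, as described in the notes, can be illustrated with a toy reduction: a shard signal tells the dispatcher to split an input's leading dimension across workers, each worker computes a partial result, and the partials are combined. The function names and chunking scheme are assumptions for illustration; GT's real signal syntax is not shown in the note.

```python
def shard(data, n_workers):
    """Split data into n_workers contiguous chunks (last chunk may be short)."""
    k = -(-len(data) // n_workers)  # ceiling division
    return [data[i * k:(i + 1) * k] for i in range(n_workers)]

def sharded_sum(data, n_workers):
    """Each 'worker' reduces its shard; the partial results are combined."""
    partials = [sum(chunk) for chunk in shard(data, n_workers)]
    return sum(partials)

print(shard(list(range(10)), 3))        # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(sharded_sum(list(range(10)), 3))  # 45
```

Because the instructions are pure functional, a rewrite like this is safe for the dispatcher to apply (or skip) without changing the result, which is what makes the annotations ignorable hints rather than requirements.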
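The autograd support listed among the features can be sketched with a minimal reverse-mode autodiff class (micrograd-style). This is a generic illustration of the technique, not GT's implementation; the `Value` class and its methods are invented for the sketch.

```python
class Value:
    """Scalar node in a computation graph with reverse-mode gradients."""
    def __init__(self, data, parents=()):
        self.data, self.grad = data, 0.0
        self._parents, self._backward = parents, lambda: None

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = backward
        return out

    def backward(self):
        # Topologically order the graph, then propagate from the output.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

x, y = Value(2.0), Value(3.0)
z = x * y + x            # z = x*y + x, so dz/dx = y + 1, dz/dy = x
z.backward()
print(x.grad, y.grad)    # 4.0 2.0
```

A PyTorch-compatible API would expose the same idea through tensor `.backward()` calls rather than a hand-rolled scalar graph.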