Event Tensor: A Unified Abstraction for Compiling Dynamic Megakernel
- #Dynamic Megakernels
- #GPU Computing
- #Compiler Abstraction
- Introduces Event Tensor, a unified compiler abstraction for dynamic megakernels.
- Targets kernel launch overhead and coarse-grained synchronization in GPU workloads such as LLM inference.
- Encodes dependencies between tiled tasks to support both shape-dependent and data-dependent dynamism.
- Event Tensor Compiler (ETC) applies static and dynamic scheduling to generate high-performance persistent kernels.
- Evaluation shows ETC achieves state-of-the-art LLM serving latency and reduces system warmup overhead.
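
To make the idea of encoding dependencies between tiled tasks concrete, here is a minimal CPU-side sketch in Python. It is an illustration under stated assumptions, not the Event Tensor or ETC API: the names `EventTensor`, `wait`, `notify`, and `run` are hypothetical, and a single loop over a ready queue stands in for a persistent GPU kernel. Each event is a counter indexed by a tile coordinate, and a consumer tile is released only after all of its producer tiles have signalled. The number of rows is passed in at run time to mimic shape-dependent dynamism.

```python
from collections import deque
from itertools import product


class EventTensor:
    """Hypothetical sketch: a grid of countdown events, one per tile index."""

    def __init__(self, shape, expected):
        # Every event starts with `expected` outstanding notifications.
        indices = list(product(*(range(n) for n in shape)))
        self.remaining = {idx: expected for idx in indices}
        self.waiters = {idx: [] for idx in indices}

    def wait(self, idx, task):
        """Register `task` to be released when event `idx` completes."""
        self.waiters[idx].append(task)

    def notify(self, idx):
        """Record one arrival; return the tasks released by this arrival."""
        self.remaining[idx] -= 1
        if self.remaining[idx] == 0:
            return self.waiters.pop(idx)
        return []


def run(num_rows, k_splits):
    """Simulate one persistent worker draining a ready queue.

    Each row has `k_splits` producer tiles and one consumer tile that
    depends on all of them. `num_rows` is only known at call time.
    """
    events = EventTensor((num_rows,), expected=k_splits)
    ready = deque()
    log = []

    for i in range(num_rows):
        events.wait((i,), ("consume", i))
        for k in range(k_splits):
            ready.append(("produce", i, k))

    # The loop plays the role of a kernel that stays resident and keeps
    # pulling tasks instead of being relaunched for every stage.
    while ready:
        task = ready.popleft()
        log.append(task)
        if task[0] == "produce":
            ready.extend(events.notify((task[1],)))
    return log


if __name__ == "__main__":
    for task in run(num_rows=2, k_splits=3):
        print(task)
```

In this sketch the consumer for a row can start as soon as that row's own event completes, without waiting for a barrier across all rows, which is the kind of fine-grained synchronization the summary contrasts with coarse, per-kernel synchronization.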