Hasty Briefs (beta)

APEX GPU: Run CUDA Apps on AMD GPUs Without Recompilation

7 days ago
  • #CUDA
  • #AMD
  • #GPU-Computing
  • APEX GPU runs unmodified CUDA applications on AMD GPUs by loading its libraries through LD_PRELOAD, so no recompilation is needed.
  • It translates CUDA calls to AMD equivalents at runtime, covering core operations like memory management, streams, events, and kernels.
  • Supports 38 CUDA functions, 15+ cuBLAS operations, and 8+ cuDNN operations for neural networks.
  • Requires an AMD GPU (RDNA2/RDNA3 or CDNA series) with ROCm 5.0+ on Linux.
  • Reported overhead is minimal (<1% for typical workloads), and the project describes itself as production-ready with a 100% test pass rate.
  • Includes bridges for HIP, cuBLAS, and cuDNN, each with a small footprint (40 KB, 22 KB, and 31 KB, respectively).
  • Works with popular frameworks like PyTorch and TensorFlow without code changes.
  • Licensed under CC BY-NC-SA 4.0 for non-commercial use; commercial licenses available upon request.
  • Future roadmap includes support for CUDA Driver API, unified memory, and performance profiling tools.
  • Encourages community contributions for testing, adding missing functions, and improving documentation.
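
The LD_PRELOAD mechanism mentioned above can be sketched as a small Python launcher that prepends the bridge libraries to the environment before starting an unmodified CUDA program. This is a minimal sketch and not the project's own tooling: the helper names (`preload_env`, `run_with_bridges`) and the `.so` file names are hypothetical placeholders, so substitute whatever files the APEX GPU build actually produces.

```python
import os
import subprocess
from typing import Mapping, Optional, Sequence

# Hypothetical file names for the HIP, cuBLAS, and cuDNN bridges (assumption,
# not taken from the project); replace with the real build outputs.
BRIDGES = [
    "./libapex_hip_bridge.so",
    "./libapex_cublas_bridge.so",
    "./libapex_cudnn_bridge.so",
]


def preload_env(bridges: Sequence[str],
                base: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with `bridges` prepended to LD_PRELOAD."""
    env = dict(os.environ if base is None else base)
    existing = env.get("LD_PRELOAD", "")
    # The dynamic linker accepts a colon- or space-separated list, so
    # normalize any existing value to colons and drop empty entries.
    kept = [p for p in existing.replace(" ", ":").split(":") if p]
    env["LD_PRELOAD"] = ":".join(list(bridges) + kept)
    return env


def run_with_bridges(cmd: Sequence[str],
                     bridges: Sequence[str] = BRIDGES) -> int:
    """Run `cmd` with the bridges preloaded and return its exit code."""
    return subprocess.run(list(cmd), env=preload_env(bridges)).returncode
```

Calling `run_with_bridges(["./my_cuda_app"])` would be roughly equivalent to prefixing the command with `LD_PRELOAD=...` in a shell, where `./my_cuda_app` stands in for any existing CUDA binary. Putting the bridges ahead of any existing LD_PRELOAD entries lets their symbols be resolved first.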