CUDA Ontology
- #GPU Computing
- #CUDA
- #Version Compatibility
- CUDA terminology is overloaded, referring to multiple distinct concepts like architecture, instruction set, source language, toolkit, and runtime.
- The term 'kernel' in CUDA can mean either the operating system kernel (OS kernel) or a function that runs on the GPU (CUDA kernel).
- The term 'driver' in CUDA refers to either the NVIDIA GPU driver (kernel-space) or the CUDA Driver API (user-space).
- CUDA's ecosystem is layered: libcudart (Runtime API) and libcuda (Driver API) live in user space, nvidia.ko (GPU driver) lives in kernel space, and calls flow from the Runtime API through the Driver API down to the driver.
- Versioning in CUDA involves multiple independent schemes: compute capability (hardware), GPU driver version, CUDA Toolkit version, Runtime API version, and Driver API version.
- CUDA maintains forward compatibility: an application built against an older toolkit runs on a newer driver (e.g. a CUDA 11 binary on a CUDA 12 driver), but not the reverse; a binary that requires a newer Runtime API than the installed driver provides will fail.
- For successful execution, CUDA requires: (1) Driver API version ≥ Runtime API version, and (2) GPU code availability, meaning SASS for the exact architecture or PTX the driver can JIT-compile for it.
- Common failure modes include cudaErrorInsufficientDriver (installed driver older than the Runtime API requires) and cudaErrorNoKernelImageForDevice (no SASS and no PTX usable for the target GPU).
- Tools like nvidia-smi, nvcc, and torch.version.cuda report different version numbers: nvidia-smi shows the highest CUDA version the installed driver supports, nvcc --version shows the toolkit version, and torch.version.cuda shows the toolkit PyTorch was built against.
- Practical guidelines include specifying minimum driver versions, bundling runtime libraries, and compiling for multiple compute capabilities.
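The ordering requirement in rule (1) above can be sketched using the integer encoding the real API calls (cudaDriverGetVersion, cudaRuntimeGetVersion) return: 1000 × major + 10 × minor, so 12.2 is 12020. A minimal Python sketch with hypothetical example values rather than a live query:

```python
def decode(v: int) -> tuple[int, int]:
    """Decode CUDA's integer version encoding (1000*major + 10*minor)."""
    return v // 1000, (v % 1000) // 10

def driver_supports_runtime(driver_api: int, runtime_api: int) -> bool:
    """Rule (1): the Driver API version must be >= the Runtime API version."""
    return driver_api >= runtime_api

# Hypothetical values: driver supports up to 12.2, app linked against runtime 11.8.
driver_api, runtime_api = 12020, 11080
print(decode(driver_api))                                # (12, 2)
print(driver_supports_runtime(driver_api, runtime_api))  # True: forward compatible
print(driver_supports_runtime(11040, 12000))             # False: insufficient driver
```

In a real program the two integers would come from cudaDriverGetVersion and cudaRuntimeGetVersion; the comparison itself is exactly this simple.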
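The two failure modes map directly onto the two execution requirements, which suggests a simple diagnostic checklist. A hedged sketch (the error names mirror the real CUDA error codes, but the inputs here are plain data, not live device queries):

```python
def diagnose(driver_api: int, runtime_api: int,
             embedded_sass: set[str], gpu_arch: str,
             has_ptx: bool) -> str:
    """Decide which common CUDA failure mode, if any, applies."""
    if driver_api < runtime_api:
        # Rule (1) violated: driver too old for the linked runtime.
        return "cudaErrorInsufficientDriver"
    if gpu_arch not in embedded_sass and not has_ptx:
        # Rule (2) violated: no SASS for this GPU and no PTX to JIT-compile.
        return "cudaErrorNoKernelImageForDevice"
    return "ok"

# Binary built only for sm_70/sm_80 SASS, no PTX, running on an sm_90 GPU:
print(diagnose(12020, 11080, {"sm_70", "sm_80"}, "sm_90", has_ptx=False))
```

Embedding PTX alongside SASS is what rescues the second case: the driver can JIT-compile PTX for architectures newer than any SASS in the binary.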
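The last guideline, compiling for multiple compute capabilities, is typically done with nvcc -gencode flags. A build-line sketch (the architectures chosen here are illustrative, not a recommendation):

```shell
# Embed SASS for sm_70 and sm_80, plus PTX for compute_80 so newer GPUs
# (e.g. sm_90) can be served by JIT compilation at load time instead of
# failing with cudaErrorNoKernelImageForDevice.
nvcc -gencode arch=compute_70,code=sm_70 \
     -gencode arch=compute_80,code=sm_80 \
     -gencode arch=compute_80,code=compute_80 \
     -o app app.cu
```

The third -gencode entry (code=compute_80) is the one that embeds PTX; the first two embed architecture-specific SASS.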