When does fragmentation occur in the CUDA caching allocator?

3 days ago
#Fragmentation #CUDA #Memory Allocation

  • Fragmentation occurs in the CUDA caching allocator when the allocation pattern prevents free memory blocks from merging, so a request cannot be served from cached memory even though the total free memory would be enough.
  • Without expandable segments, every request that cannot be served from a cached block triggers its own cudaMalloc call. Each call creates an independent segment, and free blocks cannot merge across segment boundaries.
  • Expandable segments use CUDA's virtual memory API to create contiguous virtual address ranges, allowing blocks within a segment to merge when freed, reducing fragmentation.
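  • Allocation order matters without expandable segments: allocating small blocks first and large blocks later can strand the freed small segments, because they cannot be combined to serve a larger request. Allocating large blocks first is more efficient, since a freed large block can be split to serve later small requests.
  • Expandable segments mitigate fragmentation by letting adjacent freed blocks merge into one contiguous free range, but fragmentation can still occur when a live allocation sits between free blocks and blocks the merge.
  • A 1 MiB threshold separates the small and large block pools: requests of 1 MiB or less are served from the small pool, larger requests from the large pool. Free memory in one pool cannot serve requests for the other, even with expandable segments.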
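
The points above can be illustrated with a small simulation. This is a minimal sketch, not the allocator's actual code: `ToyAllocator`, its methods, and the three scenario functions are hypothetical names invented for illustration. Sizes are abstract units, and the model assumes away size rounding, the small/large pool split, and the release of cached segments under memory pressure. It only tracks how much memory must be reserved from the driver when free blocks may merge within a segment but never across segments.

```python
class ToyAllocator:
    """Toy caching allocator (hypothetical); sizes are abstract units."""

    def __init__(self, expandable):
        self.expandable = expandable
        self.reserved = 0   # units obtained from the simulated driver
        self.cached = []    # free blocks as (segment, offset, size)
        self.live = {}      # handle -> (segment, offset, size)
        self._segments = 0
        self._handles = 0

    def malloc(self, size):
        fits = [b for b in self.cached if b[2] >= size]
        if fits:
            # Reuse the smallest cached block that fits; split off the rest.
            seg, off, blk = min(fits, key=lambda b: b[2])
            self.cached.remove((seg, off, blk))
            if blk > size:
                self.cached.append((seg, off + size, blk - size))
        elif self.expandable:
            # Grow the single segment, absorbing a free block at its tail.
            tail = [b for b in self.cached if b[1] + b[2] == self.reserved]
            seg, off = 0, self.reserved
            if tail:
                self.cached.remove(tail[0])
                off = tail[0][1]
            self.reserved = off + size
        else:
            # Stand-in for a fresh cudaMalloc: a new, independent segment.
            seg, off = self._segments, 0
            self._segments += 1
            self.reserved += size
        self._handles += 1
        self.live[self._handles] = (seg, off, size)
        return self._handles

    def free_block(self, handle):
        seg, off, size = self.live.pop(handle)
        # Merge with free neighbours, but only inside the same segment.
        for other in list(self.cached):
            o_seg, o_off, o_size = other
            if o_seg != seg:
                continue
            if o_off + o_size == off:      # free neighbour on the left
                self.cached.remove(other)
                off, size = o_off, size + o_size
            elif off + size == o_off:      # free neighbour on the right
                self.cached.remove(other)
                size += o_size
        self.cached.append((seg, off, size))


def small_then_large(expandable):
    a = ToyAllocator(expandable)
    handles = [a.malloc(2) for _ in range(4)]
    for h in handles:
        a.free_block(h)
    a.malloc(8)              # needs 8 contiguous units
    return a.reserved


def large_then_small(expandable):
    a = ToyAllocator(expandable)
    a.free_block(a.malloc(8))
    for _ in range(4):
        a.malloc(2)          # carved out of the freed 8-unit block
    return a.reserved


def live_block_in_the_middle():
    a = ToyAllocator(expandable=True)
    h = [a.malloc(2) for _ in range(4)]
    a.free_block(h[0])
    a.free_block(h[2])       # h[1] stays live between the two free blocks
    a.malloc(4)              # 4 units are free, but not contiguous
    return a.reserved


print(small_then_large(False), small_then_large(True))   # 16 8
print(large_then_small(False), large_then_small(True))   # 8 8
print(live_block_in_the_middle())                        # 12
```

In this toy model, the small-then-large pattern reserves twice the memory without expandable segments, because four free 2-unit segments cannot combine into one 8-unit block. The last scenario shows the remaining weakness of expandable segments: a single live block between two free ones forces the segment to grow.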