My first impressions of ROCm and Strix Halo
- #PyTorch Setup
- #ROCm
- #Strix Halo
- Set up ROCm on Strix Halo with 128 GB of shared CPU-GPU memory, using Ubuntu 24.04 LTS and AMD's official drivers.
- A BIOS update was required before PyTorch would detect the GPU, plus a BIOS setting change reducing reserved video memory to 512 MB so the rest of RAM can be shared efficiently through GTT.
- Modified GRUB with kernel parameters such as ttm.pages_limit and amdgpu.gttsize to raise the GPU-addressable memory, leaving 4-12 GB for the CPU to keep the kernel stable.
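As a concrete sketch, the GRUB change amounts to adding those parameters to the default kernel command line in `/etc/default/grub`. The values below are my illustrative guesses for a 128 GB machine, not the author's exact settings — `ttm.pages_limit` is counted in 4 KiB pages and `amdgpu.gttsize` in MiB, so these give the GPU roughly 120 GB and leave about 8 GB for the CPU:

```shell
# /etc/default/grub -- illustrative values, not the author's exact settings.
# ttm.pages_limit: GTT limit in 4 KiB pages (31457280 pages = 120 GiB)
# amdgpu.gttsize:  GTT size in MiB (122880 MiB = 120 GiB)
GRUB_CMDLINE_LINUX_DEFAULT="ttm.pages_limit=31457280 amdgpu.gttsize=122880"
```

After editing, run `sudo update-grub` and reboot for the parameters to take effect.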
- Configured PyTorch with uv via a custom dependency setup and a shell alias for easy environment activation, with ROCm 7.2 support.
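The uv setup presumably points at PyTorch's ROCm wheel index rather than PyPI. A minimal sketch, assuming the project name, the index URL, and the ROCm suffix (pick the index matching the ROCm version actually installed — the author used a newer one than shown here):

```shell
# Create a project and pull torch from the PyTorch ROCm wheel index
# (the rocm6.4 index URL is an assumption; match it to your ROCm version).
uv init torch-rocm && cd torch-rocm
uv add torch --index https://download.pytorch.org/whl/rocm6.4

# Convenience alias for activating the environment from anywhere
echo "alias torchenv='source ~/torch-rocm/.venv/bin/activate'" >> ~/.bashrc
```

With the alias in place, `torchenv` drops you into the environment from any directory.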
- Ran llama.cpp via Podman for Qwen3.6 model inference, passing the ROCm graphics devices into the container and using flash attention with a large context window.
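Passing the ROCm devices into the container means handing `/dev/kfd` (compute) and `/dev/dri` (render nodes) to Podman. A sketch of such a `llama-server` invocation — the image tag, model path, and context size are my assumptions, not the author's exact command:

```shell
# Run llama-server in a ROCm-enabled container (image tag and model
# path are assumptions). /dev/kfd and /dev/dri expose the GPU to ROCm.
podman run --rm -p 8080:8080 \
  --device /dev/kfd --device /dev/dri \
  --security-opt seccomp=unconfined \
  -v ~/models:/models:Z \
  ghcr.io/ggml-org/llama.cpp:server-rocm \
  -m /models/model.gguf \
  --ctx-size 32768 \
  --flash-attn on \
  --host 0.0.0.0 --port 8080
```

This exposes an OpenAI-compatible HTTP API on port 8080, which is what the next step builds on.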
- Used opencode with a local provider configuration to point it at the llama.cpp server for AI-driven coding tasks.
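A local provider in opencode is configured in its JSON config file. The snippet below is a sketch of that idea as I understand the schema — the provider id, model id, and `baseURL` port are assumptions tied to the llama.cpp server above; verify the exact keys against the opencode documentation:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "llamacpp": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "llama.cpp (local)",
      "options": { "baseURL": "http://localhost:8080/v1" },
      "models": {
        "local-model": { "name": "Local model" }
      }
    }
  }
}
```

Since llama.cpp serves an OpenAI-compatible API, an OpenAI-compatible provider entry is all opencode needs to talk to it.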
- Overall a positive experience despite the setup complexities: PyTorch and large language models ran successfully.