I Put a Datacenter GPU in My Gaming PC for £200

3 hours ago
  • #hardware hacking
  • #local LLM
  • #datacenter GPU
  • A Tesla V100 SXM2 16GB datacenter GPU was purchased for £150 on eBay to add 16GB VRAM to an existing RTX 4080 setup.
  • An SXM2-to-PCIe adapter (£50) allowed installation in a gaming PC, since SXM2 modules have no PCIe edge connector.
  • The V100's HBM2 memory provides 900 GB/s of bandwidth, more than many modern consumer GPUs and Macs.
  • The adapter's fan initially ran at 82 dB; wiring it to a motherboard fan header put it under PWM control.
  • Using llama.cpp with tensor splitting, the combined 32GB VRAM runs a 27B parameter model at ~32 tokens/second.
  • Setup required specific drivers (NVIDIA legacy_535) and kernel (6.6) on NixOS, with models stored on a NAS.
  • The Qwen3.6-27B-MTP model supports vision input and multi-token prediction, with performance competitive with cloud models.
  • The V100 occasionally disappears after a warm reboot and only reappears after a cold boot.
  • This £200 solution provides high VRAM for local LLM inference, with alternatives like the V100 32GB or P40 offering more capacity.
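
The tensor-splitting step can be sketched as follows. This is a minimal illustration, not the author's actual invocation: the summary does not state the flags or split ratio used, so the even 16/16 split, the `--n-gpu-layers 99` value, and the model path are all assumptions. The helper names `tensor_split` and `llama_server_args` are hypothetical. The flags themselves (`-m`, `--n-gpu-layers`, `--split-mode`, `--tensor-split`) are standard llama.cpp options.

```python
import shlex


def tensor_split(vram_gb):
    """Return per-GPU proportions weighted by each card's VRAM (hypothetical helper)."""
    total = sum(vram_gb)
    if total <= 0:
        raise ValueError("total VRAM must be positive")
    return [round(v / total, 2) for v in vram_gb]


def llama_server_args(model_path, vram_gb, gpu_layers=99):
    """Build a llama-server argument list that spreads layers across GPUs."""
    split = ",".join(str(p) for p in tensor_split(vram_gb))
    return [
        "llama-server",
        "-m", model_path,
        "--n-gpu-layers", str(gpu_layers),  # assumption: offload every layer
        "--split-mode", "layer",            # whole layers per GPU
        "--tensor-split", split,            # proportion of the model per GPU
    ]


if __name__ == "__main__":
    # Assumed setup: RTX 4080 (16 GB) plus V100 (16 GB); the path is a placeholder.
    args = llama_server_args("/mnt/nas/models/model.gguf", [16, 16])
    print(shlex.join(args))
```

In practice the card driving the desktop has less free VRAM than its nominal 16 GB, so the proportions may need to be skewed toward the V100.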