I Put a Datacenter GPU in My Gaming PC for £200

24 days ago

A Tesla V100 SXM2 16GB datacenter GPU was purchased for £150 on eBay to add 16GB VRAM to an existing RTX 4080 setup.
An SXM2-to-PCIe adapter (£50) allowed installation in a gaming PC, working around the SXM2 module's lack of a standard PCIe connector.
The V100's HBM2 memory provides 900 GB/s of bandwidth, more than many modern consumer GPUs and Macs offer.
The adapter's fan initially ran at 82 dB; wiring it to a motherboard fan header brought it under PWM control.
Using llama.cpp with tensor splitting across both GPUs, the combined 32GB of VRAM runs a 27B-parameter model at ~32 tokens/second.
The setup required a specific driver branch (NVIDIA legacy_535) and kernel version (6.6) on NixOS, with models stored on a NAS.
The Qwen3.6-27B-MTP model supports vision input and multi-token prediction, and offers performance competitive with cloud models.
The V100 occasionally disappears after a warm reboot; a cold reboot restores it.
This £200 setup adds substantial VRAM for local LLM inference; alternatives such as the V100 32GB or P40 offer more capacity.
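
The tensor splitting mentioned above can be illustrated with a minimal sketch. It derives a split ratio from each card's VRAM and assembles a llama.cpp server command line without running it. The brief does not give the actual ratio, binary, or flags used, so the even 16/16 split, the `llama-server` binary, the layer split mode, the offload count of 99, and the model path are all assumptions for illustration; `tensor_split` and `llama_server_command` are hypothetical helper names.

```python
import shlex


def tensor_split(vram_gb):
    """Return each GPU's share of the model, proportional to its VRAM."""
    total = sum(vram_gb)
    return [round(v / total, 3) for v in vram_gb]


def llama_server_command(model_path, vram_gb, n_gpu_layers=99):
    """Build (but do not run) a llama.cpp server command line.

    Assumptions: the `llama-server` binary is on PATH, and a large
    --n-gpu-layers value is used to request offloading every layer.
    """
    split = ",".join(str(s) for s in tensor_split(vram_gb))
    return [
        "llama-server",
        "-m", model_path,
        "--n-gpu-layers", str(n_gpu_layers),
        "--split-mode", "layer",   # assign whole layers to each GPU
        "--tensor-split", split,   # proportion of the model per GPU
    ]


if __name__ == "__main__":
    # Hypothetical NAS path; 16 GB on the RTX 4080 and 16 GB on the V100.
    cmd = llama_server_command("/mnt/nas/models/model.gguf", [16, 16])
    print(shlex.join(cmd))
```

The proportional split is only a starting point. The card that also drives the display may have less free VRAM than its nominal 16 GB, in which case its share may need to be reduced by hand.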
