Show HN: Deterministic PCIe Diagnostics for GPUs on Linux

3 days ago
  • #GPU
  • #PCIe
  • #diagnostics
  • A deterministic command-line tool for validating GPU PCIe link health, bandwidth, and real-world PCIe utilization using only observable hardware data.
  • The tool measures the current and maximum PCIe link generation and width, peak copy bandwidth, sustained PCIe utilization under load, and efficiency relative to the theoretical PCIe payload bandwidth.
  • Provides clear verdicts based on observable conditions: OK, DEGRADED, or UNDERPERFORMING.
  • Diagnoses common PCIe issues like link negotiation problems, generation downgrades, and reduced bandwidth.
  • Requires an NVIDIA GPU with a supported driver, the CUDA Toolkit, the NVML development library, and Linux.
  • Supports logging in CSV and JSON formats for time-series analysis and automated monitoring.
  • Includes a multi-GPU mode that evaluates each GPU independently.
  • Does not modify the BIOS, firmware, registry, or PCIe configuration; it reports observable facts only.
  • Open source under the MIT License, authored by Joe McLaren.
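
To make the efficiency and verdict ideas above concrete, here is a minimal Python sketch. It is not the tool's code. The per-lane rates and line encodings are the standard PCIe Gen1 to Gen5 figures, but the resulting number is the bandwidth after line encoding only and ignores packet overhead, so it may differ from what the tool calls theoretical payload bandwidth. The names `theoretical_gb_per_s`, `LinkSample`, and `verdict`, the rule ordering, and the 0.70 efficiency threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Per-lane raw rate (GT/s) and line-encoding efficiency for PCIe Gen1 to Gen5.
# Gen1/Gen2 use 8b/10b encoding; Gen3 and later use 128b/130b.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
}


def theoretical_gb_per_s(gen: int, width: int) -> float:
    """Per-direction link bandwidth in GB/s after line encoding.

    Packet (TLP) header overhead is NOT subtracted, so real payload
    throughput is always somewhat lower than this figure.
    """
    rate_gt_s, encoding = PCIE_GENERATIONS[gen]
    return rate_gt_s * encoding * width / 8  # 8 bits per byte


@dataclass
class LinkSample:
    """Hypothetical record of one measurement; field names are illustrative."""
    cur_gen: int
    max_gen: int
    cur_width: int
    max_width: int
    measured_gb_per_s: float


def verdict(sample: LinkSample, min_efficiency: float = 0.70) -> str:
    """Classify a sample as OK, DEGRADED, or UNDERPERFORMING.

    Assumed rules (not taken from the tool):
      - DEGRADED: the link runs below its maximum generation or width.
      - UNDERPERFORMING: the link is at full speed, but measured bandwidth
        is under min_efficiency of the encoded link bandwidth.
      - OK: everything else.
    """
    if sample.cur_gen < sample.max_gen or sample.cur_width < sample.max_width:
        return "DEGRADED"
    ceiling = theoretical_gb_per_s(sample.cur_gen, sample.cur_width)
    efficiency = sample.measured_gb_per_s / ceiling
    return "OK" if efficiency >= min_efficiency else "UNDERPERFORMING"


if __name__ == "__main__":
    sample = LinkSample(cur_gen=4, max_gen=4, cur_width=16, max_width=16,
                        measured_gb_per_s=26.0)
    print(f"{theoretical_gb_per_s(4, 16):.2f} GB/s ceiling, verdict: {verdict(sample)}")
```

Under these assumptions a Gen4 x16 link has a ceiling of about 31.5 GB/s per direction, so a measured 26 GB/s copy rate is classified as OK, while the same card negotiated down to Gen3 would be reported as DEGRADED regardless of its measured bandwidth.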