Hasty Briefs (beta)

Setting Up a Cluster of Tiny PCs for Parallel Computing

2 months ago
  • #parallel computing
  • #cluster setup
  • #R programming
  • Setting up a cluster of tiny PCs for parallel computing involves installing Ubuntu, configuring passwordless SSH, and automating package installations across nodes.
  • The project aimed to distribute R simulations across nodes efficiently and to compare 5-fold (CV5) against 10-fold (CV10) cross-validation.
  • Key steps included selecting affordable PCs such as the Lenovo M715q, installing Ubuntu Server, and configuring the network so each node has a fixed IP address.
  • Passwordless SSH and sudo were set up so that commands run across nodes without manual password entry.
  • A template R script was created to automate simulations, leveraging multicore processing on each node to minimize network overhead.
  • The setup saved significant time: some simulations ran up to three times faster on three nodes than on a single quad-core machine.
  • Analysis showed that increasing CV folds from 5 to 10 reduced bias but slightly increased variance, with tuned xgboost + logistic regression performing best in terms of coverage and bias.
  • Opportunities for improvement include developing a package for easier setup, implementing notifications for task completion, and learning OpenMPI for more advanced parallel computing.
  • Lessons learned include the effectiveness of `future.seed` for reproducibility in parallel processing and the importance of asymmetrical coverage assessment in method evaluation.
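
The passwordless SSH step and the one-command-per-node pattern described above can be sketched in shell. This is a minimal sketch under stated assumptions, not the original author's script: the node addresses, the `user` account, the key path, the empty passphrase, and the script name `sim_template.R` are hypothetical placeholders. By default the functions only print the commands they would run.

```shell
#!/usr/bin/env bash
# Sketch only: node addresses, user name, key path, and script name are
# hypothetical placeholders, not values from the original write-up.

# Print (DRY_RUN=1, the default) or run (DRY_RUN=0) the one-time key setup.
# ssh-copy-id asks for each node's password once; later logins use the key.
setup_keys() {
  local node
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "ssh-keygen -t ed25519 -N '' -f ~/.ssh/id_ed25519"
  else
    # Assumption: an empty passphrase (-N "") so no prompt appears later.
    [ -f "${HOME}/.ssh/id_ed25519" ] ||
      ssh-keygen -t ed25519 -N "" -f "${HOME}/.ssh/id_ed25519"
  fi
  for node in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "ssh-copy-id -i ~/.ssh/id_ed25519.pub ${node}"
    else
      ssh-copy-id -i "${HOME}/.ssh/id_ed25519.pub" "${node}"
    fi
  done
}

# Print or run the same remote command on every node.
# First argument: the remote command; remaining arguments: the nodes.
run_on_nodes() {
  local remote_cmd="$1"
  shift
  local node
  for node in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "ssh -o BatchMode=yes ${node} '${remote_cmd}'"
    else
      # BatchMode=yes makes ssh fail instead of prompting for a password.
      # Each node runs in the background so the nodes work in parallel.
      ssh -o BatchMode=yes "${node}" "${remote_cmd}" &
    fi
  done
  wait  # block until every background ssh session has finished
}

# Example usage with three hypothetical nodes on fixed IPs.
NODES=("user@192.168.1.11" "user@192.168.1.12" "user@192.168.1.13")
setup_keys "${NODES[@]}"
run_on_nodes "Rscript sim_template.R" "${NODES[@]}"
```

Running the script with `DRY_RUN=0` executes the commands instead of printing them. Launching one process per node and letting each node's R script use its own cores keeps cross-node traffic down to starting the job and collecting results, which matches the brief's point about minimizing network overhead.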