Please Switch to Python. Or R. Or Anything. Just Not Stata, SAS, SPSS, or Matlab
4 hours ago
- #open-source
- #data-science
- #python
- Advocates switching from proprietary statistical tools (Stata, SAS, SPSS, MATLAB) to open-source languages like Python or R.
- Uses a case study built on U.S. Office of Personnel Management (OPM) workforce data to highlight Python's advantages: browser automation (Playwright), reproducible workflows, and easy sharing via GitHub and Hugging Face.
- Emphasizes Python's ecosystem for automation, data processing beyond CSV/Excel, interactive visualizations (D3.js), and collaborative infrastructure.
- Criticizes proprietary tools for limiting reproducibility, accessibility, and integration with modern tools like GitHub Actions.
- Notes that AI coding assistants (e.g., Claude Code) flatten the learning curve, letting newcomers focus on architecture and workflows rather than syntax.
- Points out general benefits: version control, reusable functions, package managers, and handling diverse data sources (PDFs, web scraping).
- Acknowledges niche use cases for proprietary tools (e.g., FDA submissions, Simulink) but argues Python/R cover most needs.
- Concludes that barriers to entry are falling, urging adoption of open-source languages for better collaboration and efficiency.
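
To make the points about reusable functions and reproducible workflows concrete, here is a minimal sketch using only the Python standard library. The column names (`agency`, `headcount`), the sample rows, and the helper names `load_rows` and `total_by_group` are hypothetical illustrations; they are not taken from the OPM data or from the post's code.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data, made up for illustration only.
SAMPLE_CSV = """agency,headcount
A,10
B,5
A,7
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def total_by_group(rows, group_col, value_col):
    """Sum the integer values in value_col for each distinct group_col value."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_col]] += int(row[value_col])
    return dict(totals)


# The same function can be reused on any table with a grouping column
# and a numeric column, which is what makes the script easy to share.
totals = total_by_group(load_rows(SAMPLE_CSV), "agency", "headcount")
print(totals)
```

Because the whole analysis is plain text, a script like this can be committed to version control and rerun by anyone without a license.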