CodeScientist: Automated scientific discovery system for code-based experiments
a year ago
- #automated-experimentation
- #LLM
- #scientific-discovery
- CodeScientist is an end-to-end semi-automated scientific discovery system that designs, iterates, and analyzes scientific experiments expressed as Python code.
- It generates novel research ideas through genetic-style mutation, with an LLM acting as the mutator (the LLM-as-a-mutator paradigm) to recombine scientific articles and code examples.
- The system includes an Experiment Builder that automatically creates, runs, and debugs experiment code in a container, then writes a report on the results.
- CodeScientist can run in two modes: human-in-the-loop (the primary mode) and fully automatic (less efficient).
- The repository includes open-source software, installation instructions, reports, raw data (experiment code, logs, ideas, etc.), and example papers.
- Key features include ideation from papers, experiment creation (manual/automatic), batch autonomous experiments, meta-analysis, and integration with external systems.
- Installation requires Ubuntu Linux (or macOS), Python 3.12, a LaTeX distribution, and a Modal.com account for cloud container execution.
- Users can add papers and codeblocks for specialized domains to extend CodeScientist's capabilities.
- The system includes cost estimation tools to manage LLM API usage, with hard limits to prevent runaway costs.
- Example experiments demonstrate CodeScientist's ability to test hypotheses, such as LLM confidence vs. accuracy, state complexity effects, and knowledge graph agents.
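
The hard cost limit mentioned above can be pictured with a minimal sketch. This is not CodeScientist's actual implementation: the `CostTracker` class, the `BudgetExceeded` exception, and the per-token prices are all hypothetical names and values chosen for illustration. The idea is simply to estimate the cost of each LLM call before it is made and refuse any call that would push total spending past a fixed ceiling.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a call would push total spending past the hard limit."""


@dataclass
class CostTracker:
    # Hypothetical fields: the limit and prices are illustrative assumptions,
    # not settings taken from CodeScientist.
    hard_limit_usd: float
    usd_per_1k_input_tokens: float
    usd_per_1k_output_tokens: float
    spent_usd: float = 0.0

    def estimate(self, input_tokens: int, output_tokens: int) -> float:
        """Estimate the cost of one LLM call from its token counts."""
        input_cost = (input_tokens / 1000) * self.usd_per_1k_input_tokens
        output_cost = (output_tokens / 1000) * self.usd_per_1k_output_tokens
        return input_cost + output_cost

    def charge(self, input_tokens: int, output_tokens: int) -> float:
        """Record a call's cost, or refuse it if the hard limit would be exceeded."""
        cost = self.estimate(input_tokens, output_tokens)
        if self.spent_usd + cost > self.hard_limit_usd:
            raise BudgetExceeded(
                f"call costing ${cost:.4f} would exceed the "
                f"${self.hard_limit_usd:.2f} limit (spent ${self.spent_usd:.4f})"
            )
        self.spent_usd += cost
        return cost


if __name__ == "__main__":
    tracker = CostTracker(
        hard_limit_usd=1.00,
        usd_per_1k_input_tokens=0.01,
        usd_per_1k_output_tokens=0.03,
    )
    tracker.charge(input_tokens=10_000, output_tokens=5_000)
    try:
        # This call alone is estimated at the full limit, so it is rejected.
        tracker.charge(input_tokens=100_000, output_tokens=0)
    except BudgetExceeded as err:
        print(f"Stopped: {err}")
```

Checking the budget before recording the charge means a rejected call leaves the running total unchanged, so an experiment loop can catch the exception and shut down cleanly rather than overshoot the limit.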