SimpleFold: Folding Proteins Is Simpler Than You Think
3 hours ago
- #bioinformatics
- #machine-learning
- #protein-folding
- SimpleFold is a flow-matching-based protein folding model built from general-purpose transformer layers.
- It does not rely on expensive modules like triangle attention or pair representation biases.
- It is trained on more than 8.6M distilled protein structures together with experimental PDB data.
- The model is scaled to 3B parameters, making it the largest protein folding model to date.
- Achieves competitive performance on standard folding benchmarks.
- Demonstrates strong performance in ensemble prediction due to its generative training objective.
- Challenges the reliance on complex domain-specific architectures in protein structure prediction.
- Installation instructions are provided for both the PyTorch and MLX backends.
- Includes a Jupyter notebook for predicting protein structures from sequences.
- Provides pre-trained models of various sizes (100M to 3B parameters).
- Offers evaluation scripts for folding tasks and two-state predictions.
- Training instructions and dataset processing details included.
- Code and models come with specific licenses; users are advised to review them.
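
The summary above names flow matching and ensemble prediction without showing what they involve, so here is a minimal NumPy sketch of the general idea. It assumes the common straight-line path between Gaussian noise and the target coordinates. The summary does not say which path, time sampling, or network SimpleFold uses, so all of that is an assumption. Every name here (`interpolate`, `flow_matching_loss`, `sample`, `oracle`) is hypothetical and is not taken from the SimpleFold code. A closed-form "oracle" velocity field for a single toy structure stands in for the trained transformer.

```python
import numpy as np


def interpolate(x0, x1, t):
    """Point at time t on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1


def flow_matching_loss(predict_velocity, x1, rng):
    """One-sample estimate of the flow-matching regression loss.

    x1 holds target coordinates with shape (n_atoms, 3). The regression
    target is the time derivative of the straight-line path, x1 - x0.
    """
    x0 = rng.standard_normal(x1.shape)   # noise endpoint
    t = rng.uniform()                    # random time in [0, 1)
    xt = interpolate(x0, x1, t)
    v_pred = predict_velocity(xt, t)
    return float(np.mean((v_pred - (x1 - x0)) ** 2))


def sample(predict_velocity, shape, rng, n_steps=50):
    """Generate coordinates by Euler-integrating dx/dt = v(x, t) from t=0 to t=1."""
    x = rng.standard_normal(shape)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        x = x + dt * predict_velocity(x, i * dt)
    return x


# Toy target: 5 "atoms" in 3D (made-up numbers, not a real protein).
x1 = np.arange(15, dtype=float).reshape(5, 3)


def oracle(xt, t):
    """Exact velocity field when the data set is the single structure x1.

    Stand-in for a trained network (assumption for illustration only).
    """
    return (x1 - xt) / (1.0 - t)


loss = flow_matching_loss(oracle, x1, np.random.default_rng(0))

# An "ensemble" is several samples drawn from different noise seeds.
ensemble = [sample(oracle, x1.shape, np.random.default_rng(seed)) for seed in range(3)]
```

With the oracle field the loss is zero up to rounding and every sample lands on the single toy structure. A trained generative model would instead map different noise draws to different plausible conformations, which is the property the ensemble-prediction bullet refers to.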