Hasty Briefs (beta)

SimpleFold: Folding Proteins Is Simpler Than You Think

3 hours ago
  • #bioinformatics
  • #machine-learning
  • #protein-folding
  • SimpleFold is a flow-matching-based protein folding model built from general-purpose transformer layers.
  • It does not rely on expensive modules like triangle attention or pair representation biases.
  • Trained on more than 8.6M distilled protein structures plus experimental PDB data.
  • Scaled to 3B parameters, making it the largest protein folding model to date.
  • Achieves competitive performance on standard folding benchmarks.
  • Demonstrates strong performance in ensemble prediction due to its generative training objective.
  • Challenges the reliance on complex domain-specific architectures in protein structure prediction.
  • Installation instructions are provided for both the PyTorch and MLX backends.
  • Includes a Jupyter notebook for predicting protein structures from sequences.
  • Provides pre-trained models of various sizes (100M to 3B parameters).
  • Offers evaluation scripts for folding tasks and two-state predictions.
  • Training instructions and dataset-processing details are included.
  • Code and models come with specific licenses; users are advised to review them.
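
The flow-matching objective named in the first bullet can be illustrated with a small toy example. What follows is a minimal sketch of generic flow matching with a straight-line path between noise and data, not SimpleFold's actual code: the function names, the three-atom target, and the "oracle" velocity field that stands in for a trained transformer are all hypothetical, and the real model's interpolation path, conditioning, and sampler may differ.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight-line path from noise x0 (t=0) to data x1 (t=1)."""
    return [[(1 - t) * a + t * b for a, b in zip(p0, p1)]
            for p0, p1 in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight-line path: constant and equal to x1 - x0."""
    return [[b - a for a, b in zip(p0, p1)] for p0, p1 in zip(x0, x1)]


def flow_matching_loss(velocity_fn, x0, x1, t):
    """Mean squared error between predicted and target velocity at time t."""
    pred = velocity_fn(interpolate(x0, x1, t), t)
    target = target_velocity(x0, x1)
    sq = [(p - q) ** 2
          for pred_atom, tgt_atom in zip(pred, target)
          for p, q in zip(pred_atom, tgt_atom)]
    return sum(sq) / len(sq)


def euler_sample(velocity_fn, x0, steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = [list(p) for p in x0]
    dt = 1.0 / steps
    for k in range(steps):
        v = velocity_fn(x, k * dt)
        x = [[c + dt * vc for c, vc in zip(p, vp)] for p, vp in zip(x, v)]
    return x


def gaussian_noise(n_atoms, rng):
    """Random 3D starting coordinates, one point per atom."""
    return [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(n_atoms)]


def make_oracle_velocity(x1):
    """Hypothetical stand-in for a trained network: the exact velocity
    field that transports any point toward one known target x1."""
    def velocity(x, t):
        return [[(b - a) / (1 - t) for a, b in zip(p, q)]
                for p, q in zip(x, x1)]
    return velocity


# Toy "structure": three atoms spaced 3.8 units apart (illustrative values).
target = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [3.8, 3.8, 0.0]]
rng = random.Random(0)
oracle = make_oracle_velocity(target)
noise = gaussian_noise(len(target), rng)

loss = flow_matching_loss(oracle, noise, target, t=0.3)
sample = euler_sample(oracle, noise, steps=10)
```

In a real training run the oracle would be replaced by a learned network, the loss would be averaged over many random times and noise draws, and starting the sampler from different noise draws would yield different candidate structures, which is the property the ensemble-prediction bullet refers to.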