Genome modelling and design across all domains of life with Evo 2 - PubMed

2 months ago
  • #DNA sequencing
  • #genome modelling
  • #artificial intelligence
  • Evo 2 is a biological foundation model trained on 9 trillion DNA base pairs from a highly curated genomic atlas spanning all domains of life.
  • It has a 1-million-token context window with single-nucleotide resolution.
  • Evo 2 can predict functional impacts of genetic variations, including noncoding pathogenic mutations and clinically significant BRCA1 variants, without task-specific fine-tuning.
  • The model learns representations associated with biological features like exon-intron boundaries, transcription factor binding sites, and protein structural elements.
  • Evo 2 generates mitochondrial, prokaryotic, and eukaryotic sequences at genome scale with greater naturalness and coherence than previous methods.
  • It can also generate experimentally validated chromatin accessibility patterns.
  • The model and related resources, including parameters, training code, and the OpenGenome2 dataset, are made fully open to accelerate biological exploration and design.
  • Several authors have disclosed competing interests, including affiliations with biotech companies and advisory roles.
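
The variant-effect point above can be illustrated with a minimal sketch. A common way to score a variant with a sequence model, without task-specific fine-tuning, is to compare the likelihood the model assigns to the variant sequence against the reference sequence. The summary does not describe Evo 2's interface, so the code below does not call Evo 2: it uses a toy first-order Markov model as a stand-in, and the names `train_markov`, `log_likelihood`, and `variant_score` are hypothetical, chosen for illustration only.

```python
import math

ALPHABET = "ACGT"  # one token per nucleotide (single-nucleotide resolution)


def train_markov(seqs):
    """Fit a toy first-order Markov model with add-one smoothing.

    This is a stand-in for a real genomic language model, not Evo 2.
    """
    counts = {a: {b: 1 for b in ALPHABET} for a in ALPHABET}
    for s in seqs:
        for prev, nxt in zip(s, s[1:]):
            counts[prev][nxt] += 1
    return {
        a: {b: counts[a][b] / sum(counts[a].values()) for b in ALPHABET}
        for a in ALPHABET
    }


def log_likelihood(model, seq):
    """Log-probability of seq; the first base is treated as uniform."""
    ll = math.log(1.0 / len(ALPHABET))
    for prev, nxt in zip(seq, seq[1:]):
        ll += math.log(model[prev][nxt])
    return ll


def variant_score(model, ref, pos, alt):
    """Log-likelihood of the variant minus that of the reference.

    Negative values mean the model finds the variant less plausible
    than the reference sequence.
    """
    if ref[pos] == alt:
        raise ValueError("alt must differ from the reference base")
    var = ref[:pos] + alt + ref[pos + 1:]
    return log_likelihood(model, var) - log_likelihood(model, ref)


# Toy "genome": a repeated ACGT motif, so the model expects A->C->G->T.
model = train_markov(["ACGT" * 10])
score = variant_score(model, "ACGTACGT", pos=2, alt="A")  # G>A at index 2
print(f"delta log-likelihood: {score:.3f}")
```

With a real model, the same difference would be computed from the model's per-token log-probabilities over a much longer context; the toy model here only makes the arithmetic easy to follow.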