Genome modelling and design across all domains of life with Evo 2 - PubMed
- #DNA sequencing
- #genome modelling
- #artificial intelligence
- Evo 2 is a biological foundation model trained on 9 trillion DNA base pairs from a highly curated genomic atlas.
- It has a 1-million-token context window with single-nucleotide resolution, so each token corresponds to one base.
- Evo 2 can predict the functional impact of genetic variation, including noncoding pathogenic mutations and clinically significant BRCA1 variants, without task-specific fine-tuning.
- The model learns representations associated with biological features like exon-intron boundaries, transcription factor binding sites, and protein structural elements.
- Evo 2 generates mitochondrial, prokaryotic, and eukaryotic sequences at genome scale with greater naturalness and coherence than previous methods.
- It can also generate experimentally validated chromatin accessibility patterns.
- The model and related resources, including parameters, training code, and the OpenGenome2 dataset, are made fully open to accelerate biological exploration and design.
- Several authors have disclosed competing interests, including affiliations with biotech companies and advisory roles.
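
The summary above does not say how variant effects are scored without fine-tuning. One common way to use a sequence model for this is to compare the likelihood it assigns to the reference sequence with the likelihood it assigns to the mutated one. The sketch below illustrates that idea only. `ToyMarkovModel` is a hypothetical stand-in (a first-order Markov chain), not Evo 2 and not its API, and `tokenize`, `apply_snv`, and `variant_score` are illustrative names that do not come from the source.

```python
import math

# Assumption: one token per nucleotide, mirroring "single-nucleotide resolution".
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3}


def tokenize(seq):
    """Map a DNA string to integer tokens, one per base."""
    return [VOCAB[base] for base in seq.upper()]


class ToyMarkovModel:
    """Hypothetical stand-in for a DNA language model (NOT Evo 2).

    First-order Markov chain with add-one smoothing, used only so that
    the scoring logic below has something to call.
    """

    def __init__(self, training_seqs):
        # counts[prev][cur] starts at 1 (add-one smoothing).
        self.counts = [[1] * 4 for _ in range(4)]
        for seq in training_seqs:
            tokens = tokenize(seq)
            for prev, cur in zip(tokens, tokens[1:]):
                self.counts[prev][cur] += 1

    def log_likelihood(self, seq):
        """Sum of log transition probabilities along the sequence."""
        tokens = tokenize(seq)
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            row = self.counts[prev]
            total += math.log(row[cur] / sum(row))
        return total


def apply_snv(ref, pos, alt):
    """Return the reference with the base at 0-based `pos` replaced by `alt`."""
    return ref[:pos] + alt + ref[pos + 1:]


def variant_score(model, ref, pos, alt):
    """Log-likelihood of the mutated sequence minus that of the reference.

    Negative values mean the model finds the variant less plausible
    than the reference in this sequence context.
    """
    return model.log_likelihood(apply_snv(ref, pos, alt)) - model.log_likelihood(ref)


# Usage: fit the toy model on a repetitive sequence, then score a substitution.
model = ToyMarkovModel(["ACGT" * 50])
ref = "ACGTACGTACGT"
score = variant_score(model, ref, pos=5, alt="T")  # C -> T at position 5
print(f"variant score: {score:.3f}")
```

With a real genomic model, the same comparison would be run over a window of sequence around the variant, and the resulting score would still need calibration against labelled variants before any clinical interpretation.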