The Well: 15TB of Physics Simulations
4 months ago
- #dataset
- #machine-learning
- #physics-simulations
- The Well is a large-scale collection of machine learning datasets containing numerical simulations of spatiotemporal physical systems.
- It provides 15TB of data across 16 datasets covering diverse domains such as biological systems, fluid dynamics, and magnetohydrodynamics.
- Datasets can be used individually or as part of a broader benchmark suite for machine learning and computational sciences research.
- The package can be installed from PyPI or from source, with options for hardware acceleration such as CUDA.
- Individual datasets range from 6.9GB to 5.1TB, so downloading them requires significant disk space.
- Data can be downloaded locally or streamed from Hugging Face; local download is recommended for better performance.
- Benchmarking tools are included, with state-of-the-art models implemented for surrogate modeling.
- Pre-trained model checkpoints are available on Hugging Face for easy loading and use.
- The project is a collaboration led by Polymathic AI with contributions from various institutions worldwide.
- Users are encouraged to cite the project in their research and can contact the maintainers for questions or issues.
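
Because individual datasets span 6.9GB to 5.1TB, it can help to check free disk space before choosing between a local download and streaming. The following is a minimal sketch using only the Python standard library, not part of The Well's own tooling. The helper names `choose_access_mode` and `free_space`, the 10% headroom, and the use of decimal units (1 GB = 10^9 bytes) are assumptions made for illustration.

```python
import shutil

GB = 10**9   # assumption: decimal units
TB = 10**12

# The only two sizes quoted in the summary above; sizes of the other
# datasets are not listed here and are deliberately not guessed.
SMALLEST_DATASET_BYTES = 6_900 * 10**6   # 6.9GB
LARGEST_DATASET_BYTES = 5_100 * 10**9    # 5.1TB


def choose_access_mode(dataset_bytes, free_bytes, headroom=0.10):
    """Hypothetical helper: return "local" if the dataset fits on disk
    with some headroom to spare, otherwise "stream"."""
    required = int(dataset_bytes * (1 + headroom))
    return "local" if free_bytes >= required else "stream"


def free_space(path="."):
    """Free bytes on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free


if __name__ == "__main__":
    free = free_space(".")
    for label, size in [
        ("smallest dataset", SMALLEST_DATASET_BYTES),
        ("largest dataset", LARGEST_DATASET_BYTES),
    ]:
        mode = choose_access_mode(size, free)
        print(f"{label}: {size / GB:.1f} GB -> {mode}")
```

The decision function takes the free-space figure as an argument rather than reading it itself, so the same check can be applied to a different target volume by passing `free_space("/path/to/volume")`.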