Universal pre-training by iterated random computation
- #Machine Learning
- #Pre-training
- #Algorithmic Complexity
- Explores pre-training models on randomly generated data, produced by iterating random computations rather than drawn from any real-world corpus (a minimal generator sketch follows this list).
- Theoretical justification rests on algorithmic complexity and Solomonoff induction (the prior is written out below the sketch).
- Empirical evidence shows that pre-training on such synthetic data enables zero-shot generalization to unseen data.
- Performance improves with model scale, and the effect extends to real-world data.
- Finetuning after this synthetic pre-training improves convergence speed and generalization.
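As a rough illustration of the first bullet: one simple way to realize "iterated random computation" is to sample sequences from a randomly initialized, untrained recurrent network whose sampled output token is fed back as the next input. Everything below (function name, the tanh RNN parameterization, the sizes) is an illustrative assumption, not the paper's exact generator.

```python
import numpy as np

def sample_random_sequence(vocab_size=256, hidden=64, seq_len=128, seed=0):
    """Sample one synthetic sequence by iterating a randomly
    initialized (untrained) RNN, feeding each sampled token back
    in as the next input. Sizes and parameterization are
    illustrative assumptions only."""
    rng = np.random.default_rng(seed)
    # Random, fixed parameters: the "random computation".
    W_emb = rng.normal(0.0, 1.0, (vocab_size, hidden))                    # token embedding
    W_rec = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, hidden))      # recurrence
    W_out = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, vocab_size))  # readout

    h = np.zeros(hidden)
    tok = int(rng.integers(vocab_size))  # random start token
    seq = [tok]
    for _ in range(seq_len - 1):
        # One iteration of the random map: state <- f(state, last token).
        h = np.tanh(W_emb[tok] + W_rec @ h)
        logits = h @ W_out
        logits -= logits.max()  # numerically stable softmax
        probs = np.exp(logits)
        probs /= probs.sum()
        tok = int(rng.choice(vocab_size, p=probs))
        seq.append(tok)
    return np.array(seq)

# A pre-training corpus would be many such sequences from many random seeds:
corpus = [sample_random_sequence(seed=s) for s in range(8)]
```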
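For the second bullet, the standard object behind the theory is the Solomonoff prior: the total probability that a universal prefix machine $U$, run on uniformly random program bits, prints a string beginning with $x$. This is the textbook formulation, not quoted from the note:

```latex
% Solomonoff prior: sum over all programs p whose output starts with x,
% each weighted by the probability 2^{-|p|} of drawing p at random.
M(x) = \sum_{p \,:\, U(p) = x\ast} 2^{-|p|}
```

Data produced by sampling random programs is, in effect, drawn from an approximation of $M$, so a model trained to predict such data is pushed toward Solomonoff-style induction; that is the sense in which this pre-training is "universal".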