The Era of Exploration
- #AI
- #Machine Learning
- #Exploration
- Large language models (LLMs) are built on decades of freely accessible online text, but the supply of high-quality English text may be exhausted by the end of the decade at current consumption rates.
- The 'Era of Experience' is approaching, in which AI progress will depend on models generating their own data; what matters then is collecting the right kind of experience, not simply more of it.
- Exploration in AI is crucial for collecting diverse and informative data, which is more important than merely increasing model parameters.
- Pretraining acts as an 'exploration tax,' providing models with a rich sampling distribution that smaller models can inherit through distillation.
- Effective reinforcement learning (RL) requires coverage: exploration must surface enough good trajectories that there is a learning signal to reinforce.
- Generalization in AI, especially for LLMs, depends on data diversity, which exploration directly influences by varying the data collected.
- Exploration in AI involves two axes: world sampling (deciding where to learn) and path sampling (deciding how to gather data within a world).
- Balancing compute between world sampling and path sampling is essential to avoid overfitting to any single world and to maximize the information gained per FLOP.
- Future AI progress may hinge on scaling exploration, including better world sampling and more intelligent path sampling, to stretch computational resources further.
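The two exploration axes and the coverage requirement above can be illustrated with a toy sketch. This is not the post's method, just a minimal assumed model: each "world" has a difficulty, each sampled path succeeds with probability proportional to the agent's skill, and a world is "covered" only if at least one successful trajectory exists to reinforce. The names (`attempt`, `collect`) and the success model are hypothetical.

```python
import random

def attempt(world_difficulty, skill, rng):
    """One sampled trajectory; succeeds with probability min(1, skill / difficulty)."""
    return rng.random() < min(1.0, skill / world_difficulty)

def collect(worlds, skill, paths_per_world, rng):
    """Two exploration axes: which worlds to sample from (world sampling)
    and how many trajectories to draw per world (path sampling).
    Returns, per world, whether coverage was achieved, i.e. whether at
    least one successful trajectory exists to reinforce."""
    coverage = {}
    for name, difficulty in worlds.items():
        successes = sum(
            attempt(difficulty, skill, rng) for _ in range(paths_per_world)
        )
        coverage[name] = successes > 0
    return coverage

if __name__ == "__main__":
    rng = random.Random(0)
    worlds = {"easy": 1.5, "medium": 4.0, "hard": 50.0}
    # With few paths per world, hard worlds are rarely covered; spending
    # more of the path-sampling budget raises coverage at a compute cost.
    print(collect(worlds, skill=1.0, paths_per_world=4, rng=rng))
    print(collect(worlds, skill=1.0, paths_per_world=64, rng=rng))
```

The budget trade-off is visible directly: for a fixed number of total trajectories, spreading them across more worlds (better world sampling) leaves fewer paths per world, and a world whose difficulty far exceeds the agent's skill yields no reinforceable trajectories at any affordable path budget.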