Hasty Briefs

Achieving 10,000x training data reduction with high-fidelity labels

17 days ago
  • #data-curation
  • #LLM-fine-tuning
  • #active-learning
  • A new active learning method reduces training data requirements for fine-tuning LLMs by orders of magnitude.
  • The method focuses on high-fidelity labels to improve model alignment with human experts.
  • Experiments showed a reduction from 100,000 to under 500 training examples while improving alignment by up to 65%.
  • The process involves clustering the data and prioritizing the most confusing examples for expert review (see the sketch after this list).
  • Cohen’s Kappa is used to measure alignment between the model and human experts, with values above 0.8 considered excellent; a worked example follows this list.
  • The larger model tested (3.25B parameters) showed the strongest gains from curated data, achieving 55-65% better alignment.
  • The method is scalable and can be applied to datasets with hundreds of billions of examples.
  • High-quality labels (Kappa > 0.8) are essential for outperforming crowdsourced data.
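
As a rough illustration of the clustering-and-prioritization step above, here is a minimal Python sketch, not the article's exact procedure: it assumes a pool of embedded examples plus per-example positive-class probabilities from the current model, and the specific choices (KMeans clustering, distance-from-0.5 uncertainty) are illustrative stand-ins.

```python
# Minimal active-learning selection sketch: cluster the unlabeled pool, then
# surface the examples the current model is least certain about within each
# cluster for expert review. KMeans and the 0.5-distance uncertainty heuristic
# are illustrative choices, not the original article's method.
import numpy as np
from sklearn.cluster import KMeans

def select_confusing_examples(pool_embeddings, pool_probs,
                              n_clusters=10, per_cluster=5):
    """Return indices of the most confusing pool examples, spread across clusters."""
    clusters = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(pool_embeddings)
    # Uncertainty peaks where the predicted probability sits at the decision
    # boundary (0.5 for a binary classifier).
    uncertainty = -np.abs(np.asarray(pool_probs) - 0.5)
    selected = []
    for c in range(n_clusters):
        idx = np.where(clusters == c)[0]
        ranked = idx[np.argsort(uncertainty[idx])][::-1]  # most uncertain first
        selected.extend(ranked[:per_cluster].tolist())
    return selected
```

In an iterative setup, the selected examples would go to experts for labeling, the model would be retrained on the growing curated set, and the loop would repeat until model-expert agreement (the Kappa metric below) stops improving.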
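
Cohen’s Kappa corrects raw agreement for the agreement two raters would reach by chance: kappa = (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is chance agreement from each rater's label frequencies. The snippet below computes it by hand on made-up labels and cross-checks against scikit-learn; only the formula and the 0.8 threshold come from the summary above.

```python
# Cohen's Kappa: agreement between two raters corrected for chance,
# kappa = (p_o - p_e) / (1 - p_e). Values above 0.8 are the "excellent"
# threshold cited above. The labels here are made-up illustration data.
from collections import Counter
from sklearn.metrics import cohen_kappa_score

model_labels  = ["safe", "unsafe", "safe", "safe", "unsafe", "safe"]
expert_labels = ["safe", "unsafe", "safe", "unsafe", "unsafe", "safe"]

n = len(model_labels)
p_o = sum(m == e for m, e in zip(model_labels, expert_labels)) / n  # observed agreement
# Chance agreement if both raters labeled independently at their own base rates.
m_counts, e_counts = Counter(model_labels), Counter(expert_labels)
p_e = sum(m_counts[k] * e_counts[k] for k in m_counts) / n**2
kappa = (p_o - p_e) / (1 - p_e)
print(kappa, cohen_kappa_score(model_labels, expert_labels))  # both print ~0.667
```

Here a kappa of about 0.67 would fall short of the 0.8 bar the article treats as necessary for curated labels to outperform crowdsourced data.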