Researchers Found a Better Way to Teach Large Language Models New Skills
- #AI
- #Machine Learning
- #Natural Language Processing
- Researchers developed WeGeFT (Weight-Generative Fine-Tuning), a technique that improves large language models' performance without requiring additional computational resources beyond existing fine-tuning methods.
- WeGeFT enhances model performance in tasks like commonsense reasoning, arithmetic reasoning, and code generation.
- The technique builds on LoRA (low-rank adaptation, 2022) but adds mathematical tools that identify which parameters capture genuinely new knowledge and prioritize them during learning.
- In proof-of-concept testing, WeGeFT matched or outperformed LoRA and its variants across these tasks.
- Future work explores using WeGeFT to identify harmful outputs and improve AI alignment and safety.
- The research will be presented at the International Conference on Machine Learning (ICML) in July 2025.
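To make the bullet points above concrete, here is a minimal sketch of the LoRA idea that WeGeFT builds on (this illustrates standard low-rank adaptation only, not the WeGeFT method itself; all names and dimensions are illustrative): instead of updating a large pretrained weight matrix `W`, one freezes it and learns a small low-rank update `B @ A`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: rank is much smaller than the weight dimensions.
d_out, d_in, rank = 8, 8, 2

W = rng.standard_normal((d_out, d_in))        # frozen pretrained weights
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection, init to zero

def adapted_forward(x):
    # Equivalent to (W + B @ A) @ x, but W itself is never modified.
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)

# With B initialized to zero, the adapted model starts out identical
# to the frozen base model.
assert np.allclose(adapted_forward(x), W @ x)

# Trainable parameters shrink from d_out*d_in to rank*(d_out + d_in).
print(W.size, A.size + B.size)  # 64 32
```

Per the bullets above, WeGeFT adds machinery on top of this kind of adapter to single out and prioritize the parameters that encode novel knowledge, rather than treating all entries of the low-rank update uniformly.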