- Neurometric focuses on auto-generating Small Language Models (SLMs) for specific tasks.
- The CRMArena benchmark tests models on realistic Salesforce CRM tasks such as lead qualification and activity prioritization.
- A fine-tuned 4B-parameter Qwen model outperformed larger models on CRM tasks, reaching 95% accuracy.
- Initial attempts to teach SLMs to generate SQL queries were rough but improved with expanded training data.
- Phase II switched to direct answer generation using the BANT (Budget, Authority, Need, Timeline) framework, achieving an evaluation score of 0.825.
- Key takeaways: SLMs can outperform larger models with task-specific fine-tuning, synthetic data has quality challenges, and constrained answer spaces improve results.
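
To make the "constrained answer space" takeaway concrete, here is a minimal sketch of how free-form model output might be mapped onto a small fixed label set and scored by exact match. The label set, the function names, and the scoring rule are all assumptions made for illustration; they are not taken from CRMArena or from Neurometric's actual pipeline.

```python
import re
from typing import Optional, Sequence

# Hypothetical label set: the four BANT factors plus a "Qualified" outcome.
ALLOWED_LABELS = ("Budget", "Authority", "Need", "Timeline", "Qualified")


def normalize_answer(raw: str) -> Optional[str]:
    """Map free-form model output to exactly one allowed label, else None."""
    matches = [
        label
        for label in ALLOWED_LABELS
        # Whole-word match so that e.g. "unqualified" does not count as "Qualified".
        if re.search(rf"\b{re.escape(label)}\b", raw, flags=re.IGNORECASE)
    ]
    # Ambiguous (several labels) or empty answers are treated as invalid.
    return matches[0] if len(matches) == 1 else None


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions whose normalized label equals the reference label."""
    if not references:
        return 0.0
    correct = sum(normalize_answer(p) == r for p, r in zip(predictions, references))
    return correct / len(references)


predictions = ["Budget", "The answer is Timeline.", "budget or authority", "no idea"]
references = ["Budget", "Timeline", "Authority", "Need"]
accuracy = exact_match_accuracy(predictions, references)
```

Because every answer must resolve to one of a handful of labels, scoring reduces to a simple equality check, and ambiguous or off-list outputs are counted as wrong rather than being judged subjectively.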