Why are your models so big? (2023)
- #Model Efficiency
- #LLMs
- #Inference Cost
- LLMs are large to achieve generalizability and human-like responses in chat applications.
- Some applications, like SQL autocomplete or structured extraction, don't require large models, because their inputs and outputs are tightly scoped.
- Inference with large models is expensive, both in raw compute and in the infrastructure needed to serve them.
- The future may see smaller, task-specific models that are more efficient and can even run in the browser.
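To see why the cost argument favors small task-specific models, here is a back-of-the-envelope sketch. It uses the common rule of thumb that a dense transformer spends roughly 2 FLOPs per parameter per generated token; the rule of thumb and the example model sizes (70B vs 100M) are my assumptions, not figures from the post.

```python
# Rough inference-cost comparison between a large general-purpose
# model and a small task-specific one. Assumes the ~2 * N FLOPs
# per generated token rule of thumb for dense transformers.

def flops_per_token(num_params: int) -> int:
    """Approximate forward-pass FLOPs to generate one token."""
    return 2 * num_params

large = flops_per_token(70_000_000_000)  # hypothetical 70B chat model
small = flops_per_token(100_000_000)     # hypothetical 100M task model

print(f"compute ratio: {large // small}x")  # prints "compute ratio: 700x"
```

Under these assumptions the task-specific model is ~700x cheaper per token, which is also what makes in-browser inference plausible for it.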