Why are your models so big? (2023)
- #Model Efficiency
- #LLMs
- #Inference Cost
- LLMs are large to achieve generalizability and human-like responses in chat applications.
- Some applications, like SQL autocomplete or structured extraction, don't require large models, because their inputs and outputs are tightly scoped.
- Inference with large models is expensive, both in raw compute and in the infrastructure needed to serve them.
- The future may see smaller, task-specific models that are more efficient and can even run in the browser.
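To see why the cost argument favors small task-specific models, here is a back-of-the-envelope sketch. It uses the common rule of thumb that a dense transformer spends roughly 2 FLOPs per parameter per generated token; the rule of thumb and the example model sizes (70B vs 100M) are my assumptions, not figures from the post.

```python
# Rough inference-cost comparison between a large general-purpose
# model and a small task-specific one. Assumes the ~2 * N FLOPs
# per generated token rule of thumb for dense transformers.

def flops_per_token(num_params: int) -> int:
    """Approximate forward-pass FLOPs to generate one token."""
    return 2 * num_params

large = flops_per_token(70_000_000_000)  # hypothetical 70B chat model
small = flops_per_token(100_000_000)     # hypothetical 100M task model

print(f"compute ratio: {large // small}x")  # prints "compute ratio: 700x"
```

Under these assumptions the task-specific model is ~700x cheaper per token, which is also what makes in-browser inference plausible for it.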