Running local models is good now
10 hours ago
- #gemma-models
- #local-ai
- #agentic-coding
- The author finds local AI models have become surprisingly effective for various tasks, moving beyond simple lookup functions to agentic coding.
- Key local models mentioned include Mistral 7B, Gemma 3, GPT-OSS-20B, and Qwen variants, run through setups like llama.cpp, Ollama, and LM Studio.
- Gemma-4 models, particularly the 26B and 12B-QAT versions, enable local agentic workflows with about 75% the accuracy/speed of frontier models.
- A setup using Pi as an agent harness and LM Studio as an inference server is detailed, with Docker for security and configuration tweaks.
- Benefits of local models include introspectability (e.g., token inference, context window adjustments) and customization, despite challenges like speed and hardware limits.