Hasty Briefsbeta

Bilingual

Using Local Coding Agents – By Sebastian Raschka, PhD

15 hours ago
  • #open-weight-llms
  • #coding-harnesses
  • #local-coding-agents
  • A tutorial on setting up a local coding agent using open-source tools and open-weight LLMs as an alternative to proprietary services like Claude Code and Codex.
  • Local setups offer transparency, inspectability, fixed costs (aside from hardware and electricity), and full user control, with the ability to modify the coding harness.
  • Key components: a locally served LLM (e.g., Qwen3.6 35B-A3B or North Mini Code) as the reasoning engine, and a coding harness (e.g., Qwen-Code, Codex, Claude Code) providing the operating environment.
  • Setup involves using Ollama for efficient local model serving, with steps to download models, configure endpoints, and connect to coding harnesses like Qwen-Code via custom providers.
  • Security considerations include auditing the agent codebase for risks like data egress, file permissions, and prompt injection, and mitigating them with settings (e.g., disabling telemetry).
  • Performance assessments (speed and capability) are recommended using benchmarks and personal task sets to evaluate model suitability, with Qwen3.6 and North Mini Code performing well.
  • Harness comparison shows Claude Code uses more tokens than Codex, but all harnesses can be used with local models; Codex may offer better performance in some cases.
  • Advanced setups include running the model on a separate machine (e.g., DGX Spark) and accessing it via SSH tunnels, or using cloud-hosted open-weight models as an alternative.
  • Conclusion: Local open-weight models (30-35B range) are capable and sufficient for many tasks, with harness choice based on personal preference and muscle memory.