Using Local Coding Agents – By Sebastian Raschka, PhD

15 hours ago

#open-weight-llms
#coding-harnesses
#local-coding-agents

A tutorial on setting up a local coding agent using open-source tools and open-weight LLMs as an alternative to proprietary services like Claude Code and Codex.
Local setups offer transparency, inspectability, fixed costs (aside from hardware and electricity), and full user control, with the ability to modify the coding harness.
Key components: a locally served LLM (e.g., Qwen3.6 35B-A3B or North Mini Code) as the reasoning engine, and a coding harness (e.g., Qwen-Code, Codex, Claude Code) providing the operating environment.
Setup involves using Ollama for efficient local model serving, with steps to download models, configure endpoints, and connect to coding harnesses like Qwen-Code via custom providers.
Security considerations include auditing the agent codebase for risks like data egress, file permissions, and prompt injection, and mitigating them with settings (e.g., disabling telemetry).
Performance assessments (speed and capability) are recommended using benchmarks and personal task sets to evaluate model suitability, with Qwen3.6 and North Mini Code performing well.
Harness comparison shows Claude Code uses more tokens than Codex, but all harnesses can be used with local models; Codex may offer better performance in some cases.
Advanced setups include running the model on a separate machine (e.g., DGX Spark) and accessing it via SSH tunnels, or using cloud-hosted open-weight models as an alternative.
Conclusion: Local open-weight models (30-35B range) are capable and sufficient for many tasks, with harness choice based on personal preference and muscle memory.

Hasty Briefsbeta

Using Local Coding Agents – By Sebastian Raschka, PhD