Hasty Briefs (beta)


I asked my local LLM to add 23 numbers and got seven wrong answers

6 hours ago
  • #local AI setup
  • #code execution
  • #LLM limitations
  • The author attempted to use a local LLM to sum 23 stock transactions and encountered repeated failures.
  • Smaller models omitted data and computed incorrectly, while larger models still got the arithmetic wrong because they predict tokens rather than calculate.
  • Tool-calling setups such as Open Interpreter failed because of format mismatches and never actually executed the code.
  • Enabling a harness with code execution (Open WebUI's Code Interpreter) finally produced the correct answer, given clear prompts.
  • Success requires four layers: model, inference engine, orchestrator, and harness, with the harness supplying reliability through tools such as code execution.
  • Local LLMs need a proper harness with code execution for computational tasks; a chat interface alone is insufficient.
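The harness layer described above can be sketched as a small loop that extracts code from the model's reply and hands it to a real interpreter, so the arithmetic comes from execution rather than token prediction. This is a minimal illustration only: the `<execute>` tag format, the `run_tool_call` helper, and the sample transaction amounts are assumptions for the sketch, not Open WebUI's or Open Interpreter's actual protocol.

```python
import re

def run_tool_call(model_reply: str) -> float:
    """Hypothetical harness step: find the code the model asked to run,
    execute it, and return the computed total. The interpreter does the
    arithmetic, not the model's next-token prediction."""
    match = re.search(r"<execute>(.*?)</execute>", model_reply, re.DOTALL)
    if match is None:
        # This is the failure mode the summary mentions: a format
        # mismatch means no code ever runs.
        raise ValueError("reply contained no <execute> block to run")
    namespace: dict = {}
    exec(match.group(1), namespace)  # no sandboxing in this sketch
    return namespace["total"]

# Simulated model reply: rather than emitting digits, the model emits code.
reply = """I'll compute the sum with code:
<execute>
transactions = [120.50, -34.20, 87.00, 15.75, -9.30]
total = round(sum(transactions), 2)
</execute>
"""

print(run_tool_call(reply))  # 179.75
```

The point of the sketch is the division of labor: the model only has to produce well-formed code in the agreed format, and the harness guarantees the number that comes back was actually computed.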