Hasty Briefs (beta)

Stop Using Ollama

7 hours ago
  • #Local LLMs
  • #Ollama Controversy
  • #Open Source Ethics
  • Ollama started as an easy wrapper around llama.cpp that made local LLMs accessible, but it later obscured its reliance on the underlying project.
  • For over a year the project failed to credit llama.cpp, violating the MIT license's requirement to reproduce the copyright notice.
  • Ollama forked llama.cpp and swapped in a custom backend that introduced bugs and regressed performance, with benchmarks showing significantly slower inference.
  • Misleading model naming, such as publishing distilled variants under the full model's name (e.g., DeepSeek-R1), caused confusion and reputational damage for model creators.
  • The introduction of a closed-source desktop app and a proprietary Modelfile system created vendor lock-in and added unnecessary complexity compared to llama.cpp's single-file GGUF format.
  • Ollama's registry bottleneck delays new model availability and limits quantization options, forcing users to wait or use other tools for community-quantized models.
  • A pivot to cloud-hosted models raised privacy concerns, with vulnerabilities like CVE-2025-51471 exposing tokens and unclear data handling by third-party providers.
  • Venture capital incentives drove decisions toward monetization, lock-in, and reduced transparency, straying from the local-first mission.
  • Alternatives such as llama.cpp (with its bundled llama-server), LM Studio, Jan, and llama-swap offer better performance, openness, and ease of use without Ollama's drawbacks.
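
The llama.cpp alternative mentioned above can be sketched in two commands: `llama-server` ships with llama.cpp and exposes an OpenAI-compatible HTTP API over any local GGUF file. The model filename and port below are illustrative assumptions, not values from the article.

```shell
# Serve a local GGUF file with llama.cpp's built-in API server
# (the model path is a placeholder; use any GGUF you have downloaded).
llama-server -m ./models/qwen2.5-7b-instruct-q4_k_m.gguf --port 8080

# Query it with the standard OpenAI-style chat completions endpoint:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'
```

Because the server speaks the OpenAI API, existing clients and SDKs can usually point at it by changing only the base URL, with no Modelfile or registry step involved.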