Hasty Briefsbeta

Bilingual

April 2026 TLDR Setup for Ollama and Gemma 4 26B on a Mac mini

4 hours ago
  • #macOS AI Deployment
  • #Gemma 4 26B Setup
  • #Ollama Installation
  • System requirements: Mac mini with Apple Silicon (M1-M5) and at least 24GB unified memory for running Gemma 4 26B.
  • Install Ollama via Homebrew cask, which includes auto-updates and MLX backend support for GPU acceleration on Apple Silicon.
  • Pull and run the Gemma 4 26B model (~17GB download) and verify GPU usage with commands like 'ollama ps'.
  • Set up auto-launch via Login Items and create a launch agent to preload the model and keep it warm in memory using interval prompts.
  • Adjust Ollama's environment variable OLLAMA_KEEP_ALIVE to '-1' to prevent model unloading and persist settings across reboots.
  • Use Ollama's local API at http://localhost:11434 for OpenAI-compatible chat completions with coding agents.
  • Key commands include listing models, running interactive chats, stopping/unloading models, updating, and deleting models.
  • Ollama leverages MLX for acceleration and NVFP4 format for efficiency, with features like cache reuse and intelligent checkpoints to optimize performance.
  • Memory considerations: Gemma 4 26B uses ~20GB loaded; on a 24GB Mac mini, close memory-heavy apps to ensure system stability.
  • Refer to resources like Ollama's blog post, v0.20.0 release notes, and Google DeepMind's Gemma 4 announcement for updates.