April 2026 TLDR Setup for Ollama and Gemma 4 26B on a Mac mini
- #macOS AI Deployment
- #Gemma 4 26B Setup
- #Ollama Installation
- System requirements: Mac mini with Apple Silicon (M1-M5) and at least 24GB unified memory for running Gemma 4 26B.
- Install Ollama via Homebrew cask, which includes auto-updates and MLX backend support for GPU acceleration on Apple Silicon.
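The install step above can be sketched as two commands (this assumes Homebrew is already set up; `ollama` is the current cask name):

```shell
# Install the Ollama app and CLI via Homebrew cask
# (the cask bundles the menu-bar app and auto-updates)
brew install --cask ollama

# Sanity-check that the CLI is on PATH
ollama --version
```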
- Pull and run the Gemma 4 26B model (~17GB download) and verify GPU usage with `ollama ps`.
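The pull/run/verify sequence might look like this (the `gemma4:26b` tag is an assumption; check the Ollama model library for the exact name):

```shell
# Download the weights (~17GB)
ollama pull gemma4:26b

# Start an interactive chat
ollama run gemma4:26b

# From another terminal: confirm the model is loaded and running on the GPU
ollama ps
```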
- Set up auto-launch via Login Items and create a launch agent to preload the model and keep it warm in memory using interval prompts.
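A keep-warm launch agent could be sketched as a per-user launchd plist (the label, binary path, model tag, and 4-minute interval are all assumptions), saved under `~/Library/LaunchAgents/` and activated with `launchctl load`:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.example.ollama-warm</string>
  <!-- Send a trivial prompt so the model stays resident -->
  <key>ProgramArguments</key>
  <array>
    <string>/opt/homebrew/bin/ollama</string>
    <string>run</string>
    <string>gemma4:26b</string>
    <string>ping</string>
  </array>
  <!-- Re-run every 240 seconds, and fire once at login -->
  <key>StartInterval</key>
  <integer>240</integer>
  <key>RunAtLoad</key>
  <true/>
</dict>
</plist>
```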
- Set Ollama's OLLAMA_KEEP_ALIVE environment variable to -1 so loaded models are never unloaded, and persist the setting across reboots.
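The keep-alive setting can be applied with `launchctl` (a value of -1 disables Ollama's idle unload timer):

```shell
# Tell the Ollama server never to unload idle models
launchctl setenv OLLAMA_KEEP_ALIVE -1

# launchctl setenv does not survive a reboot; for persistence, set the
# variable in the same launch agent that starts the server, e.g.:
#   OLLAMA_KEEP_ALIVE=-1 ollama serve
```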
- Use Ollama's local API at http://localhost:11434 for OpenAI-compatible chat completions with coding agents.
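A minimal request against the OpenAI-compatible endpoint, assuming the server is running locally (the model tag and prompt are placeholders):

```shell
# POST a chat completion to the local Ollama server
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gemma4:26b",
        "messages": [{"role": "user", "content": "Say hello in five words."}],
        "stream": false
      }'
```

Coding agents that speak the OpenAI API can usually be pointed at the base URL `http://localhost:11434/v1` with any placeholder API key.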
- Key commands include listing models, running interactive chats, stopping/unloading models, updating, and deleting models.
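The day-to-day commands summarized above, in one place (model tag assumed):

```shell
ollama list                  # list installed models and their sizes
ollama run gemma4:26b        # interactive chat (Ctrl-D exits)
ollama stop gemma4:26b       # unload the model from memory
ollama rm gemma4:26b         # delete the downloaded weights
brew upgrade --cask ollama   # update Ollama itself
```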
- Ollama leverages MLX for acceleration and NVFP4 format for efficiency, with features like cache reuse and intelligent checkpoints to optimize performance.
- Memory considerations: Gemma 4 26B occupies ~20GB once loaded; on a 24GB Mac mini, close other memory-heavy apps to keep the system stable.
- Refer to resources like Ollama's blog post, v0.20.0 release notes, and Google DeepMind's Gemma 4 announcement for updates.