April 2026 TLDR Setup for Ollama and Gemma 4 26B on a Mac mini
- #macOS AI Deployment
- #Gemma 4 26B Setup
- #Ollama Installation
- System requirements: Mac mini with Apple Silicon (M1-M5) and at least 24GB unified memory for running Gemma 4 26B.
- Install Ollama via Homebrew cask, which includes auto-updates and MLX backend support for GPU acceleration on Apple Silicon.
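The install step above can be sketched as two commands (this assumes Homebrew is already set up; `ollama` is the current cask name):

```shell
# Install the Ollama app and CLI via Homebrew cask
# (the cask bundles the menu-bar app and auto-updates)
brew install --cask ollama

# Sanity-check that the CLI is on PATH
ollama --version
```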
- Pull and run the Gemma 4 26B model (~17GB download) and verify GPU usage with `ollama ps`.
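The pull/run/verify sequence might look like this (the `gemma4:26b` tag is an assumption; check the Ollama model library for the exact name):

```shell
# Download the weights (~17GB)
ollama pull gemma4:26b

# Start an interactive chat
ollama run gemma4:26b

# From another terminal: confirm the model is loaded and running on the GPU
ollama ps
```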
- Set up auto-launch via Login Items and create a launch agent to preload the model and keep it warm in memory using interval prompts.
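A keep-warm launch agent could be sketched as a per-user launchd plist (the label, binary path, model tag, and 4-minute interval are all assumptions), saved under `~/Library/LaunchAgents/` and activated with `launchctl load`:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.example.ollama-warm</string>
  <!-- Send a trivial prompt so the model stays resident -->
  <key>ProgramArguments</key>
  <array>
    <string>/opt/homebrew/bin/ollama</string>
    <string>run</string>
    <string>gemma4:26b</string>
    <string>ping</string>
  </array>
  <!-- Re-run every 240 seconds, and fire once at login -->
  <key>StartInterval</key>
  <integer>240</integer>
  <key>RunAtLoad</key>
  <true/>
</dict>
</plist>
```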
- Set Ollama's OLLAMA_KEEP_ALIVE environment variable to -1 so loaded models are never unloaded, and persist the setting across reboots.
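The keep-alive setting can be applied with `launchctl` (a value of -1 disables Ollama's idle unload timer):

```shell
# Tell the Ollama server never to unload idle models
launchctl setenv OLLAMA_KEEP_ALIVE -1

# launchctl setenv does not survive a reboot; for persistence, set the
# variable in the same launch agent that starts the server, e.g.:
#   OLLAMA_KEEP_ALIVE=-1 ollama serve
```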
- Use Ollama's local API at http://localhost:11434 for OpenAI-compatible chat completions with coding agents.
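A minimal request against the OpenAI-compatible endpoint, assuming the server is running locally (the model tag and prompt are placeholders):

```shell
# POST a chat completion to the local Ollama server
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gemma4:26b",
        "messages": [{"role": "user", "content": "Say hello in five words."}],
        "stream": false
      }'
```

Coding agents that speak the OpenAI API can usually be pointed at the base URL `http://localhost:11434/v1` with any placeholder API key.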
- Key commands include listing models, running interactive chats, stopping/unloading models, updating, and deleting models.
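The day-to-day commands summarized above, in one place (model tag assumed):

```shell
ollama list                  # list installed models and their sizes
ollama run gemma4:26b        # interactive chat (Ctrl-D exits)
ollama stop gemma4:26b       # unload the model from memory
ollama rm gemma4:26b         # delete the downloaded weights
brew upgrade --cask ollama   # update Ollama itself
```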
- Ollama leverages MLX for acceleration and NVFP4 format for efficiency, with features like cache reuse and intelligent checkpoints to optimize performance.
- Memory considerations: Gemma 4 26B occupies ~20GB once loaded; on a 24GB Mac mini, close other memory-heavy apps to keep the system stable.
- Refer to resources like Ollama's blog post, v0.20.0 release notes, and Google DeepMind's Gemma 4 announcement for updates.