Hasty Briefs

Ollama is now powered by MLX on Apple Silicon in preview

5 hours ago
  • #AI Acceleration
  • #Apple Silicon
  • #Machine Learning Framework
  • Ollama previews faster performance on Apple Silicon using Apple's MLX machine learning framework, improving speed for demanding tasks on macOS.
  • MLX leverages the unified memory architecture and the GPU Neural Accelerators on M5 chips to cut time to first token and raise tokens per second.
  • NVIDIA's NVFP4 format is integrated to maintain model accuracy while reducing memory bandwidth and storage needs.
  • An enhanced caching system reuses the cache across conversations, lowers memory use, and adds intelligent checkpoints and smarter eviction.
  • The preview release focuses on the Qwen3.5-35B-A3B model for coding tasks and requires a Mac with more than 32GB of unified memory; support for more models is planned.