Apple shows how much faster the M5 runs local LLMs compared to the M4
- #Apple Silicon
- #Machine Learning
- #Performance Comparison
- Apple's M5 chip delivers significant gains over the M4 when running local LLMs, with token-generation speedups of 19-27%.
- MLX is an open-source framework by Apple for efficient machine learning on Apple silicon, supporting neural network training and inference.
- MLX LM allows developers to run Hugging Face models locally on Apple silicon Macs, including support for quantization to reduce memory usage.
- The M5's GPU introduces Neural Accelerators: dedicated hardware for matrix multiplication, the core operation in machine learning workloads.
- Apple benchmarked token generation across several models on the M4 and M5, highlighting the M5's higher memory bandwidth (153GB/s vs. the M4's 120GB/s, roughly 28% more); because token generation is memory-bandwidth-bound, this increase closely tracks the observed speedups.
- Image generation on the M5 is more than 3.8x faster than on the M4.
- MLX leverages Apple silicon's unified memory architecture, so operations can run on the CPU or GPU without copying data between separate memory pools.
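A quick back-of-envelope check of the bandwidth figures above: single-stream token generation is typically memory-bandwidth-bound, since every new token requires reading all of the model's weights, so throughput scales roughly with bandwidth. A minimal sketch (the 8B/4-bit model size is an illustrative assumption, not one of Apple's test configurations):

```python
# Bandwidth figures from Apple's comparison.
M4_BANDWIDTH_GBS = 120
M5_BANDWIDTH_GBS = 153

# Best-case generation speedup is the ratio of memory bandwidths.
speedup = M5_BANDWIDTH_GBS / M4_BANDWIDTH_GBS
print(f"Bandwidth-limited speedup: {speedup:.3f}x")

# Assumed example: a 4-bit, 8B-parameter model reads ~4 GB of weights
# per generated token, bounding tokens/second on each chip.
weights_gb = 8e9 * 4 / 8 / 1e9  # = 4.0 GB
for name, bw in [("M4", M4_BANDWIDTH_GBS), ("M5", M5_BANDWIDTH_GBS)]:
    print(f"{name}: up to ~{bw / weights_gb:.0f} tokens/s")
```

The ~1.28x bandwidth ratio lines up with the reported 19-27% token-generation gains, which is consistent with generation being bandwidth-limited rather than compute-limited.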
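To illustrate why the quantization support in MLX LM matters for running models locally, here is a hypothetical helper (not part of the MLX API) that estimates the weight footprint of a model at different precisions:

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone (excludes KV cache and activations)."""
    return num_params * bits_per_weight / 8 / 1e9

# An 8B-parameter model at full half precision vs. 4-bit quantization:
print(f"fp16:  {weight_memory_gb(8e9, 16):.1f} GB")  # 16.0 GB
print(f"4-bit: {weight_memory_gb(8e9, 4):.1f} GB")   # 4.0 GB
```

Dropping from 16-bit to 4-bit weights cuts the footprint roughly 4x, which is what makes 7B-8B models fit comfortably within the unified memory of consumer Macs.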