Microsoft researchers developed a hyper-efficient AI model that can run on CPUs
- #Microsoft
- #AI
- #MachineLearning
- Microsoft researchers developed BitNet b1.58 2B4T, the largest-scale 1-bit AI model to date.
- BitNet b1.58 2B4T is openly available under an MIT license and can run on CPUs, including Apple’s M2.
- Bitnets quantize weights to just three values (-1, 0, 1), making them far more memory- and compute-efficient than full-precision models.
- BitNet b1.58 2B4T has 2 billion parameters and was trained on 4 trillion tokens (~33 million books).
- The model outperforms similar-sized models like Meta’s Llama 3.2 1B, Google’s Gemma 3 1B, and Alibaba’s Qwen 2.5 1.5B on benchmarks.
- BitNet b1.58 2B4T is faster and uses less memory than other models of its size.
- Achieving this performance requires Microsoft's custom inference framework, bitnet.cpp, which currently lacks GPU support.
- Bitnets show promise for resource-constrained devices but face compatibility challenges.
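The three-value weight scheme mentioned above can be sketched in a few lines. This is a minimal illustration of "absmean"-style ternary quantization (scale by the mean absolute weight, then round to -1, 0, or 1), not Microsoft's actual bitnet.cpp kernels; the function name and epsilon are my own choices:

```python
import numpy as np

def ternary_quantize(w, eps=1e-8):
    # Scale weights by their mean absolute value,
    # then round each to the nearest of {-1, 0, 1}.
    gamma = np.mean(np.abs(w)) + eps
    q = np.clip(np.round(w / gamma), -1, 1)
    return q.astype(np.int8), gamma

w = np.array([0.9, -0.05, -1.3, 0.4])
q, gamma = ternary_quantize(w)
# q contains only -1, 0, and 1; gamma is the per-tensor scale
```

Storing each weight as one of three values (about 1.58 bits of information, hence "b1.58") is what lets these models fit in far less memory and run on ordinary CPUs.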