Introducing Gemma 3n
10 months ago
- #Gemma 3n
- #AI
- #on-device AI
- Gemma 3n is now fully released, building on the success of the Gemma family, which has surpassed 160 million collective downloads.
- Gemma 3n features a mobile-first architecture, supported by tools like Hugging Face Transformers, llama.cpp, and Google AI Edge.
- The model introduces the MatFormer (Matryoshka Transformer) architecture, in which smaller, fully functional models are nested inside larger ones, enabling elastic inference.
- Developers can use the pre-extracted models (the full E4B or the nested E2B) or assemble custom-sized models between the two using the Mix-n-Match technique.
- Per-Layer Embeddings (PLE) improve model quality while allowing a large share of parameters to be loaded and computed on the CPU, keeping the high-speed accelerator-memory footprint small.
- KV Cache Sharing accelerates processing of long inputs, delivering a 2x improvement in prefill performance.
- Gemma 3n includes an advanced audio encoder based on the Universal Speech Model (USM), which generates a token for every 160ms of audio for granular representation.
- A new vision encoder, MobileNet-V5-300M, offers state-of-the-art performance for multimodal tasks on edge devices.
- Gemma 3n is supported by a broad ecosystem of tools and platforms, including contributions from AMD, Hugging Face, and NVIDIA.
- The Gemma 3n Impact Challenge invites developers to build impactful products with $150,000 in prizes.
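The Mix-n-Match idea above can be sketched in a few lines. This is a toy illustration only, not the real Gemma 3n configuration or API: the layer sizes, class names, and the "slice a fraction of each feed-forward block" rule are assumptions made for the example, standing in for how a MatFormer lets you carve a custom-sized sub-model out of the full one on a per-layer basis.

```python
# Toy sketch of MatFormer-style Mix-n-Match (illustrative; names and sizes
# are assumptions, not the real Gemma 3n configuration).
# In a MatFormer, smaller feed-forward blocks are nested inside larger ones,
# so a custom-sized model can be sliced out by choosing, per layer, how much
# of each feed-forward block to keep.

from dataclasses import dataclass

@dataclass
class LayerSpec:
    d_model: int    # hidden width (shared by every nested slice)
    d_ff_full: int  # feed-forward width of the largest (E4B-like) model

def mix_n_match(layers: list[LayerSpec], keep_fractions: list[float]) -> list[int]:
    """Pick a per-layer feed-forward width between the smallest and largest
    nested sub-models: 1.0 keeps the full layer, 0.5 keeps the nested half."""
    assert len(layers) == len(keep_fractions)
    return [int(spec.d_ff_full * f) for spec, f in zip(layers, keep_fractions)]

def ff_param_count(layers: list[LayerSpec], d_ff_per_layer: list[int]) -> int:
    """Rough feed-forward parameter count: two projections per layer
    (d_model x d_ff up-projection and d_ff x d_model down-projection)."""
    return sum(2 * spec.d_model * d_ff
               for spec, d_ff in zip(layers, d_ff_per_layer))

# Example: a 4-layer toy model keeping full width in the middle layers and
# half width at the ends -- one point on the size/quality trade-off curve.
layers = [LayerSpec(d_model=512, d_ff_full=2048)] * 4
custom = mix_n_match(layers, [0.5, 1.0, 1.0, 0.5])
print(custom)                          # per-layer feed-forward widths
print(ff_param_count(layers, custom))  # parameters of the custom slice
```

Each fraction vector yields a different point between E2B-scale and E4B-scale, which is the knob Mix-n-Match exposes to developers.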
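A back-of-envelope calculation shows why Per-Layer Embeddings matter for memory. The raw-versus-resident parameter figures below (E2B: roughly 5B raw, 2B resident; E4B: roughly 8B raw, 4B resident) follow the announcement, but the 2-bytes-per-parameter figure (e.g. bf16) is an assumption for the byte math.

```python
# Why PLE shrinks the accelerator-memory footprint: per-layer embedding
# parameters can live in (and be computed on) ordinary CPU memory, so only
# the core transformer weights must sit in fast accelerator memory.
# Parameter figures follow the Gemma 3n announcement; the 2-bytes-per-param
# precision (e.g. bf16) is an assumption for illustration.

def accelerator_gb(resident_params_b: float, bytes_per_param: int = 2) -> float:
    """Approximate accelerator memory (GB) for weights that must stay in
    fast memory, excluding activations and the KV cache."""
    return resident_params_b * 1e9 * bytes_per_param / 1e9

models = {
    "E2B": {"raw_b": 5.0, "resident_b": 2.0},
    "E4B": {"raw_b": 8.0, "resident_b": 4.0},
}

for name, m in models.items():
    offloaded = m["raw_b"] - m["resident_b"]
    print(f"{name}: {m['raw_b']}B raw params, {m['resident_b']}B resident "
          f"(~{accelerator_gb(m['resident_b']):.0f} GB at 2 B/param), "
          f"{offloaded:.0f}B offloadable via PLE")
```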