Introducing Gemma 3n
10 months ago
- #Gemma 3n
- #AI
- #on-device AI
- Gemma 3n is now fully released, building on the success of the Gemma family, which has surpassed 160 million collective downloads.
- Gemma 3n features a mobile-first architecture, supported by tools like Hugging Face Transformers, llama.cpp, and Google AI Edge.
- The model introduces the MatFormer (Matryoshka Transformer) architecture, in which smaller, fully functional models are nested inside larger ones, enabling elastic inference.
- Developers can use the pre-extracted models (the full E4B or the nested E2B) or assemble custom-sized models between the two using the Mix-n-Match technique.
- Per-Layer Embeddings (PLE) improve model quality while allowing a large share of parameters to be loaded and computed on the CPU, keeping the high-speed accelerator-memory footprint small.
- KV Cache Sharing accelerates processing of long inputs, delivering a 2x improvement in prefill performance.
- Gemma 3n includes an advanced audio encoder based on the Universal Speech Model (USM), which generates a token for every 160ms of audio for granular representation.
- A new vision encoder, MobileNet-V5-300M, offers state-of-the-art performance for multimodal tasks on edge devices.
- Gemma 3n is supported by a broad ecosystem of tools and platforms, including contributions from AMD, Hugging Face, and NVIDIA.
- The Gemma 3n Impact Challenge invites developers to build impactful products with $150,000 in prizes.
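The Mix-n-Match idea above can be sketched in a few lines. This is a toy illustration only, not the real Gemma 3n configuration or API: the layer sizes, class names, and the "slice a fraction of each feed-forward block" rule are assumptions made for the example, standing in for how a MatFormer lets you carve a custom-sized sub-model out of the full one on a per-layer basis.

```python
# Toy sketch of MatFormer-style Mix-n-Match (illustrative; names and sizes
# are assumptions, not the real Gemma 3n configuration).
# In a MatFormer, smaller feed-forward blocks are nested inside larger ones,
# so a custom-sized model can be sliced out by choosing, per layer, how much
# of each feed-forward block to keep.

from dataclasses import dataclass

@dataclass
class LayerSpec:
    d_model: int    # hidden width (shared by every nested slice)
    d_ff_full: int  # feed-forward width of the largest (E4B-like) model

def mix_n_match(layers: list[LayerSpec], keep_fractions: list[float]) -> list[int]:
    """Pick a per-layer feed-forward width between the smallest and largest
    nested sub-models: 1.0 keeps the full layer, 0.5 keeps the nested half."""
    assert len(layers) == len(keep_fractions)
    return [int(spec.d_ff_full * f) for spec, f in zip(layers, keep_fractions)]

def ff_param_count(layers: list[LayerSpec], d_ff_per_layer: list[int]) -> int:
    """Rough feed-forward parameter count: two projections per layer
    (d_model x d_ff up-projection and d_ff x d_model down-projection)."""
    return sum(2 * spec.d_model * d_ff
               for spec, d_ff in zip(layers, d_ff_per_layer))

# Example: a 4-layer toy model keeping full width in the middle layers and
# half width at the ends -- one point on the size/quality trade-off curve.
layers = [LayerSpec(d_model=512, d_ff_full=2048)] * 4
custom = mix_n_match(layers, [0.5, 1.0, 1.0, 0.5])
print(custom)                          # per-layer feed-forward widths
print(ff_param_count(layers, custom))  # parameters of the custom slice
```

Each fraction vector yields a different point between E2B-scale and E4B-scale, which is the knob Mix-n-Match exposes to developers.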
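A back-of-envelope calculation shows why Per-Layer Embeddings matter for memory. The raw-versus-resident parameter figures below (E2B: roughly 5B raw, 2B resident; E4B: roughly 8B raw, 4B resident) follow the announcement, but the 2-bytes-per-parameter figure (e.g. bf16) is an assumption for the byte math.

```python
# Why PLE shrinks the accelerator-memory footprint: per-layer embedding
# parameters can live in (and be computed on) ordinary CPU memory, so only
# the core transformer weights must sit in fast accelerator memory.
# Parameter figures follow the Gemma 3n announcement; the 2-bytes-per-param
# precision (e.g. bf16) is an assumption for illustration.

def accelerator_gb(resident_params_b: float, bytes_per_param: int = 2) -> float:
    """Approximate accelerator memory (GB) for weights that must stay in
    fast memory, excluding activations and the KV cache."""
    return resident_params_b * 1e9 * bytes_per_param / 1e9

models = {
    "E2B": {"raw_b": 5.0, "resident_b": 2.0},
    "E4B": {"raw_b": 8.0, "resident_b": 4.0},
}

for name, m in models.items():
    offloaded = m["raw_b"] - m["resident_b"]
    print(f"{name}: {m['raw_b']}B raw params, {m['resident_b']}B resident "
          f"(~{accelerator_gb(m['resident_b']):.0f} GB at 2 B/param), "
          f"{offloaded:.0f}B offloadable via PLE")
```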