Gemini Diffusion

a year ago

Gemini Diffusion is Google's first LLM using diffusion instead of transformers.
Diffusion models refine noise step-by-step, enabling faster and more coherent outputs.
Gemini Diffusion excels in tasks like editing, math, and code due to its iterative process.
The model boasts speeds of 857 tokens/second, generating interactive HTML+JavaScript quickly.
Performance is comparable to Cerebras Coder running Llama3.1-70b at 2,000 tokens/second.
Google claims it matches Gemini 2.0 Flash-Lite's performance at 5x the speed.
Prior to this, Inception Mercury was the only commercial-grade diffusion model available.

Hasty Briefsbeta