Wan – Open-source alternative to VEO 3
7 days ago
- #MoE-architecture
- #AI-models
- #video-generation
- Wan2.2 introduces a Mixture-of-Experts (MoE) architecture into video diffusion models, increasing total model capacity while keeping the computational cost of inference roughly unchanged.
- The model incorporates meticulously curated aesthetic data for precise and controllable cinematic style generation.
- Wan2.2 is trained on significantly more data than its predecessor, Wan2.1, improving generalization across motions, semantics, and aesthetics.
- Its efficient high-definition hybrid text-image-to-video (TI2V) model supports both text-to-video (T2V) and image-to-video (I2V) generation at 720P resolution and 24 fps.
- That model can run on consumer-grade graphics cards such as the RTX 4090, making it accessible for both industrial and academic use.
- Wan2.2 has been integrated into ComfyUI and Diffusers, with inference code and model weights released for public use.
- The repository provides detailed instructions for single-GPU and multi-GPU inference, including options for reducing GPU memory usage.
- Prompt extension is available to enrich the detail of generated videos, using either the Dashscope API or a local model such as Qwen.
- The authors report that, in their own benchmark comparisons, Wan2.2 outperforms leading closed-source models along multiple evaluation dimensions.
- The models are licensed under Apache 2.0, with no rights claimed over generated content, but users must comply with legal and ethical guidelines.
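
To make the MoE point above concrete, here is a minimal sketch of how a diffusion sampler could route each denoising step to one of several experts, so that total capacity grows while only one expert's worth of compute runs per step. It assumes a two-expert split by noise level. The `Expert` class, the `boundary` threshold, the parameter counts, and the placeholder `denoise` arithmetic are all hypothetical and are not taken from the Wan2.2 code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Expert:
    """Stand-in for one denoising network (hypothetical, not Wan2.2 code)."""
    name: str
    params_billion: float  # illustrative size, assumed for this sketch
    calls: int = 0

    def denoise(self, latent: List[float], t: float) -> List[float]:
        # Placeholder for a real forward pass: shrink values slightly.
        self.calls += 1
        return [x * (1.0 - 0.1 * t) for x in latent]


def pick_expert(t: float, boundary: float, high: Expert, low: Expert) -> Expert:
    """Route by noise level: noisy early steps go to `high`, later ones to `low`."""
    return high if t >= boundary else low


def sample(latent: List[float], num_steps: int, boundary: float,
           high: Expert, low: Expert) -> List[float]:
    for i in range(num_steps):
        t = 1.0 - i / num_steps  # noise level falls from 1.0 toward 0
        expert = pick_expert(t, boundary, high, low)
        latent = expert.denoise(latent, t)  # exactly one expert runs per step
    return latent


if __name__ == "__main__":
    high = Expert("high_noise", 14.0)
    low = Expert("low_noise", 14.0)
    sample([1.0, 2.0], num_steps=10, boundary=0.45, high=high, low=low)
    print("calls per expert:", high.calls, low.calls)
    print("total params (B):", high.params_billion + low.params_billion)
    print("active params per step (B):", high.params_billion)
```

In this sketch the number of forward passes equals the number of sampling steps regardless of how many experts exist, which is the sense in which capacity can grow without a matching rise in per-step compute. Holding all experts in memory is a separate cost that this toy example does not model.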