Qwen-Image-Layered: a transparency- and layer-aware open diffusion model
2 days ago
- #image editing
- #diffusion models
- #image decomposition
- Qwen-Image-Layered decomposes images into semantically disentangled RGBA layers using a diffusion model.
- Each layer can be edited independently, so a change to one layer leaves the rest of the image untouched and the edit stays consistent.
- Addresses a problem that recent visual generative models face in image editing: keeping the unedited content consistent.
- Introduces three key components: an RGBA-VAE, the VLD-MMDiT architecture, and a multi-stage training strategy.
- Because multilayer training data is scarce, builds a pipeline to extract and annotate multilayer images from Photoshop documents (PSD).
- Reports superior decomposition quality in its experiments and presents layer decomposition as a new paradigm for consistent image editing.
- Code and models are released on GitHub.
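
To make the idea of independently editable RGBA layers concrete, here is a minimal sketch of how a stack of layers recombines into a flat image using the standard "over" alpha-compositing operator. This is not the model's code or API: the names `Pixel`, `Layer`, `over`, and `composite` are hypothetical, pixels are assumed to be straight-alpha (non-premultiplied) RGBA tuples in [0, 1], and the tiny two-pixel image is made up for illustration.

```python
from functools import reduce

# Assumption: straight-alpha RGBA, each channel a float in [0, 1].
Pixel = tuple[float, float, float, float]
Layer = list[list[Pixel]]  # rows of pixels


def over(top: Pixel, bottom: Pixel) -> Pixel:
    """Porter-Duff 'over': place `top` on `bottom` for straight-alpha pixels."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        # Both pixels fully transparent: color is undefined, return clear.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t: float, b: float) -> float:
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def composite(layers: list[Layer]) -> Layer:
    """Flatten same-sized layers, listed bottom-to-top, into one image."""
    def stack(bottom: Layer, top: Layer) -> Layer:
        return [
            [over(t, b) for t, b in zip(top_row, bottom_row)]
            for top_row, bottom_row in zip(top, bottom)
        ]

    return reduce(stack, layers)


# A 1x2 image: an opaque blue background, and a foreground layer that is
# opaque red on the left pixel and fully transparent on the right pixel.
background: Layer = [[(0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0)]]
foreground: Layer = [[(1.0, 0.0, 0.0, 1.0), (0.0, 0.0, 0.0, 0.0)]]
original = composite([background, foreground])

# Edit only the foreground layer (recolor red to green), then recomposite.
edited_fg: Layer = [[(0.0, 1.0, 0.0, 1.0), (0.0, 0.0, 0.0, 0.0)]]
edited = composite([background, edited_fg])
```

In this sketch the left pixel changes from red to green while the right pixel, which shows only the background, is identical before and after the edit. That is the property the layered representation is meant to provide: an edit is confined to the layer it touches.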