Hasty Briefs (beta)

Continuously Augmented Discrete Diffusion Model

17 hours ago
  • #generative modeling
  • #diffusion models
  • #machine learning
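  • Standard discrete diffusion models treat all unobserved states identically by mapping them to a single absorbing [MASK] token. This creates an 'information void': semantic information that could be inferred from the unmasked tokens is lost between denoising steps.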
  • Continuously Augmented Discrete Diffusion (CADD) introduces a paired diffusion in a continuous latent space to augment the discrete state space.
  • CADD represents masked tokens with noisy yet informative latent vectors instead of collapsed 'information voids'.
  • The continuous latent in CADD serves as a semantic hint to guide discrete denoising at each reverse step.
  • CADD allows a controlled trade-off between mode-coverage (diverse outputs) and mode-seeking (precise outputs) behaviors during sampling.
  • Empirical results show CADD improves generative quality over mask-based diffusion in text generation, image synthesis, and code modeling.
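
The idea in the points above, that a masked position carries a noisy latent rather than nothing, can be illustrated with a toy forward-corruption step. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names `corrupt` and `nearest_token`, the parameters `mask_prob` and `signal`, the Gaussian noising rule, and the two-dimensional toy embeddings are all hypothetical choices made for illustration.

```python
import math
import random

MASK = "[MASK]"


def corrupt(tokens, embed, mask_prob, signal, rng):
    """Toy forward corruption (assumed form, not taken from the paper).

    mask_prob: probability that a token is replaced by MASK.
    signal:    value in [0, 1]; share of the clean embedding kept in the
               latent of a masked position (1.0 = clean, 0.0 = pure noise).
    Returns (discrete_state, latents), one entry per input token.
    """
    discrete, latents = [], []
    for tok in tokens:
        clean = embed[tok]
        if rng.random() < mask_prob:
            # Masked: the discrete identity is hidden, but the paired latent
            # still carries a noisy copy of the token's embedding.
            noisy = [
                math.sqrt(signal) * x
                + math.sqrt(1.0 - signal) * rng.gauss(0.0, 1.0)
                for x in clean
            ]
            discrete.append(MASK)
            latents.append(noisy)
        else:
            # Unmasked: keep the token and its clean embedding.
            discrete.append(tok)
            latents.append(list(clean))
    return discrete, latents


def nearest_token(latent, embed):
    """Decode a latent to the token with the closest embedding (squared L2)."""
    return min(
        embed,
        key=lambda tok: sum((a - b) ** 2 for a, b in zip(latent, embed[tok])),
    )


# Hypothetical toy vocabulary with 2-d embeddings.
toy_embed = {"cat": [1.0, 0.0], "dog": [0.0, 1.0], "sat": [-1.0, 0.0]}
state, hints = corrupt(["cat", "sat", "dog"], toy_embed, 0.5, 0.8, random.Random(0))
print(state)
print([nearest_token(h, toy_embed) for h in hints])
```

In this sketch, a plain mask-based model would see only `state`, where every masked position looks the same. The `hints` list plays the role of the continuous latent: at high `signal` a nearest-embedding lookup usually recovers the hidden token, and as `signal` drops toward 0 the hint degrades to pure noise. A real model would condition a learned denoiser on both the discrete state and the latents instead of using a nearest-neighbour lookup.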