Continuous Autoregressive Language Models
- #natural-language-processing
- #language-models
- #machine-learning
- Introduces Continuous Autoregressive Language Models (CALM) for more efficient language generation.
- Shifts from discrete next-token prediction to continuous next-vector prediction.
- Uses a high-fidelity autoencoder to compress each chunk of K tokens into a single continuous vector (first sketch below).
- Reduces the number of generative steps by a factor of K, improving computational efficiency (second sketch below).
- Develops a likelihood-free framework for training, evaluation, and controllable sampling (third sketch below).
- Demonstrates a significantly better performance-compute trade-off than discrete baselines.
- Establishes next-vector prediction as a scalable pathway for ultra-efficient language models.
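A minimal sketch of the chunk-autoencoder idea, assuming a simple MLP encoder/decoder over concatenated token embeddings; the class name `ChunkAutoencoder` and all hyperparameters are illustrative, not the paper's architecture:

```python
import torch
import torch.nn as nn

class ChunkAutoencoder(nn.Module):
    """Compress a chunk of K token ids into one continuous vector and back."""
    def __init__(self, vocab_size: int, k: int, d_model: int = 256, d_latent: int = 128):
        super().__init__()
        self.k, self.vocab_size = k, vocab_size
        self.embed = nn.Embedding(vocab_size, d_model)
        # Encoder: concatenate K token embeddings, project to one latent vector.
        self.encoder = nn.Sequential(
            nn.Linear(k * d_model, d_model), nn.GELU(), nn.Linear(d_model, d_latent)
        )
        # Decoder: expand the latent back into K sets of per-token logits.
        self.decoder = nn.Sequential(
            nn.Linear(d_latent, d_model), nn.GELU(), nn.Linear(d_model, k * vocab_size)
        )

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, K) -> latent: (batch, d_latent)
        return self.encoder(self.embed(token_ids).flatten(1))

    def decode(self, latent: torch.Tensor) -> torch.Tensor:
        # latent: (batch, d_latent) -> logits: (batch, K, vocab_size)
        return self.decoder(latent).view(-1, self.k, self.vocab_size)

model = ChunkAutoencoder(vocab_size=32000, k=4)
chunk = torch.randint(0, 32000, (2, 4))        # two chunks of K=4 tokens
logits = model.decode(model.encode(chunk))     # reconstruction logits
recon_loss = nn.functional.cross_entropy(logits.transpose(1, 2), chunk)
```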
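To see where the K-fold step reduction comes from, here is a hypothetical generation loop: each autoregressive step predicts one continuous vector, which the autoencoder decoder expands into K tokens. `predict_next_vector` and `decode_chunk` are stand-ins for the trained model and decoder, not real APIs:

```python
import torch

def generate(predict_next_vector, decode_chunk, n_tokens: int, k: int, d: int):
    # Autoregress over vectors: n_tokens // k forward passes instead of n_tokens.
    history = torch.zeros(0, d)
    tokens: list[int] = []
    for _ in range(n_tokens // k):
        v = predict_next_vector(history)            # one continuous vector per step
        tokens.extend(decode_chunk(v))              # decoder turns it into K tokens
        history = torch.cat([history, v.unsqueeze(0)])
    return tokens

# Toy stubs just to make the loop runnable end to end.
d, k = 8, 4
toy_predict = lambda h: torch.randn(d)
toy_decode = lambda v: torch.randint(0, 100, (k,)).tolist()
out = generate(toy_predict, toy_decode, n_tokens=16, k=k, d=d)
assert len(out) == 16   # 4 generative steps produced 16 tokens
```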
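The bullets do not pin down the likelihood-free objective, so the sketch below assumes the energy score, a strictly proper scoring rule that needs only samples and distances rather than explicit densities, which makes it a natural fit for a continuous, implicit generative head:

```python
import torch

def energy_score_loss(samples: torch.Tensor, target: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    # samples: (m, d) draws from the model's predictive distribution for one step.
    # target:  (d,)  the ground-truth next vector from the autoencoder.
    # Negated energy score: no likelihood is ever evaluated, only distances.
    m = samples.shape[0]
    fidelity = (samples - target).norm(dim=-1).pow(beta).mean()
    pairwise = torch.cdist(samples, samples).pow(beta)      # diagonal is zero
    diversity = pairwise.sum() / (m * (m - 1))
    return fidelity - 0.5 * diversity

samples = torch.randn(8, 128)   # m=8 samples from a stochastic generative head
target = torch.randn(128)
loss = energy_score_loss(samples, target)
```

Minimizing this loss rewards samples that land close to the target while staying spread out among themselves, so the head learns a full predictive distribution without ever computing a density.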