Hasty Briefsbeta

Bilingual

Speculative Speculative Decoding (SSD)

5 hours ago
  • #Autoregressive Decoding
  • #Machine Learning
  • #Inference Acceleration
  • Introduces speculative speculative decoding (SSD) to parallelize operations in autoregressive decoding.
  • Uses a draft model to predict verification outcomes and prepare speculations pre-emptively.
  • Presents Saguaro, an optimized SSD algorithm, achieving up to 2x speedup over speculative decoding baselines.
  • Addresses three key challenges in speculative speculative decoding with principled solutions.