Cursor Composer: Building a fast frontier model with RL

6 months ago
  • #AI
  • #Software Engineering
  • #Reinforcement Learning
  • Composer is a new agent model built for software engineering. It achieves frontier-level coding results while generating four times faster than similar models.
  • The model is trained to complete real-world software engineering challenges in large codebases, using production search and editing tools to solve diverse problems efficiently.
  • Composer is a mixture-of-experts (MoE) language model specialized for software engineering through reinforcement learning (RL), supporting long-context generation and understanding.
  • The model is evaluated using Cursor Bench, a benchmark measuring usefulness to software developers, including correctness and adherence to codebase practices.
  • Reinforcement learning optimizes the model for interactive development, incentivizing efficient tool use, parallelism, and minimizing unnecessary responses.
  • Training infrastructure leverages PyTorch and Ray for asynchronous RL at scale, using MXFP8 MoE kernels for low-precision training and faster inference.
  • Composer can call various tools in the Cursor Agent harness. Training it effectively requires running hundreds of thousands of concurrent sandboxed coding environments.
  • Developers at Cursor already use the model for day-to-day software development, and the goal is for it to be a valuable tool for users as well.
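
To make the point about parallel tool use concrete, here is a minimal sketch of why issuing independent tool calls together matters for interactive latency. It is not Cursor's harness: the tool functions `search_codebase` and `read_file`, their arguments, and the 0.1 s simulated latency are all hypothetical stand-ins chosen for illustration.

```python
import asyncio
import time

# Hypothetical stand-ins for agent tools; names and latencies are
# illustrative assumptions, not part of the Cursor Agent harness.
async def search_codebase(query: str) -> str:
    await asyncio.sleep(0.1)  # simulate I/O latency
    return f"results for {query!r}"

async def read_file(path: str) -> str:
    await asyncio.sleep(0.1)  # simulate I/O latency
    return f"contents of {path}"

async def run_sequential() -> list[str]:
    # One tool call at a time: the latencies add up.
    return [
        await search_codebase("parse_config"),
        await read_file("src/config.py"),
        await read_file("tests/test_config.py"),
    ]

async def run_parallel() -> list[str]:
    # Independent calls issued together: total latency is roughly
    # that of the slowest single call.
    return list(await asyncio.gather(
        search_codebase("parse_config"),
        read_file("src/config.py"),
        read_file("tests/test_config.py"),
    ))

def timed(coro_fn):
    # Run an async function to completion and return (result, seconds).
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start

if __name__ == "__main__":
    seq_result, seq_time = timed(run_sequential)
    par_result, par_time = timed(run_parallel)
    print(f"sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s")
```

Both versions return the same results in the same order; only the wall-clock time differs. A model that learns to batch independent lookups this way reaches an answer in fewer round trips, which is one plausible reading of what the summary means by rewarding parallelism.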