Cursor Composer: Building a fast frontier model with RL

6 months ago

Composer is a new agent model built for both software engineering capability and speed: it achieves frontier coding results while generating output four times faster than similar models.
The model is trained to complete real-world software engineering challenges in large codebases, using production search and editing tools to solve diverse problems efficiently.
Composer is a mixture-of-experts (MoE) language model specialized for software engineering through reinforcement learning (RL), supporting long-context generation and understanding.
The model is evaluated using Cursor Bench, a benchmark measuring usefulness to software developers, including correctness and adherence to codebase practices.
Reinforcement learning optimizes the model for interactive development, rewarding efficient tool use and parallelism while discouraging unnecessary responses.
Training infrastructure leverages PyTorch and Ray for asynchronous RL at scale, using MXFP8 MoE kernels for low-precision training and faster inference.
During training, Composer can call various tools in the Cursor Agent harness; supporting this at scale requires running hundreds of thousands of concurrent sandboxed coding environments.
The model is already used by colleagues at Cursor for day-to-day software development, and the aim is for it to be a valuable tool for users as well.
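
To illustrate why parallel tool use matters for speed, here is a minimal Python sketch that issues two independent tool calls one after another and then concurrently. The tool functions (search_codebase, read_file), their arguments, and the 0.1 second latency are hypothetical stand-ins and are not the Cursor Agent harness API, which the summary does not describe.

```python
import asyncio
import time


# Hypothetical tools: each simulates an I/O-bound call with a fixed delay.
async def search_codebase(query: str) -> str:
    await asyncio.sleep(0.1)  # assumed latency, for illustration only
    return f"results for {query!r}"


async def read_file(path: str) -> str:
    await asyncio.sleep(0.1)  # assumed latency, for illustration only
    return f"contents of {path}"


async def run_sequential() -> list[str]:
    # Each call waits for the previous one to finish.
    first = await search_codebase("parse_config")
    second = await read_file("config.py")
    return [first, second]


async def run_parallel() -> list[str]:
    # Independent calls are issued together; gather preserves argument order.
    results = await asyncio.gather(
        search_codebase("parse_config"),
        read_file("config.py"),
    )
    return list(results)


def timed(coro_fn):
    """Run an async function to completion and return (result, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start


if __name__ == "__main__":
    seq_result, seq_time = timed(run_sequential)
    par_result, par_time = timed(run_parallel)
    print(f"sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s")
```

Both versions return the same results, but the parallel version waits roughly as long as the slowest single call rather than the sum of all calls. This only helps when the calls do not depend on each other's output.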
