Show HN: Sweep, Open-weights 1.5B model for next-edit autocomplete
3 months ago
- #Code Autocomplete
- #AI
- #Machine Learning
- Sweep Next-Edit 1.5B is a model for next-edit autocomplete, quantized to Q8_0 GGUF format.
- It predicts your next code edit before you make it, running locally on your laptop in under 500ms with speculative decoding.
- It outperforms models more than 4x its size on next-edit benchmarks.
- To run it, download run_model.py and the model file, then install the dependencies via pip.
- Model details: GGUF format (Q8_0 quantization), 1.5B parameters, 8192 token context length, based on Qwen2.5-Coder.
- Includes a specific prompt format with file context, recent diffs, and current state for predictions.
- Links provided for a blog post with technical details and benchmarks, and a JetBrains Plugin.
- Licensed under Apache 2.0, with 21 downloads last month.
- The model supports 8-bit inference, but is not currently available through any Inference Provider.
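A minimal sketch of driving the model with llama-cpp-python is shown below. The prompt markers (`<|file_context|>` etc.), model path, and input strings are illustrative assumptions, not the format from the model card; the actual prompt template with file context, recent diffs, and current state is documented alongside run_model.py.

```python
def build_prompt(file_context: str, recent_diffs: str, current_state: str) -> str:
    """Assemble a next-edit prompt from the three inputs the model card lists.

    The section markers here are hypothetical placeholders; consult the
    model card for the exact template the model was trained on.
    """
    return (
        "<|file_context|>\n" + file_context + "\n"
        "<|recent_diffs|>\n" + recent_diffs + "\n"
        "<|current_state|>\n" + current_state + "\n"
        "<|next_edit|>\n"
    )


if __name__ == "__main__":
    # Load the Q8_0 GGUF with llama-cpp-python; n_ctx matches the model's
    # 8192-token context length. The file name is an assumption.
    from llama_cpp import Llama

    llm = Llama(model_path="sweep-next-edit-1.5b.Q8_0.gguf", n_ctx=8192)
    prompt = build_prompt(
        file_context="def add(a, b):\n    return a + b\n",
        recent_diffs="- return a + b\n+ return a + b  # TODO: type hints\n",
        current_state="def add(a, b):\n",
    )
    out = llm(prompt, max_tokens=128)
    print(out["choices"][0]["text"])
```

Keeping the prompt assembly in a standalone function makes it easy to swap in the real template once you have it, without touching the inference code.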