Recursive Self-Improvement Delivers New SOTA Coding Performance

a day ago
  • #Recursive Self-Improvement
  • #Model-Agnostic Optimization
  • #AI Coding Benchmark
  • Poetiq's Meta-System automatically creates and optimizes harnesses from scratch, improving performance on LiveCodeBench Pro (LCB Pro) without fine-tuning or special model access (a sketch of such an optimization loop follows this list).
  • LCB Pro is a coding benchmark with memory and runtime constraints, designed to mitigate data contamination and to test complex problem-solving in C++ (see the judging sketch after this list).
  • The optimized harness, initially built for Gemini 3.1 Pro, lifts its accuracy by 12.3 percentage points (from 78.6% to 90.9%), surpassing GPT 5.5 High and Gemini Deep Think.
  • The same harness improves other models: GPT 5.5 High reaches 93.9% accuracy, and Gemini 3 Flash gains 10 percentage points, outperforming larger models such as Claude Opus 4.7.
  • Improvements are consistent across difficulty levels (Easy, Medium, Hard), with harnessed models outperforming their unharnessed counterparts in every category.
  • Poetiq's approach is model-agnostic: harnesses learned via recursive self-improvement can be applied to any LLM, and the method has also succeeded on reasoning and retrieval benchmarks such as ARC-AGI and HLE.
  • The system focuses on automating knowledge extraction for tasks involving reasoning, retrieval, and coding, aiming to enhance AI's economic impact and problem-solving capabilities.
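The article does not publish Poetiq's implementation, but the loop it describes (generate a harness, score it on the benchmark, keep improvements, repeat) can be sketched in a few lines. Everything below is a hypothetical illustration under assumed interfaces: the `Model`, `Harness`, `evaluate`, `propose_variant`, and `optimize` names are inventions for this sketch, not Poetiq's code.

```python
from typing import Callable

# Hypothetical sketch only: none of these names come from Poetiq's write-up.
Model = Callable[[str], str]   # prompt -> completion
Harness = dict                 # e.g. {"system_prompt": "...", "retries": 1}

def evaluate(harness: Harness, model: Model, problems: list) -> float:
    """Fraction of benchmark problems solved under this harness."""
    solved = 0
    for problem in problems:
        prompt = harness["system_prompt"] + "\n" + problem["statement"]
        answer = model(prompt)
        solved += int(problem["check"](answer))  # check() runs hidden tests
    return solved / len(problems)

def propose_variant(harness: Harness, model: Model) -> Harness:
    """The 'self' in self-improvement: the model rewrites its own harness."""
    new_prompt = model(
        "Improve this coding harness. Reply with a better system prompt only:\n"
        + harness["system_prompt"]
    )
    return {**harness, "system_prompt": new_prompt}

def optimize(model: Model, problems: list, rounds: int = 20) -> Harness:
    """Greedy hill-climb: keep a candidate harness only if it scores higher."""
    best = {"system_prompt": "Solve the problem in C++.", "retries": 1}
    best_score = evaluate(best, model, problems)
    for _ in range(rounds):
        candidate = propose_variant(best, model)
        score = evaluate(candidate, model, problems)
        if score > best_score:
            best, best_score = candidate, score
    return best  # reusable with any other Model, hence "model-agnostic"
```

Because the harness here is plain data plus control flow rather than model weights, the optimized artifact transfers unchanged to any other `Model`, which is the property the cross-model results above rely on.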
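LCB Pro's memory and runtime constraints amount to judging each C++ submission in a resource-limited sandbox. Below is a minimal sketch of that kind of judge using POSIX `resource` limits and a subprocess timeout; the 2 s / 256 MB limits and the `judge` helper are illustrative assumptions, not LCB Pro's actual configuration.

```python
import resource
import subprocess

# Illustrative limits; LCB Pro's real constraints are not specified here.
TIME_LIMIT_S = 2.0
MEMORY_LIMIT = 256 * 1024 * 1024  # 256 MB address-space cap

def _limit_memory():
    # Applied in the child process just before exec (POSIX only).
    resource.setrlimit(resource.RLIMIT_AS, (MEMORY_LIMIT, MEMORY_LIMIT))

def judge(source_path: str, stdin_data: str, expected: str) -> bool:
    """Compile a C++ submission and run it under time and memory limits."""
    build = subprocess.run(["g++", "-O2", "-o", "solution", source_path])
    if build.returncode != 0:
        return False  # compile error
    try:
        run = subprocess.run(
            ["./solution"],
            input=stdin_data,
            capture_output=True,
            text=True,
            timeout=TIME_LIMIT_S,      # runtime constraint
            preexec_fn=_limit_memory,  # memory constraint
        )
    except subprocess.TimeoutExpired:
        return False  # time-limit exceeded
    return run.returncode == 0 and run.stdout.strip() == expected.strip()
```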