Reducto releases Deep Extract
9 hours ago
- #structured_extraction
- #AI_agent
- #document_processing
- Deep Extract is a new agent-harness approach to structured extraction that verifies and corrects its own output until the results are accurate.
- It targets a weakness of existing extraction pipelines, which often fail on long, complex documents such as invoices or financial statements, by putting an agent in the loop in place of human reviewers.
- This method breaks down complex documents into smaller tasks handled by sub-agents, allowing it to achieve 99–100% field accuracy, even outperforming human labelers in some cases.
- Deep Extract supports citations with granular bounding boxes for extracted fields, which is crucial for audit trails and traceability in documents such as invoices or financial statements.
- In beta testing, it raised field accuracy from the 10–20% achieved by other models to 99–100% on real-world documents such as payment reports, exchange positions, and agricultural invoices.
- While it takes longer than standard extraction calls, it is faster, cheaper, and more consistent than manual reviews, especially for documents with thousands of pages.
- Users can enable Deep Extract via the Extract endpoint by setting deep_extract: true and optionally adding verification criteria in system prompts.
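
As a rough illustration of the last point, the sketch below assembles a request body with Deep Extract switched on. Only the `deep_extract: true` flag and the idea of putting verification criteria in a system prompt come from the announcement. The helper `build_extract_request`, the field names `document_url`, `schema`, and `system_prompt`, and the sample schema are assumptions made for illustration, so check Reducto's API reference for the real request shape before relying on any of them.

```python
import json
from typing import Any, Optional


def build_extract_request(
    document_url: str,
    schema: dict[str, Any],
    verification_criteria: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a JSON-serializable body for an Extract call with Deep Extract on.

    Hypothetical helper: every key except "deep_extract" is an assumed name.
    """
    payload: dict[str, Any] = {
        "document_url": document_url,  # assumed field name
        "schema": schema,              # assumed field name
        "deep_extract": True,          # flag named in the announcement
    }
    if verification_criteria:
        # The announcement says criteria can go in the system prompt;
        # the key name "system_prompt" is an assumption.
        payload["system_prompt"] = verification_criteria
    return payload


if __name__ == "__main__":
    # Illustrative schema for an invoice; not taken from the announcement.
    invoice_schema = {
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string"},
            "total_amount": {"type": "number"},
        },
        "required": ["invoice_number", "total_amount"],
    }
    body = build_extract_request(
        "https://example.com/invoice.pdf",
        invoice_schema,
        verification_criteria="Line items must sum to total_amount.",
    )
    # json.dumps renders Python's True as JSON true.
    print(json.dumps(body, indent=2))
```

The helper only builds the body and sends nothing, so it can be adapted to whichever HTTP client or SDK is in use once the actual field names are confirmed.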