Reducto releases Deep Extract

7 hours ago
  • #structured_extraction
  • #AI_agent
  • #document_processing
  • Deep Extract is a new agent harness approach for structured extraction that verifies and corrects its own output until results are accurate.
  • It targets a weakness of existing extraction pipelines, which often fail on long, complex documents such as invoices or financial statements, and puts an agent in the loop where a human reviewer would otherwise check the output.
  • This method breaks down complex documents into smaller tasks handled by sub-agents, allowing it to achieve 99–100% field accuracy, even outperforming human labelers in some cases.
  • Deep Extract supports citations with granular bounding boxes for extracted fields, which is crucial for audit trails and traceability in documents such as invoices or financial statements.
  • In beta testing, it raised field accuracy from the 10–20% achieved with other models to 99–100% on real-world documents such as payment reports, exchange positions, and agricultural invoices.
  • A Deep Extract call takes longer than a standard extraction call, but it is faster, cheaper, and more consistent than manual review, especially for documents that run to thousands of pages.
  • Users can enable Deep Extract via the Extract endpoint by setting deep_extract: true and optionally adding verification criteria in system prompts.
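
As a rough illustration of the last point, the sketch below assembles a request body with Deep Extract switched on. Only the `deep_extract: true` flag and the idea of putting verification criteria in the system prompt come from the summary above. The function name `build_extract_request`, the field names `document_url`, `schema`, and `system_prompt`, and the placement of `deep_extract` at the top level of the body are assumptions made for illustration, so check the Reducto API reference for the actual request shape before relying on any of them.

```python
import json


def build_extract_request(document_url, schema, verification_criteria=None):
    """Assemble a JSON-serializable body for an Extract call with Deep Extract on.

    Every key except "deep_extract" is a hypothetical field name.
    """
    system_prompt = "Extract the fields defined in the schema."
    if verification_criteria:
        # Optional verification criteria are appended to the system prompt,
        # as the summary suggests.
        system_prompt += " Verify before returning: " + " ".join(verification_criteria)
    return {
        "document_url": document_url,    # hypothetical field name
        "schema": schema,                # hypothetical field name
        "system_prompt": system_prompt,  # hypothetical field name
        "deep_extract": True,            # the flag named in the summary
    }


payload = build_extract_request(
    "https://example.com/invoice.pdf",  # placeholder document location
    {
        "type": "object",
        "properties": {"invoice_total": {"type": "number"}},
    },
    verification_criteria=["Line items must sum to invoice_total."],
)

# Print the body that would be sent; no network call is made here.
print(json.dumps(payload, indent=2))
```

The sketch stops at building the body so that it runs without credentials or network access. Sending it would mean posting the JSON to the Extract endpoint with whatever authentication the service requires.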