Show HN: Extrai – An open-source tool to fight LLM randomness in data extraction
19 days ago
- #LLM
- #SQLModel
- #data-extraction
- extrai is a library that uses LLMs to extract data from text documents, formats the results as SQLModel objects, and stores them in a database.
- Features a Consensus Mechanism for improved accuracy by consolidating multiple LLM outputs.
- Supports Dynamic SQLModel Generation from natural language descriptions.
- Offers Hierarchical Extraction for complex, nested data by breaking the extraction down into manageable steps.
- Includes Extensible LLM Support, Built-in Analytics, and Workflow Orchestration.
- Provides Example JSON Generation and Customizable Prompts for tailored extraction.
- Allows rotating between LLM providers for JSON revisions.
- Key sections in documentation: Getting Started, How-to Guides, Core Concepts, Reference, API Reference, Community.
- Install via pip: `pip install extrai-workflow`.
- Example usage involves defining a data model, setting up an orchestrator, and running extraction.
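
To illustrate the idea behind the Consensus Mechanism, here is a minimal sketch of field-level majority voting across several LLM outputs for the same document. It is not extrai's actual implementation or API: the function names `flatten` and `consensus`, the `threshold` parameter, and the sample data are all assumptions made for illustration, and it uses only the Python standard library.

```python
import json
from collections import Counter


def flatten(obj, prefix=""):
    """Flatten nested JSON-like data into {path: scalar} pairs (hypothetical helper)."""
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            items.update(flatten(value, f"{prefix}.{key}" if prefix else key))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            items.update(flatten(value, f"{prefix}[{index}]"))
    else:
        items[prefix] = obj
    return items


def consensus(outputs, threshold=0.5):
    """Keep a field's most common value only if more than `threshold` of all runs agree.

    Illustrative only; the real library may weigh or merge outputs differently.
    """
    flat_runs = [flatten(output) for output in outputs]
    paths = sorted({path for run in flat_runs for path in run})
    agreed = {}
    for path in paths:
        # Serialize values so they can be counted regardless of type.
        votes = Counter(json.dumps(run[path]) for run in flat_runs if path in run)
        value, count = votes.most_common(1)[0]
        if count / len(outputs) > threshold:
            agreed[path] = json.loads(value)
    return agreed


# Three simulated LLM runs over the same document (made-up data).
runs = [
    {"name": "Widget", "price": 9.99},
    {"name": "Widget", "price": 9.99},
    {"name": "Widget", "price": 19.99},
]
result = consensus(runs)
```

With the default threshold, a value must appear in a strict majority of runs to survive, so a field on which every run disagrees is dropped rather than guessed.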