Show HN: A 150M model that extracts verbatim evidence spans for RAG, no LLM call
7 hours ago
- #Question Answering
- #RAG
- #Natural Language Processing
- The linked page provides instructions for using KRLabsOrg/verbatim-rag-modern-bert-v2 with libraries, inference providers, notebooks, and local apps.
- The model is a query-conditioned token classifier that extracts verbatim spans from a passage to answer a question, with a ModernBERT architecture supporting up to 8192 tokens.
- Trained on a multi-domain dataset including scientific paragraphs, Wikipedia QA, financial tables, medical literature, legal contracts, product manuals, and coding-agent tool output.
- It can be used as an extraction stage in RAG pipelines to highlight evidence spans before they are displayed to a user or passed to a generator.
- Reported Word-F1 scores show it outperforming public extractive baselines such as Zilliz Semantic Highlight and Provence across multiple datasets.
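
To make the "token classifier that extracts verbatim spans" idea concrete, here is a minimal sketch of the decoding step only. It is not the model's actual API: the helper name `merge_token_spans`, the 0.5 threshold, the `max_gap` merging rule, the whitespace tokenization, and the per-token scores are all assumptions made up for illustration. A real pipeline would presumably take token offsets and scores from the model's tokenizer and classification head instead.

```python
import re
from typing import List, Tuple


def merge_token_spans(
    passage: str,
    offsets: List[Tuple[int, int]],
    scores: List[float],
    threshold: float = 0.5,
    max_gap: int = 1,
) -> List[str]:
    """Merge tokens scored at or above `threshold` into verbatim substrings.

    offsets: (start, end) character offsets of each token in `passage`.
    scores:  hypothetical per-token probability of being evidence.
    max_gap: merge neighbouring selected tokens separated by at most this
             many characters (for example, a single space).
    """
    spans: List[List[int]] = []
    for (start, end), score in zip(offsets, scores):
        if score < threshold:
            continue
        if spans and start - spans[-1][1] <= max_gap:
            spans[-1][1] = end  # extend the current span
        else:
            spans.append([start, end])  # open a new span
    # Slicing the original passage guarantees every span is verbatim.
    return [passage[s:e] for s, e in spans]


passage = "Aspirin reduces fever. It was first sold in 1899."
# Whitespace tokens stand in for the model's subword tokens (assumption).
offsets = [m.span() for m in re.finditer(r"\S+", passage)]
# Made-up scores, one per token; a real run would use model outputs.
scores = [0.9, 0.8, 0.9, 0.1, 0.2, 0.7, 0.9, 0.9, 0.95]

spans = merge_token_spans(passage, offsets, scores)
print(spans)  # ['Aspirin reduces fever.', 'first sold in 1899.']
```

Because the spans are slices of the input passage rather than generated text, they can be highlighted in a UI or concatenated into a generator prompt without risk of paraphrase drift, which is the property the post emphasizes.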