Show HN: A 150M model that extracts verbatim evidence spans for RAG, no LLM call

7 hours ago

The linked page provides instructions for using KRLabsOrg/verbatim-rag-modern-bert-v2 with libraries, inference providers, notebooks, and local apps.
The model is a query-conditioned token classifier: given a question and a passage, it marks the passage tokens that answer the question and returns them as verbatim spans. It uses the ModernBERT architecture and accepts inputs of up to 8,192 tokens.
It was trained on a multi-domain dataset covering scientific paragraphs, Wikipedia QA, financial tables, medical literature, legal contracts, product manuals, and coding-agent tool output.
It can serve as the extraction stage of a RAG pipeline, highlighting evidence spans in retrieved passages before they are displayed to a user or passed to a generator.
Reported Word-F1 scores show it outperforming public extractive baselines such as Zilliz Semantic Highlight and Provence across multiple datasets.
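
To make the extraction step and the Word-F1 metric concrete, here is a minimal sketch in plain Python. It does not call the model. It assumes the classifier yields one probability per passage token along with character offsets, that a fixed threshold of 0.5 selects tokens, and that Word-F1 is the usual bag-of-words overlap F1. The function names, the whitespace tokenizer, and the probabilities are all hypothetical stand-ins for illustration.

```python
import re
from collections import Counter
from typing import List, Tuple


def whitespace_offsets(text: str) -> List[Tuple[int, int]]:
    """Hypothetical stand-in tokenizer: (start, end) character offsets per word."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def decode_spans(text: str, offsets: List[Tuple[int, int]],
                 probs: List[float], threshold: float = 0.5) -> List[str]:
    """Merge runs of consecutive selected tokens into verbatim substrings of text."""
    spans = []
    start = end = None
    for (tok_start, tok_end), p in zip(offsets, probs):
        if p >= threshold:
            if start is None:
                start = tok_start      # open a new span
            end = tok_end              # extend the current span
        elif start is not None:
            spans.append(text[start:end])  # close the span at the last kept token
            start = None
    if start is not None:
        spans.append(text[start:end])
    return spans


def word_f1(predicted: str, reference: str) -> float:
    """Bag-of-words overlap F1 (assumed definition of Word-F1)."""
    pred = predicted.lower().split()
    ref = reference.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


passage = "The warranty lasts two years. Returns need a receipt."
offsets = whitespace_offsets(passage)
# Made-up per-token probabilities for the question "How long is the warranty?"
probs = [0.90, 0.95, 0.90, 0.97, 0.92, 0.10, 0.05, 0.02, 0.10]

spans = decode_spans(passage, offsets, probs)
score = word_f1(" ".join(spans), "warranty lasts two years.")
print(spans, round(score, 3))
```

Because spans are sliced from the passage by character offset, every extracted string is guaranteed to appear word for word in the source, which is the property that makes this approach usable for highlighting evidence without a generative model.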
