Show HN: A 150M model that extracts verbatim evidence spans for RAG, no LLM call

7 hours ago
  • #Question Answering
  • #RAG
  • #Natural Language Processing
  • The linked page provides instructions for using KRLabsOrg/verbatim-rag-modern-bert-v2 with libraries, inference providers, notebooks, and local apps.
  • The model is a query-conditioned token classifier: given a question and a passage, it marks the passage tokens that answer the question and returns them as verbatim spans. It uses the ModernBERT architecture and supports inputs of up to 8192 tokens.
  • It was trained on a multi-domain dataset covering scientific paragraphs, Wikipedia QA, financial tables, medical literature, legal contracts, product manuals, and coding-agent tool output.
  • It can serve as an extraction stage in RAG pipelines, highlighting evidence spans before they are displayed to a user or passed to a generator.
  • Reported benchmarks show it scoring higher Word-F1 than public extractive baselines such as Zilliz Semantic Highlight and Provence across multiple datasets.
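
To make the extraction step concrete, here is a minimal sketch of how per-token scores from a query-conditioned token classifier might be turned into verbatim spans, plus a word-level F1 for comparing a predicted span to a reference. Everything in it is an assumption for illustration: the scores are hand-written rather than produced by the model, the whitespace tokenizer stands in for the real tokenizer, the 0.5 threshold is arbitrary, and `word_f1` follows the common bag-of-words F1 definition, which may differ from the exact Word-F1 used in the reported results.

```python
import re
from collections import Counter
from typing import List, Tuple


def whitespace_offsets(text: str) -> List[Tuple[int, int]]:
    """Character (start, end) offsets of whitespace-separated tokens.
    Stand-in for the offsets a real tokenizer would return."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def extract_spans(passage: str,
                  offsets: List[Tuple[int, int]],
                  scores: List[float],
                  threshold: float = 0.5) -> List[str]:
    """Merge runs of consecutive tokens scoring >= threshold into spans.
    Spans are sliced from the passage, so they are verbatim by construction."""
    spans: List[str] = []
    start = end = None
    for (tok_start, tok_end), score in zip(offsets, scores):
        if score >= threshold:
            if start is None:
                start = tok_start      # open a new span
            end = tok_end              # extend the current span
        elif start is not None:
            spans.append(passage[start:end])  # close the span
            start = None
    if start is not None:              # span running to the end of the passage
        spans.append(passage[start:end])
    return spans


def word_f1(prediction: str, reference: str) -> float:
    """Bag-of-words F1 over lowercased whitespace tokens (assumed definition)."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


passage = "Aspirin was first synthesized in 1897. It is sold worldwide."
offsets = whitespace_offsets(passage)
# Hypothetical per-token relevance scores for the question
# "When was aspirin first synthesized?"; a real model would supply these.
scores = [0.1, 0.2, 0.9, 0.95, 0.9, 0.97, 0.1, 0.05, 0.1, 0.1]

spans = extract_spans(passage, offsets, scores)
print(spans)
print(word_f1(spans[0], "synthesized in 1897."))
```

Because each span is a slice of the original passage, the output cannot contain text that was not in the source, which is the property that makes this kind of extractor usable for highlighting evidence without an LLM call.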