
Show HN: RAG-chunk – A CLI to test RAG chunking strategies

7 days ago
  • #RAG
  • #Markdown
  • #CLI
  • CLI tool for parsing, chunking, and evaluating Markdown documents for RAG preparation.
  • Available on PyPI: https://pypi.org/project/rag-chunk/.
  • Features include parsing and cleaning Markdown files, three chunking strategies (fixed-size, sliding-window, paragraph), recall-based evaluation, and multiple output formats (table/JSON/CSV).
  • Install from PyPI with `pip install rag-chunk`, or run `pip install -e .` from a local checkout for development mode.
  • Analyzes Markdown files with a selectable chunking strategy and configurable parameters.
  • Evaluation uses a test JSON file of questions, each paired with relevant phrases, to measure recall.
  • Future plans include tiktoken support for precise token-based chunking, more strategies, and additional file types.
  • Project structure includes `src`, `tests`, `examples`, and `.chunks` directories.
  • MIT licensed.
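
The three chunking strategies and the recall-based evaluation listed above can be illustrated with a minimal Python sketch. This is not rag-chunk's actual code or API: the function names, the word-based sizing, the word-overlap ranking of chunks, and the `top_k` parameter are all assumptions made for illustration.

```python
import re


def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Split text into consecutive, non-overlapping chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def sliding_window_chunks(text: str, size: int, overlap: int) -> list[str]:
    """Split text into chunks of `size` words; neighbours share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return chunks


def paragraph_chunks(text: str) -> list[str]:
    """Split text on blank lines, one chunk per non-empty paragraph."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def phrase_recall(chunks: list[str], question: str,
                  relevant: list[str], top_k: int = 1) -> float:
    """Fraction of relevant phrases found in the top_k retrieved chunks.

    Retrieval here is a toy ranking by word overlap with the question
    (an assumption; a real pipeline would likely use embeddings).
    """
    if not relevant:
        return 0.0
    q_words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    retrieved = " ".join(ranked[:top_k]).lower()
    hits = sum(1 for phrase in relevant if phrase.lower() in retrieved)
    return hits / len(relevant)
```

Running each chunker over the same document and comparing the resulting `phrase_recall` scores is the kind of comparison the tool is described as automating. Sizing in words rather than tokens is a simplification; the planned tiktoken support would presumably replace it with token counts.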