How we index images for RAG

7 hours ago
  • #RAG
  • #Image Indexing
  • #Technical Documentation
  • Index images for RAG by describing them at indexing time with a cheap vision model, storing descriptions as text, and retrieving them alongside text chunks.
  • Images in technical documentation are either illustrative (clarifying the text) or load-bearing (containing essential information); indexing both kinds significantly improves answer quality.
  • Query-time multimodal approaches are economically and technically infeasible due to high costs, payload limits, and poor retrieval performance for technical details.
  • A production pipeline involves filtering junk images, captioning with context-aware models, and storing captions as separate chunks to optimize costs and relevance.
  • Reported results: images are cited in 10% to 64% of answers, answer quality improves significantly, per-query cost rises by only 1-6%, and images are placed in answers with high accuracy.
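
The index-time pipeline summarized above (filter junk images, caption the rest with surrounding context, store each caption as its own chunk) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Image` and `Chunk` types, the `MIN_SIDE` threshold, the size-based junk rule, and the `stub_caption` function are all hypothetical stand-ins. In a real system `caption` would call a cheap vision model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Image:
    url: str
    width: int
    height: int
    surrounding_text: str  # text near the image in the source document


@dataclass
class Chunk:
    text: str                        # what gets embedded and retrieved
    kind: str                        # "text" or "image"
    image_url: Optional[str] = None  # set only for image chunks


# Hypothetical threshold: treat very small images (icons, spacers) as junk.
MIN_SIDE = 64


def is_junk(img: Image) -> bool:
    """Assumed junk rule: skip images whose shorter side is below MIN_SIDE."""
    return min(img.width, img.height) < MIN_SIDE


def index_images(
    images: Iterable[Image],
    caption: Callable[[Image, str], str],
) -> List[Chunk]:
    """Caption each non-junk image once, at indexing time, and store the
    caption as a separate text chunk that points back to the image."""
    chunks: List[Chunk] = []
    for img in images:
        if is_junk(img):
            continue  # no captioning cost is spent on junk images
        description = caption(img, img.surrounding_text)
        chunks.append(Chunk(text=description, kind="image", image_url=img.url))
    return chunks


def stub_caption(img: Image, context: str) -> str:
    """Placeholder for a vision-model call; only echoes the context."""
    return f"Image described in context: {context}"


images = [
    Image("https://example.com/icon.png", 16, 16, "Navigation bar"),
    Image("https://example.com/wiring.png", 800, 600, "Wiring diagram for the pump"),
]
image_chunks = index_images(images, stub_caption)
```

Because the captions are plain text, the resulting chunks can go into the same index as ordinary text chunks, and an answer can cite `image_url` whenever an image chunk is retrieved.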