So you wanna build a local RAG?

13 days ago


Skald was designed to be self-hostable and to operate without sending data to third parties, addressing privacy concerns for organizations.
A basic RAG setup includes a vector database, an embedding model, and an LLM, with optional components such as a reranker and a document parser.
The post lists proprietary and open-source alternatives for each RAG component, emphasizing flexibility in building a local setup.
Skald's local stack uses Postgres + pgvector for the vector database, Sentence Transformers for embeddings, and allows user-configurable LLMs and rerankers.
In benchmarks, a fully local setup using GPT-OSS 20B performed well, with an average score of 8.63, though it struggled with non-English queries and with aggregating information from multiple documents.
Multilingual models, bge-m3 for embeddings and mmarco-mMiniLMv2-L12-H384-v1 for reranking, improved performance, especially on non-English queries, though aggregating information across documents remains a challenge.
Skald aims to further polish the local setup and publish more benchmarks for open-source models, catering to privacy-sensitive and air-gapped infrastructure needs.
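
The components listed above (vector store, embedding model, optional reranker, LLM) can be sketched end to end as follows. This is a minimal illustration and not Skald's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model such as Sentence Transformers, `VectorStore` is a hypothetical in-memory substitute for Postgres + pgvector, and `llm` and `rerank` are placeholder callables supplied by the caller.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Hypothetical in-memory substitute for a vector database."""

    def __init__(self):
        self.rows = []  # list of (text, vector) pairs

    def add(self, text):
        self.rows.append((text, embed(text)))

    def search(self, query, k=2):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.rows, key=lambda r: cosine(q, r[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Assemble retrieved chunks and the question into one prompt."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def answer(question, store, llm, rerank=None, k=2):
    """Retrieve, optionally rerank, then hand the prompt to the LLM callable."""
    chunks = store.search(question, k=k)
    if rerank is not None:
        chunks = rerank(question, chunks)  # optional reranker step
    return llm(build_prompt(question, chunks))


store = VectorStore()
store.add("pgvector adds a vector column type to Postgres.")
store.add("Sentence Transformers produce text embeddings locally.")
store.add("Bananas are rich in potassium.")

# An identity "LLM" that returns the prompt, so the retrieved context is visible.
print(answer("How does Postgres store a vector?", store, llm=lambda p: p, k=1))
```

In a real local stack, each placeholder would be swapped for the corresponding component: the store for a pgvector-backed table, `embed` for an embedding model, `rerank` for a cross-encoder, and `llm` for a call to a locally hosted model.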
