You Don't Need a Vector Database
3 days ago
- #search-api
- #semantic-search
- #vector-database
- Vector databases have become a default solution for search problems, but they are not always necessary.
- A vector database stores and indexes vectors, which are arrays of numbers that encode the meaning of data, but it does not generate or interpret those vectors itself.
- To make a vector database useful, additional components like embedding pipelines, sync mechanisms, and query resolution logic are required.
- Starting with infrastructure (like a vector database) can lead to over-engineering and delays, whereas using a search API can provide quicker results.
- Legitimate use cases for vector databases include ML teams building custom retrieval systems, RAG pipelines with specific requirements, and research/experimentation.
- Most teams don't need a vector database; they need semantic search, content discovery, or multilingual search, which can be achieved more simply with search APIs.
- Building a search stack from a vector database is like building a car from an engine: many additional components are required before it is functional.
- Search APIs like Vecstore handle embedding generation, indexing, retrieval, and ranking internally, simplifying the process and reducing maintenance.
- Vendor lock-in concerns with search APIs are often less severe than with vector databases, because a simpler architecture is easier to migrate away from.
- Decision framework: choose a vector database if you need control over embedding models; choose a search API if you need search to work quickly and simply.
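
To make the "engine versus car" point concrete, here is a minimal Python sketch of the division of labor described above. It is an illustration under stated assumptions, not a real product's API: `TinyVectorStore`, `toy_embed`, `sync`, and `search` are hypothetical names, and the bag-of-words "embedding" is a toy stand-in for a real embedding model (it matches words, not meaning). The store only keeps vectors and ranks them by cosine similarity; everything else is glue code the team has to write and maintain.

```python
import math
from collections import Counter


def toy_embed(text, vocab):
    """Hypothetical stand-in for an embedding model: word counts over a fixed vocab."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """The 'engine': stores vectors by ID and returns the nearest ones. Nothing more."""

    def __init__(self):
        self._vectors = {}

    def upsert(self, doc_id, vector):
        self._vectors[doc_id] = vector

    def query(self, vector, k=3):
        # Brute-force scan; a real vector database would use an approximate index.
        scored = [(cosine(vector, v), doc_id) for doc_id, v in self._vectors.items()]
        scored.sort(reverse=True)
        return [(doc_id, score) for score, doc_id in scored[:k]]


# Everything below is the rest of the 'car' that the store does not provide.
documents = {
    "a": "red running shoes",
    "b": "blue rain jacket",
    "c": "trail running shoes for rain",
}
vocab = sorted({word for text in documents.values() for word in text.lower().split()})
store = TinyVectorStore()


def sync():
    """Embedding pipeline plus sync: re-embed every document and push it to the store."""
    for doc_id, text in documents.items():
        store.upsert(doc_id, toy_embed(text, vocab))


def search(query, k=3):
    """Query resolution: embed the query, then map returned IDs back to documents."""
    hits = store.query(toy_embed(query, vocab), k=k)
    return [documents[doc_id] for doc_id, score in hits if score > 0]


sync()
print(search("running shoes", k=2))
# ['red running shoes', 'trail running shoes for rain']
```

Note that the store never sees text: if `sync` is not rerun after a document changes, results silently go stale, which is the kind of maintenance burden a search API takes on internally.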