Retrievers
What Are Retrievers?
Concept
Retrievers are your AI search engine for large text collections or knowledge bases. They let you find the most relevant information based on a query—using either advanced embeddings (semantic search) or classic keyword matching.
Want a deep dive? Check out the RAG Cookbook for advanced agent + retrieval use cases.
Types of Retrievers
Vector Retriever
Converts documents into vectors using an embedding model, stores them, and retrieves by semantic similarity. Best for “meaning-based” search, RAG, and LLM workflows.
- Chunks data and embeds it with an OpenAI or custom model
- Stores in vector DB (like Qdrant)
- Finds the most relevant info even with different wording
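To make the chunk, embed, store, and search flow above concrete, here is a minimal pure-Python sketch. Everything in it is a hypothetical stand-in rather than the library's API: `chunk_text`, `toy_embed`, and `ToyVectorStore` are illustrative names, the word-count "embedding" replaces a real embedding model, and the in-memory list replaces a vector DB such as Qdrant. A word-count vector only rewards shared words, so it will not match different wording the way a real embedding model can.

```python
import math
from collections import Counter


def chunk_text(text: str, max_words: int = 8) -> list[str]:
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str, vocab: list[str]) -> list[float]:
    """Stand-in for an embedding model: one word-count dimension per vocab word."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Stand-in for a vector DB: keeps (vector, text) pairs in memory."""

    def __init__(self) -> None:
        self.records: list[tuple[list[float], str]] = []

    def add(self, vector: list[float], text: str) -> None:
        self.records.append((vector, text))

    def search(self, query_vector: list[float], top_k: int = 1) -> list[tuple[float, str]]:
        """Return the top_k (score, text) pairs, highest similarity first."""
        scored = [(cosine(query_vector, vec), text) for vec, text in self.records]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


documents = [
    "Qdrant stores vectors for similarity search",
    "Keyword search matches exact tokens in documents",
]

# 1. Chunk the documents.
chunks = [chunk for doc in documents for chunk in chunk_text(doc)]

# 2. Build the toy vocabulary and embed every chunk.
vocab = sorted({word for chunk in chunks for word in chunk.lower().split()})
store = ToyVectorStore()
for chunk in chunks:
    store.add(toy_embed(chunk, vocab), chunk)

# 3. Embed the query the same way and retrieve the closest chunk.
results = store.search(toy_embed("where are vectors stored for search", vocab), top_k=1)
print(results[0][1])
```

In a real setup, an embedding model replaces `toy_embed` and a vector database replaces `ToyVectorStore`; the three steps stay the same.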
Keyword Retriever
Classic keyword search! Breaks documents and queries into tokens/keywords, and matches on those.
- Tokenizes documents
- Indexes by keyword
- Fast, transparent, great for exact matches
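The tokenize, index, and match steps above can be sketched with a small inverted index. This is an illustrative toy under stated assumptions, not the library's KeywordRetriever: `tokenize` and `ToyKeywordIndex` are hypothetical names, and ranking by the number of distinct query tokens matched is one simple choice among many.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyKeywordIndex:
    """Hypothetical inverted index: maps each token to the documents containing it."""

    def __init__(self) -> None:
        self.postings: dict[str, set[str]] = defaultdict(set)
        self.documents: dict[str, str] = {}

    def add(self, doc_id: str, text: str) -> None:
        """Store the document and index it under each of its tokens."""
        self.documents[doc_id] = text
        for token in tokenize(text):
            self.postings[token].add(doc_id)

    def search(self, query: str, top_k: int = 3) -> list[tuple[str, str, int]]:
        """Return (doc_id, text, matched_token_count), best match first."""
        hits: dict[str, int] = defaultdict(int)
        for token in set(tokenize(query)):
            for doc_id in self.postings.get(token, ()):
                hits[doc_id] += 1
        # Sort by match count (descending), then doc_id for a stable order.
        ranked = sorted(hits.items(), key=lambda item: (-item[1], item[0]))
        return [(doc_id, self.documents[doc_id], count) for doc_id, count in ranked[:top_k]]


index = ToyKeywordIndex()
index.add("a", "Vector search finds similar meaning.")
index.add("b", "Keyword search is fast and transparent.")
index.add("c", "Embeddings map text to vectors.")

hits = index.search("fast keyword search")
for doc_id, text, count in hits:
    print(doc_id, count, text)
```

Because each hit carries the count of matched tokens, it is easy to see why a document was returned, which is the transparency that keyword search is valued for.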
How To Use
Semantic Retrieval with VectorRetriever
This example uses OpenAI embeddings and Qdrant vector storage for semantic search.
AutoRetriever: Quick RAG with One Call
AutoRetriever simplifies everything: just specify storage and content, and it handles embedding, storage, and querying.
Use AutoRetriever for fast experiments and RAG workflows; for advanced control, use VectorRetriever directly.
KeywordRetriever (Classic Search)
For simple, blazing-fast search by keyword, use KeywordRetriever.
Great for small data, transparency, or keyword-driven tasks.
(API and code example coming soon—see RAG Cookbook for details.)