What Are Retrievers?

Concept

Retrievers act as a search engine over large text collections or knowledge bases. Given a query, they return the most relevant passages, using either embeddings (semantic search) or classic keyword matching.

Want a deep dive? Check out the RAG Cookbook for advanced agent + retrieval use cases.

Types of Retrievers

Vector Retriever

Converts documents into vectors using an embedding model, stores them, and retrieves by semantic similarity. Best for “meaning-based” search, RAG, and LLM workflows.

  • Splits the data into chunks and embeds each chunk with an OpenAI or custom embedding model
  • Stores the vectors in a vector database (such as Qdrant)
  • Finds the most relevant chunks even when the query uses different wording
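
To make "retrieves by semantic similarity" concrete, here is a minimal sketch of the ranking step in plain Python. It is not CAMEL's implementation: the three-dimensional vectors are hand-written stand-ins for real embeddings, and the names `cosine_similarity`, `store`, and `retrieve` are hypothetical, introduced only for this illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings". A real embedding model would produce
# vectors with hundreds or thousands of dimensions.
store = {
    "How to reset a forgotten password": [0.9, 0.1, 0.0],
    "Shipping times for international orders": [0.1, 0.9, 0.1],
    "Supported payment methods": [0.0, 0.1, 0.9],
}


def retrieve(query_vector, store, top_k=1):
    """Return the top_k (score, text) pairs, highest similarity first."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in store.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Stand-in vector for a query such as "I can't log in to my account".
# It shares no keywords with the best match, but points in a similar direction.
query_vector = [0.8, 0.2, 0.1]
best = retrieve(query_vector, store)
print(best[0][1])
```

The query and the best-matching document share no words; the match comes entirely from the vectors being close, which is what lets a vector retriever handle paraphrased queries.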

Keyword Retriever

Classic keyword search: breaks documents and queries into tokens (keywords) and matches on them.

  • Tokenizes documents
  • Indexes by keyword
  • Fast, transparent, great for exact matches
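
The tokenize, index, and match steps above can be sketched with a small inverted index. This is an illustration of the general technique under simple assumptions (lowercased alphanumeric tokens, ranking by number of matching query tokens), not the KeywordRetriever API; `tokenize`, `build_index`, and `search` are hypothetical names.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(query, index, documents):
    """Rank documents by how many distinct query tokens they contain."""
    hits = defaultdict(int)
    for token in set(tokenize(query)):
        for doc_id in index.get(token, ()):
            hits[doc_id] += 1
    # Most matching tokens first; ties broken by document order.
    ranked = sorted(hits.items(), key=lambda item: (-item[1], item[0]))
    return [(documents[doc_id], count) for doc_id, count in ranked]


documents = [
    "Vector retrievers rank chunks by embedding similarity.",
    "Keyword retrievers match exact tokens in an index.",
    "Qdrant stores vectors on disk or in memory.",
]
index = build_index(documents)
results = search("keyword index", index, documents)
print(results)
```

Because every hit traces back to a specific token in the index, it is easy to explain why a document was returned. The flip side is that a query phrased with different words finds nothing.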

How To Use

Semantic Retrieval with VectorRetriever

This example uses OpenAI embeddings and Qdrant vector storage for semantic search.

from camel.embeddings import OpenAIEmbedding
from camel.retrievers import VectorRetriever
from camel.storages.vectordb_storages import QdrantStorage

# Create the embedding model once and reuse it for storage and retrieval.
# OpenAIEmbedding needs a valid OpenAI API key to be configured.
embedding_model = OpenAIEmbedding()

# Set up a local Qdrant collection sized to the embedding dimension
vector_storage = QdrantStorage(
    vector_dim=embedding_model.get_output_dim(),
    collection_name="my first collection",
    path="storage_customized_run",
)
vr = VectorRetriever(embedding_model=embedding_model, storage=vector_storage)

# Chunk, embed, and store your data (URL or file)
content_input_path = "https://www.camel-ai.org/"
vr.process(content=content_input_path)

# Run a semantic search; a threshold of 0 keeps results regardless of score
query = "What is CAMEL"
results = vr.query(query=query, similarity_threshold=0)
print(results)
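
The snippet above passes `similarity_threshold=0`. As a rough sketch of what such a cutoff does, assuming it acts as a minimum score and is combined with a limit on the number of results, the filtering could look like the code below. The function `filter_results`, its default values, and the sample scores are all hypothetical and are not taken from CAMEL's source.

```python
def filter_results(scored_chunks, similarity_threshold=0.7, top_k=1):
    """Keep chunks scoring at or above the threshold, best first, at most top_k."""
    kept = [
        (score, text)
        for score, text in scored_chunks
        if score >= similarity_threshold
    ]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]


# Made-up similarity scores for three stored chunks.
scored_chunks = [(0.82, "chunk A"), (0.35, "chunk B"), (0.05, "chunk C")]

strict = filter_results(scored_chunks, similarity_threshold=0.7, top_k=3)
lenient = filter_results(scored_chunks, similarity_threshold=0, top_k=2)
print(strict)
print(lenient)
```

A threshold of 0 is convenient while experimenting because weak matches still come back; raising it trades recall for precision once you know what typical scores look like for your data.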

AutoRetriever: Quick RAG with One Call

AutoRetriever simplifies everything: just specify storage and content, and it handles embedding, storage, and querying.

from camel.retrievers import AutoRetriever
from camel.types import StorageType

ar = AutoRetriever(
    vector_storage_local_path="camel/retrievers",
    storage_type=StorageType.QDRANT,
)

retrieved_info = ar.run_vector_retriever(
    contents=["https://www.camel-ai.org/"],   # One or many URLs/files
    query="What is CAMEL-AI",
    return_detailed_info=True,
)
print(retrieved_info)

Use AutoRetriever for fast experiments and RAG workflows; for advanced control, use VectorRetriever directly.

KeywordRetriever (Classic Search)

For simple, fast search by keyword, use KeywordRetriever.
Great for small data, transparency, or keyword-driven tasks.
(API and code example coming soon—see RAG Cookbook for details.)