🧱 RAG Basics Course | 🚀 RAG Toolkit | 🩸 RAG Survey Papers |
Topic | Description | Link |
---|---|---|
What is RAG? | Explains RAG with a simple example. | Link |
Why RAG? | Explains the drawbacks of LLMs and how RAG addresses them. | Link |
How does RAG work? | Explains the different steps in RAG: Indexing, Retrieval, Augmentation and Generation. | Link |
RAG Benefits and Challenges | Discusses the benefits and challenges of RAG. | Link |
RAG Must Know Terms | Definitions of must-know RAG terms. | Link |
RAG Roadmap | Detailed roadmap to learn RAG from basics to advanced. | Link |
RAG Developer's Stack | Covers the various libraries used to build RAG systems. | Link |
RAG from Scratch | RAG implementation from scratch without any frameworks. | Link |
RAG with LangChain | RAG implementation using LangChain framework. | Link |
Website RAG | RAG over a website implemented using LangChain framework. | Link |
YouTube Video RAG | RAG over a YouTube video transcript implemented using LangChain framework. | Link |
Agentic RAG | Agentic RAG system implemented using CrewAI framework. | Link |
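The four steps named in "How does RAG work?" (Indexing, Retrieval, Augmentation, Generation) can be sketched in a few lines of plain Python. This is only an illustrative toy, not the course's implementation: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database, and `generate` is a hypothetical placeholder where an actual LLM call would go.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Indexing: store each chunk next to its embedding."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, query, k=2):
    """Retrieval: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def augment(query, contexts):
    """Augmentation: put the retrieved chunks into the prompt."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\nQuestion: {query}"
    )


def generate(prompt):
    """Generation: hypothetical placeholder; a real system calls an LLM here."""
    return f"[LLM response to a {len(prompt)}-character prompt]"


chunks = [
    "RAG retrieves relevant documents before generating an answer.",
    "FAISS is a library for similarity search over dense vectors.",
    "Chunking splits long documents into smaller passages.",
]
index = build_index(chunks)
query = "What does chunking do to long documents?"
top = retrieve(index, query, k=1)
prompt = augment(query, top)
print(generate(prompt))
```

In a real pipeline, each stand-in would be swapped for one of the libraries listed in the toolkit tables: an embedding model and vector database for indexing and retrieval, and an LLM client for generation.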
🔴Frameworks🔴
Library | Description | Link |
---|---|---|
LangChain | LangChain is a framework for developing applications powered by large language models (LLMs). | Link |
LlamaIndex | LlamaIndex is a data framework for your LLM applications. | Link |
Haystack | Haystack is an end-to-end LLM framework that allows you to build applications powered by LLMs, Transformer models, vector search and more. | Link |
fastRAG | Research framework for efficient and optimized retrieval augmented generative pipelines, incorporating state-of-the-art LLMs and Information Retrieval. | Link |
Llmware | Unified framework for building enterprise RAG pipelines with small, specialized models | Link |
🟠Research🟠
Library | Description | Link |
---|---|---|
FlashRAG | A Python Toolkit for Efficient RAG Research. This toolkit includes 36 pre-processed benchmark RAG datasets and 16 state-of-the-art RAG algorithms. | Link |
🟡Data Extraction - Web Scraping🟡
Library | Description | Link |
---|---|---|
Crawl4AI (Web Scraping) | Open-source LLM-friendly web crawler and scraper. | Link |
ScrapeGraphAI (Web & Document) | A web scraping Python library that uses LLM and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.). | Link |
Crawlee (Web Scraping) | A web scraping and browser automation library | Link |
🟢Data Extraction - Documents🟢
Library | Description | Link |
---|---|---|
Docling (Document) | Docling parses documents and exports them to the desired format with ease and speed. | Link |
Llama Parse (Document) | GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). | Link |
PyMuPDF4LLM (Document) | PyMuPDF4LLM library makes it easier to extract PDF content in the format you need for LLM & RAG environments. | Link |
MegaParse (Document) | Parser for every type of document. | Link |
ExtractThinker (Document) | Document Intelligence library for LLMs | Link |
🔵Vector Database🔵
Library | Description | Link |
---|---|---|
SQLite-Vec | A vector search SQLite extension that runs anywhere! | Link |
FAISS | A library for efficient similarity search and clustering of dense vectors. | Link |
PGVector | Open-source vector similarity search for Postgres | Link |
Chroma | The AI-native open-source embedding database. The fastest way to build Python or JavaScript LLM apps with memory! | Link |
Qdrant | High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. | Link |
Pinecone | The vector database for machine learning applications. | Link |
Weaviate | Weaviate is a cloud-native, open source vector database that is robust, fast, and scalable. | Link |
Milvus | Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search | Link |
🟣Chunking🟣
Library | Description | Link |
---|---|---|
Chonkie | A no-nonsense RAG chunking library that is lightweight, lightning-fast, and easy to use. It supports seven different chunking strategies. | Link |
🟤Rerankers🟤
Library | Description | Link |
---|---|---|
Rerankers | A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. New reranking models can be added with very little knowledge of the codebase. | Link |
🟠Agentic RAG🟠
Library | Description | Link |
---|---|---|
CrewAI | Framework for orchestrating role-playing, autonomous AI agents. | Link |
Agno | Build AI Agents with memory, knowledge, tools and reasoning. Chat with them using a beautiful Agent UI. | Link |
LangGraph | Build resilient language agents as graphs. | Link |
AutoGen | An open-source framework for building AI agent systems. | Link |
R2R | Agentic Retrieval-Augmented Generation (RAG) with a RESTful API. R2R offers multimodal content ingestion, hybrid search functionality, knowledge graphs, and comprehensive user and document management. | Link |
Vectara | Build Agentic RAG applications. | Link |
🟢Graph RAG🟢
Library | Description | Link |
---|---|---|
GraphRAG | A modular graph-based Retrieval-Augmented Generation (RAG) system. | Link |
Nano GraphRAG | A simple, easy-to-hack GraphRAG implementation. | Link |
Fast GraphRAG | Streamlined and promptable Fast GraphRAG framework designed for interpretable, high-precision, agent-driven retrieval workflows. | Link |
🔴Evaluation🔴
Library | Description | Link |
---|---|---|
RAGChecker | A Fine-grained Framework For Diagnosing RAG. | Link |
BeyondLLM | BeyondLLM offers an all-in-one toolkit for experimentation, evaluation, and deployment of Retrieval-Augmented Generation (RAG) systems. | Link |
RAGAS | Ragas is your ultimate toolkit for evaluating and optimizing Large Language Model (LLM) applications. | Link |
Giskard | Open-Source Evaluation & Testing for ML & LLM systems. | Link |
DeepEval | The LLM (RAG) Evaluation Framework. | Link |
Paper | Category | Link |
---|---|---|
Retrieval-Augmented Generation for Large Language Models: A Survey | General | Link |
Retrieval-Augmented Generation for Natural Language Processing: A Survey | General | Link |
A Comprehensive Survey of Retrieval-Augmented Generation (RAG): Evolution, Current Landscape and Future Directions | General | Link |
Retrieval-Augmented Generation for AI-Generated Content: A Survey | General | Link |
A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models | General | Link |
A Survey on Retrieval-Augmented Text Generation for Large Language Models | General | Link |
Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely | General | Link |
Graph Retrieval-Augmented Generation: A Survey | Graph RAG | Link |
Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG | Agentic RAG | Link |
Evaluation of Retrieval-Augmented Generation: A Survey | Evaluation | Link |
Searching for Best Practices in Retrieval-Augmented Generation | RAG Best Practices | Link |