| 1 | +# End-to-end RAG example using Feast and Milvus. |
| 2 | + |
| 3 | +## Introduction |
| 4 | +This example notebook is a step-by-step demonstration of building and using a retrieval-augmented generation (RAG) system with the Feast feature store and the custom FeastRagRetriever. The notebook walks through: |
| 5 | + |
| 6 | +1. Data Preparation |
| 7 | + - Loads a subset of the Wikipedia DPR dataset (1% of training data) |
| 8 | + - Implements text chunking with configurable chunk size and overlap |
| 9 | + - Processes text into manageable passages with unique IDs |
| 10 | + |
| 11 | +2. Embedding Generation |
| 12 | + - Uses the `all-MiniLM-L6-v2` sentence transformer model |
| 13 | + - Generates 384-dimensional embeddings for text passages |
| 14 | + - Demonstrates batch processing with GPU support |
| 15 | + |
| 16 | +3. Feature Store Setup |
| 17 | + - Creates a Parquet file as the historical data source |
| 18 | + - Configures Feast with the feature repository |
| 19 | + - Demonstrates writing embeddings from the data source to the Milvus online store, which can later be used for model training |
| 20 | + |
| 21 | +4. RAG System Implementation |
| 22 | + - **Embedding Model**: `all-MiniLM-L6-v2` (configurable) |
| 23 | + - **Generator Model**: `granite-3.2-2b-instruct` (configurable) |
| 24 | + - **Vector Store**: Custom implementation with Feast integration |
| 25 | + - **Retriever**: Custom implementation extending Hugging Face's `RagRetriever` |
| 26 | + |
| 27 | +5. Query Demonstration |
| 28 | + - Performs inference with retrieved context |
| 29 | + |
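The chunking step in "Data Preparation" can be sketched as follows. This is a minimal illustration, not the notebook's code: the function names `chunk_text` and `make_passages`, the word-based splitting, and the `passage_id` format are all assumptions made for the example.

```python
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into chunks of `chunk_size` words that share `overlap` words.

    Hypothetical helper: the notebook may chunk by characters or tokens instead.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks


def make_passages(doc_id: str, text: str, chunk_size: int = 100,
                  overlap: int = 20) -> list[dict]:
    """Attach a unique ID (assumed format: '<doc_id>_<index>') to each chunk."""
    return [
        {"passage_id": f"{doc_id}_{i}", "text": chunk}
        for i, chunk in enumerate(chunk_text(text, chunk_size, overlap))
    ]
```

A larger overlap keeps more context shared between neighbouring passages at the cost of storing more near-duplicate text.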
| 30 | +## Requirements |
| 31 | + - A Kubernetes cluster with: |
| 32 | + - GPU nodes available (for model inference) |
| 33 | + - At least 200 GB of storage |
| 34 | + - A standalone Milvus deployment. See the [Milvus Helm chart](https://github.com/milvus-io/milvus-helm/tree/master/charts/milvus) for an example. |
| 35 | + |
| 36 | +## Running the example |
| 37 | +Clone the repository at https://github.com/feast-dev/feast.git and |
| 38 | +navigate to the `examples/rag-retriever` directory, which contains the following files: |
| 39 | + |
| 40 | +* **feature_repo/feature_store.yaml** |
| 41 | + This is the core configuration file for the RAG project's feature store, configuring a Milvus online store on a local provider. |
| 42 | + * To configure Milvus: |
| 43 | + - Update `feature_store.yaml` with your Milvus connection details: |
| 44 | + - host |
| 45 | + - port (default: 19530) |
| 46 | + - credentials (if required) |
| 47 | + |
| 48 | +* **feature_repo/ragproject_repo.py** |
| 49 | + This is the Feast feature repository configuration that defines the schema and data source for Wikipedia passage embeddings. |
| 50 | + |
| 51 | +* **rag_feast.ipynb** |
| 52 | + This is a notebook demonstrating the implementation of a RAG system using Feast feature store. The notebook provides: |
| 53 | + |
| 54 | + - A complete end-to-end example of building a RAG system with: |
| 55 | + - Data preparation using the Wiki DPR dataset |
| 56 | + - Text chunking and preprocessing |
| 57 | + - Vector embedding generation using sentence-transformers |
| 58 | + - Integration with Milvus vector store |
| 59 | + - Inference using a custom `RagRetriever`: `FeastRagRetriever` |
| 60 | + - Uses `all-MiniLM-L6-v2` for generating embeddings |
| 61 | + - Uses `granite-3.2-2b-instruct` as the generator model |
| 62 | + |
| 63 | +Open `rag_feast.ipynb` and follow the steps in the notebook to run the example. |
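Before running the notebook, it can help to confirm that the Milvus host and port set in `feature_store.yaml` accept connections. The sketch below uses only the Python standard library; the helper name `is_reachable` is hypothetical, and a successful TCP connection only shows that the port is open, not that Milvus itself is healthy.

```python
import socket


def is_reachable(host: str, port: int = 19530, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    19530 is the default Milvus port mentioned above. This is a plain socket
    check and does not use the Milvus client API.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `is_reachable("localhost")` checks the default port on the local machine; substitute the host from your own configuration.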
| 64 | + |
| 65 | +## FeastRagRetriever Low-Level Design |
| 66 | + |
| 67 | +<img src="images/FeastRagRetriever.png" width="800" height="450" alt="Low level design for feast rag retriever"> |
| 68 | + |
| 69 | +## Helpful Information |
| 70 | +- Ensure your Milvus instance is properly configured and running |
| 71 | +- Vector dimensions and similarity metrics can be adjusted in the feature store configuration |
| 72 | +- The example uses Wikipedia data, but the system can be adapted for other datasets |
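To make the role of the similarity metric concrete, here is a small pure-Python sketch of cosine-similarity ranking. In the example itself the search is performed by Milvus over 384-dimensional embeddings; the function names `cosine_similarity` and `top_k` and the toy 2-dimensional vectors are assumptions for illustration only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], passages: dict[str, list[float]],
          k: int = 3) -> list[tuple[str, float]]:
    """Rank passage IDs by similarity to the query and keep the best k."""
    scored = [(pid, cosine_similarity(query, vec)) for pid, vec in passages.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Changing the metric (for example to inner product or Euclidean distance) changes the ranking function, which is why the embedding dimension and metric in the feature store configuration need to match how the embeddings were produced.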