Commit 16e754b

fixed llamaindex basic RAG location (NVIDIA#223)
1 parent a1d7056 commit 16e754b

4 files changed

+219
-0
lines changed

File renamed without changes.
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
# Building and Deploying LLM Assistants in Cloud

This application implements a GPU-accelerated Retrieval-Augmented Generation (RAG) question-answering system using NVIDIA Inference Microservices (NIMs) and the LlamaIndex framework. Users can upload documents, process them, and then ask questions about their content.

## Features

- Document loading and processing
- Vector storage and retrieval using Milvus
- Question-answering capabilities using NIMs
- Interactive chat interface built with Gradio

## Installation

1. Clone this repository:
   ```
   git clone https://github.com/NVIDIA/GenerativeAIExamples.git
   cd GenerativeAIExamples/community/llm_video_series/video_1_llm_assistant_cloud_app
   ```

2. Create a virtual environment (using Python 3.9 as an example):
   - Using `venv`:
     ```
     python3.9 -m venv venv
     source venv/bin/activate
     ```
   - Using `conda`:
     ```
     conda create -n llm-assistant-env python=3.9
     conda activate llm-assistant-env
     ```

3. Install the required Python libraries from `requirements.txt`:
   ```
   pip install -r requirements.txt
   ```

4. Set up your NVIDIA API key:
   - Sign up for an NVIDIA API key on [build.nvidia.com](https://build.nvidia.com) if you haven't already.
   - Set the API key as an environment variable:
     ```
     export NVIDIA_API_KEY='your-api-key-here'
     ```
   - Alternatively, uncomment the following line in the script and add your API key:
     ```python
     os.environ["NVIDIA_API_KEY"] = 'nvapi-XXXXXXXXXXXXXXXXXXXXXX'  # Add NVIDIA API key
     ```

## Usage

1. Run the script:
   ```
   python app.py
   ```

2. Open the provided URL in your web browser to access the Gradio interface.

3. Use the interface to:
   - Upload document files
   - Load and process the documents
   - Ask questions about the loaded documents

## How It Works

1. **Document Loading**: Users upload one or more document files through the Gradio interface.

2. **Document Processing**: The application uses LlamaIndex to read the uploaded documents and split them into chunks.

3. **Embedding and Indexing**: The chunks are embedded using NVIDIA's embedding model and stored in a Milvus vector database.

4. **Question Answering**: Users ask questions through the chat interface. The application retrieves the most relevant chunks from the index and passes them to a cloud-hosted NIM running Llama 3 70B Instruct, which generates the response.
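
The four steps above can be illustrated with a minimal, dependency-free sketch. It is not the application's implementation: the bag-of-words `embed` function stands in for NVIDIA's embedding model, a plain Python list stands in for Milvus, and the final prompt is only built, not sent to an LLM. All function names here are hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=8):
    """Split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, index, top_k=2):
    """Return the `top_k` chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine_similarity(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question, context_chunks):
    """Combine the retrieved chunks and the question into one LLM prompt."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Steps 1-2: load a document and split it into chunks.
document = ("Milvus stores the embedding vectors for every chunk. "
            "Gradio provides the chat interface in the browser.")
chunks = split_into_chunks(document, chunk_size=8)

# Step 3: embed each chunk and keep (chunk, vector) pairs as the "index".
index = [(chunk, embed(chunk)) for chunk in chunks]

# Step 4: retrieve the best-matching chunk and build the prompt for the LLM.
question = "Which component stores the vectors?"
top = retrieve(question, index, top_k=1)
prompt = build_prompt(question, top)
```

In the real application, the chunking, embedding, storage, and retrieval steps are handled by LlamaIndex, the NVIDIA embedding model, and Milvus; the sketch only shows how the pieces relate.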

## Customization

You can customize various aspects of the application:

- Change the chunk size for text splitting (`chunk_size` in `SentenceSplitter`, set to 500 in the script)
- Use different NVIDIA or open-source models for embedding or language modeling
- Adjust the number of similar documents retrieved for each query (`similarity_top_k`, set to 5 in the script)
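
One possible way to keep these knobs in a single place is a small settings object populated from environment variables. This is only a sketch: the `RagSettings` class, the `load_settings` helper, and the `RAG_*` variable names are hypothetical and not part of the application. The default values mirror those used in the script.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RagSettings:
    """Hypothetical container for the tunable values used by the script."""
    chunk_size: int = 500                        # SentenceSplitter(chunk_size=...)
    embed_model: str = "NV-Embed-QA"             # NVIDIAEmbedding(model=...)
    llm_model: str = "meta/llama3-70b-instruct"  # NVIDIA(model=...)
    similarity_top_k: int = 5                    # as_query_engine(similarity_top_k=...)


def _positive_int(env, name, default):
    """Read an integer setting and reject zero or negative values."""
    value = int(env.get(name, default))
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {value}")
    return value


def load_settings(env=None):
    """Build settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    defaults = RagSettings()
    return RagSettings(
        chunk_size=_positive_int(env, "RAG_CHUNK_SIZE", defaults.chunk_size),
        embed_model=env.get("RAG_EMBED_MODEL", defaults.embed_model),
        llm_model=env.get("RAG_LLM_MODEL", defaults.llm_model),
        similarity_top_k=_positive_int(env, "RAG_TOP_K", defaults.similarity_top_k),
    )


# Example: smaller chunks and fewer retrieved documents, models left at defaults.
settings = load_settings({"RAG_CHUNK_SIZE": "256", "RAG_TOP_K": "3"})
```

The resulting values would then be passed to the calls named in the comments. Note that if you switch to an embedding model with a different output size, the `dim` argument of `MilvusVectorStore` (1024 in the script) has to match it.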

## Troubleshooting

If you encounter any issues:

1. Ensure your NVIDIA API key is set correctly.
2. Check that all required libraries are installed correctly.
3. Verify that the Milvus database is properly initialized.
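
The first two checks can be automated with a short pre-flight script. The following is a minimal sketch, assuming the top-level module names `gradio`, `llama_index`, and `pymilvus`; the `check_environment` helper is hypothetical and not part of the application. It does not verify the Milvus database itself.

```python
import importlib.util
import os


def check_environment(env=None, modules=("gradio", "llama_index", "pymilvus")):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = []

    # Check 1: the API key must be present and non-empty.
    if not env.get("NVIDIA_API_KEY"):
        problems.append("NVIDIA_API_KEY is not set")

    # Check 2: each required top-level module must be importable.
    for name in modules:
        if importlib.util.find_spec(name) is None:
            problems.append(f"module not installed: {name}")

    return problems


if __name__ == "__main__":
    found = check_environment()
    for problem in found:
        print(problem)
    if not found:
        print("Environment looks OK")
```

Running this before `python app.py` gives a quick indication of whether a failure is caused by the environment rather than by the documents or the query.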
Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
# SPDX-FileCopyrightText: Copyright (c) 2023-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Import necessary libraries
import os
import gradio as gr
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, StorageContext
from llama_index.vector_stores.milvus import MilvusVectorStore
from llama_index.embeddings.nvidia import NVIDIAEmbedding
from llama_index.llms.nvidia import NVIDIA
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core import Settings

# os.environ["NVIDIA_API_KEY"] = 'nvapi-XXXXXXX'  # Alternatively, set the environment variable

# Check that the NVIDIA API key is set before creating any NVIDIA clients
if os.getenv('NVIDIA_API_KEY') is None:
    raise ValueError("NVIDIA_API_KEY environment variable is not set")

# Configure settings for the application
Settings.text_splitter = SentenceSplitter(chunk_size=500)
Settings.embed_model = NVIDIAEmbedding(model="NV-Embed-QA", truncate="END")
Settings.llm = NVIDIA(model="meta/llama3-70b-instruct")

# Initialize global variables for the index and query engine
index = None
query_engine = None

# Function to get file names from file objects
def get_files_from_input(file_objs):
    if not file_objs:
        return []
    return [file_obj.name for file_obj in file_objs]

# Function to load documents and create the index
def load_documents(file_objs, progress=gr.Progress()):
    global index, query_engine
    try:
        if not file_objs:
            return "Error: No files selected."

        file_paths = get_files_from_input(file_objs)
        documents = []
        for file_path in file_paths:
            documents.extend(SimpleDirectoryReader(input_files=[file_path]).load_data())

        if not documents:
            return "No documents found in the selected files."

        # Create a Milvus vector store and storage context
        vector_store = MilvusVectorStore(uri="./milvus_demo.db", dim=1024, overwrite=True)
        storage_context = StorageContext.from_defaults(vector_store=vector_store)

        # Create the index from the documents
        index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

        # Create the query engine
        query_engine = index.as_query_engine(similarity_top_k=5, streaming=True)
        return f"Successfully loaded {len(documents)} documents from {len(file_paths)} files."
    except Exception as e:
        return f"Error loading documents: {str(e)}"

# Function to handle chat interactions (non-streaming; not wired to the UI below)
def chat(message, history):
    global query_engine
    if query_engine is None:
        # Show the notice as the assistant's reply to the user's message
        return history + [(message, "Please load documents first.")]
    try:
        response = query_engine.query(message)
        # Convert the response object to text before adding it to the history
        return history + [(message, str(response))]
    except Exception as e:
        return history + [(message, f"Error processing query: {str(e)}")]

# Function to stream responses
def stream_response(message, history):
    global query_engine
    if query_engine is None:
        # Show the notice as the assistant's reply to the user's message
        yield history + [(message, "Please load documents first.")]
        return

    try:
        response = query_engine.query(message)
        partial_response = ""
        for text in response.response_gen:
            partial_response += text
            yield history + [(message, partial_response)]
    except Exception as e:
        yield history + [(message, f"Error processing query: {str(e)}")]

# Create the Gradio interface
with gr.Blocks() as demo:
    gr.Markdown("# RAG Q&A Chat Application")

    with gr.Row():
        file_input = gr.File(label="Select files to load", file_count="multiple")
        load_btn = gr.Button("Load Documents")

    load_output = gr.Textbox(label="Load Status")

    chatbot = gr.Chatbot()
    msg = gr.Textbox(label="Enter your question", interactive=True)
    clear = gr.Button("Clear")

    # Set up event handlers
    load_btn.click(load_documents, inputs=[file_input], outputs=[load_output], show_progress="hidden")
    msg.submit(stream_response, inputs=[msg, chatbot], outputs=[chatbot])
    msg.submit(lambda: "", outputs=[msg])  # Clear input box after submission
    clear.click(lambda: None, None, chatbot, queue=False)

# Launch the Gradio interface
if __name__ == "__main__":
    demo.launch()
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
llama-index-core==0.10.58
llama-index-readers-file==0.1.30
llama-index-llms-nvidia==0.1.4
llama-index-embeddings-nvidia==0.1.4
llama-index-vector-stores-milvus==0.1.20
pymilvus==2.4.4
gradio==4.37.2
