
Commit 169abdd

Updated 5_minutes_RAG_no_GPU (NVIDIA#239)
* updated requirements.txt for 5-min-rag-no-gpu
* add style.css file for 5-min-rag-no-gpu
* add Streamlit config folder for 5-min-rag-no-gpu
* updated README.md for 5-min-rag-no-gpu
* updated UI and certain deprecated models and functions in main.py for 5-min-rag-no-gpu
1 parent 2cfd448 commit 169abdd

File tree

5 files changed: +197 -47 lines changed

community/5_mins_rag_no_gpu/.streamlit/config.toml
community/5_mins_rag_no_gpu/README.md
community/5_mins_rag_no_gpu/main.py
community/5_mins_rag_no_gpu/requirements.txt
community/5_mins_rag_no_gpu/style.css
community/5_mins_rag_no_gpu/.streamlit/config.toml
Lines changed: 9 additions & 0 deletions

```diff
@@ -0,0 +1,9 @@
+[client]
+showErrorDetails = false
+
+[theme]
+primaryColor = "#76b900"
+backgroundColor = "white"
+
+[browser]
+gatherUsageStats = false
```
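Streamlit picks up `.streamlit/config.toml` automatically from the app's working directory, so `main.py` needs no extra wiring. As a quick sanity check (a hypothetical snippet, not part of this commit), the merged settings can be read back with `st.get_option`:

```python
import streamlit as st

# Streamlit merges .streamlit/config.toml into its runtime configuration;
# reading the options back confirms the file was picked up.
print(st.get_option("theme.primaryColor"))        # expected: "#76b900"
print(st.get_option("browser.gatherUsageStats"))  # expected: False
```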
community/5_mins_rag_no_gpu/README.md
Lines changed: 12 additions & 18 deletions

````diff
@@ -1,17 +1,13 @@
-# RAG in 5 Minutes
+# Tutorial for a Generic RAG-Based Chatbot
 
-This implementation is tied to the [YouTube video on NVIDIA Developer](https://youtu.be/N_OOfkEWcOk).
+This tutorial shows how to build your own generic RAG chatbot. It is intended as a foundation for building more complex, domain-specific RAG bots. Note that no GPU is needed to run it, because it uses NIM microservices from the NVIDIA API Catalog.
 
-This is a simple standalone implementation showing a minimal RAG pipeline that uses models available from [NVIDIA API Catalog](https://catalog.ngc.nvidia.com/ai-foundation-models).
-The catalog enables you to experience state-of-the-art LLMs accelerated by NVIDIA.
-Developers get free credits for 10K requests to any of the models.
+## Acknowledgements
 
-The example uses an [integration package to LangChain](https://python.langchain.com/docs/integrations/providers/nvidia) to access the models.
-NVIDIA engineers develop, test, and maintain the open source integration.
-This example uses a simple [Streamlit](https://streamlit.io/) based user interface and has a one-file implementation.
-Because the example uses the models from the NVIDIA API Catalog, you do not need a GPU to run the example.
+- This implementation is based on [RAG in 5 Minutes](https://github.com/NVIDIA/GenerativeAIExamples/tree/4e86d75c813bcc41d4e92e430019053920d08c94/community/5_mins_rag_no_gpu), with changes primarily made to the UI.
+- Alyssa Sawyer also contributed to updating and further developing this repo during her intern project, [Resume RAG Bot](https://github.com/alysawyer/resume-rag-nv), at NVIDIA.
 
-### Steps
+## Steps
 
 1. Create a python virtual environment and activate it:
 
@@ -20,10 +16,10 @@ Because the example uses the models from the NVIDIA API Catalog, you do not need
    source genai/bin/activate
    ```
 
-1. From the root of this repository, `GenerativeAIExamples`, install the requirements:
+1. From the root of this repository, install the requirements:
 
    ```console
-   pip install -r community/5_mins_rag_no_gpu/requirements.txt
+   pip install -r requirements.txt
    ```
 
 1. Add your NVIDIA API key as an environment variable:
@@ -32,17 +28,15 @@ Because the example uses the models from the NVIDIA API Catalog, you do not need
    export NVIDIA_API_KEY="nvapi-*"
    ```
 
-   If you don't already have an API key, visit the [NVIDIA API Catalog](https://build.ngc.nvidia.com/explore/), select on any model, then click on `Get API Key`.
+   If you don't already have an API key, visit the [NVIDIA API Catalog](https://build.ngc.nvidia.com/explore/), select any model, then click `Get API Key`.
 
 1. Run the example using Streamlit:
 
    ```console
-   streamlit run community/5_mins_rag_no_gpu/main.py
+   streamlit run main.py
    ```
 
 1. Test the deployed example by going to `http://<host_ip>:8501` in a web browser.
 
-   Click **Browse Files** and select your knowledge source.
-   After selecting, click **Upload!** to complete the ingestion process.
-
-   You are all set now! Try out queries related to the knowledge base using text from the user interface.
+   Click **Browse Files** and select the documents for your knowledge base.
+   After selecting, click **Upload!** to complete the ingestion process.
````
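The steps above assume `NVIDIA_API_KEY` is exported before the app launches. A small pre-flight check (a hypothetical snippet, not part of the commit) can catch a missing or malformed key early:

```python
import os
import sys

# The NVIDIA API Catalog issues keys prefixed with "nvapi-"; fail fast if absent.
key = os.environ.get("NVIDIA_API_KEY", "")
if not key.startswith("nvapi-"):
    sys.exit("NVIDIA_API_KEY is missing or malformed; export it before running Streamlit.")
```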

community/5_mins_rag_no_gpu/main.py

Lines changed: 93 additions & 27 deletions
```diff
@@ -13,110 +13,176 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-# This is a simple standalone implementation showing rag pipeline using Nvidia AI Foundational models.
+# This is a simple standalone implementation showing a RAG pipeline using NVIDIA AI Foundation Models.
 # It uses a simple Streamlit UI and one file implementation of a minimalistic RAG pipeline.
 
+
+############################################
+# Component #0.5 - UI / Header
+############################################
+
 import streamlit as st
 import os
-from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings
-from langchain.text_splitter import CharacterTextSplitter
-from langchain_community.document_loaders import DirectoryLoader
-from langchain_community.vectorstores import FAISS
-import pickle
-from langchain_core.output_parsers import StrOutputParser
-from langchain_core.prompts import ChatPromptTemplate
 
-st.set_page_config(layout="wide")
+# Page settings
+st.set_page_config(
+    layout="wide",
+    page_title="RAG Chatbot",
+    page_icon="🤖",
+    initial_sidebar_state="expanded")
+
+# Page title
+st.header('Generic RAG Chatbot Demo 🤖📝', divider='rainbow')
+
+# Custom CSS
+def local_css(file_name):
+    with open(file_name, "r") as f:
+        st.markdown(f"<style>{f.read()}</style>", unsafe_allow_html=True)
+local_css("style.css")
+
+# Page description
+st.markdown('''Manually looking through vast amounts of data can be tedious and time-consuming. This chatbot can expedite that process by providing a platform to query your documents.''')
+st.warning("This is a proof of concept, and any output from the AI agent should be used in conjunction with the original data.", icon="⚠️")
+
+############################################
+# Component #1 - Document Loader
+############################################
 
-# Component #1 - Document Upload
 with st.sidebar:
+    st.subheader("Upload Your Documents")
+
     DOCS_DIR = os.path.abspath("./uploaded_docs")
+
+    # Make dir to store uploaded documents
     if not os.path.exists(DOCS_DIR):
         os.makedirs(DOCS_DIR)
+
+    # Define form on Streamlit page for uploading files to KB
     st.subheader("Add to the Knowledge Base")
     with st.form("my-form", clear_on_submit=True):
         uploaded_files = st.file_uploader("Upload a file to the Knowledge Base:", accept_multiple_files=True)
         submitted = st.form_submit_button("Upload!")
 
+# Acknowledge successful file uploads
 if uploaded_files and submitted:
     for uploaded_file in uploaded_files:
         st.success(f"File {uploaded_file.name} uploaded successfully!")
         with open(os.path.join(DOCS_DIR, uploaded_file.name), "wb") as f:
             f.write(uploaded_file.read())
 
-# Component #2 - Embedding Model and LLM
-llm = ChatNVIDIA(model="meta/llama3-70b-instruct")
-document_embedder = NVIDIAEmbeddings(model="nvidia/nv-embedqa-e5-v5", model_type="passage")
+############################################
+# Component #2 - Initializing Embedding Model and LLM
+############################################
 
+from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings
+
+# Make sure to export your NGC NV-Developer API key as NVIDIA_API_KEY!
+API_KEY = os.environ['NVIDIA_API_KEY']
+
+# Select embedding model and LLM
+document_embedder = NVIDIAEmbeddings(model="NV-Embed-QA", api_key=API_KEY, model_type="passage", truncate="END")
+llm = ChatNVIDIA(model="meta/llama3-70b-instruct", api_key=API_KEY, temperature=0)
+
+############################################
 # Component #3 - Vector Database Store
+############################################
+
+import pickle
+from langchain.text_splitter import RecursiveCharacterTextSplitter
+from langchain_community.document_loaders import DirectoryLoader
+from langchain_community.vectorstores import FAISS
+from langchain_core.output_parsers import StrOutputParser
+from langchain_core.prompts import ChatPromptTemplate
+from langchain_core.retrievers import BaseRetriever
+
+# Option for using an existing vector store
 with st.sidebar:
     use_existing_vector_store = st.radio("Use existing vector store if available", ["Yes", "No"], horizontal=True)
 
-vector_store_path = "vectorstore.pkl"
+# Load raw documents from the directory
+DOCS_DIR = os.path.abspath("./uploaded_docs")
 raw_documents = DirectoryLoader(DOCS_DIR).load()
 
+# Check for existing vector store file
+vector_store_path = "vectorstore.pkl"
 vector_store_exists = os.path.exists(vector_store_path)
 vectorstore = None
+
 if use_existing_vector_store == "Yes" and vector_store_exists:
+    # Load existing vector store
     with open(vector_store_path, "rb") as f:
         vectorstore = pickle.load(f)
     with st.sidebar:
-        st.success("Existing vector store loaded successfully.")
+        st.info("Existing vector store loaded successfully.")
 else:
     with st.sidebar:
         if raw_documents and use_existing_vector_store == "Yes":
+            # Chunk documents
             with st.spinner("Splitting documents into chunks..."):
-                text_splitter = CharacterTextSplitter(chunk_size=512, chunk_overlap=200)
+                text_splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=100)
                 documents = text_splitter.split_documents(raw_documents)
 
+            # Convert document chunks to embeddings, and save in a vector store
             with st.spinner("Adding document chunks to vector database..."):
                 vectorstore = FAISS.from_documents(documents, document_embedder)
 
+            # Save vector store
             with st.spinner("Saving vector store"):
                 with open(vector_store_path, "wb") as f:
                     pickle.dump(vectorstore, f)
             st.success("Vector store created and saved.")
         else:
             st.warning("No documents available to process!", icon="⚠️")
 
+############################################
 # Component #4 - LLM Response Generation and Chat
-st.subheader("Chat with your AI Assistant, Envie!")
+############################################
+
+st.subheader("Query your data")
 
+# Save chat history for this user session
 if "messages" not in st.session_state:
     st.session_state.messages = []
 
 for message in st.session_state.messages:
     with st.chat_message(message["role"]):
         st.markdown(message["content"])
 
+# Define prompt for LLM
 prompt_template = ChatPromptTemplate.from_messages([
-    ("system", "You are a helpful AI assistant named Envie. If provided with context, use it to inform your responses. If no context is available, use your general knowledge to provide a helpful response."),
+    ("system", "You are a helpful AI assistant. Use the provided context to inform your responses. If no context is available, please state that."),
     ("human", "{input}")
 ])
 
+# Define simple prompt chain
 chain = prompt_template | llm | StrOutputParser()
 
-user_input = st.chat_input("Can you tell me what NVIDIA is known for?")
+# Display an example query for user
+user_query = st.chat_input("Please summarize these documents.")
 
-if user_input:
-    st.session_state.messages.append({"role": "user", "content": user_input})
+if user_query:
+    st.session_state.messages.append({"role": "user", "content": user_query})
     with st.chat_message("user"):
-        st.markdown(user_input)
+        st.markdown(user_query)
 
     with st.chat_message("assistant"):
         message_placeholder = st.empty()
         full_response = ""
 
         if vectorstore is not None and use_existing_vector_store == "Yes":
+            # Retrieve relevant chunks for the given user query from the vector store
             retriever = vectorstore.as_retriever()
-            docs = retriever.invoke(user_input)
-            context = "\n\n".join([doc.page_content for doc in docs])
-            augmented_user_input = f"Context: {context}\n\nQuestion: {user_input}\n"
+            retrieved_docs = retriever.invoke(user_query)
+
+            # Concatenate retrieved chunks together as context for LLM
+            context = "\n\n".join([doc.page_content for doc in retrieved_docs])
+            augmented_user_input = f"Context: {context}\n\nQuestion: {user_query}\n"
         else:
-            augmented_user_input = f"Question: {user_input}\n"
+            augmented_user_input = f"Question: {user_query}\n"
 
+        # Get output from LLM
         for response in chain.stream({"input": augmented_user_input}):
             full_response += response
             message_placeholder.markdown(full_response + "▌")
         message_placeholder.markdown(full_response)
-    st.session_state.messages.append({"role": "assistant", "content": full_response})
+    st.session_state.messages.append({"role": "assistant", "content": full_response})
```
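For reference, the retrieval flow that main.py now implements can also be exercised without the Streamlit layer. Below is a minimal sketch (not part of the commit) of the same load → chunk → embed → retrieve → generate pipeline, assuming `NVIDIA_API_KEY` is exported and `./uploaded_docs` contains at least one document:

```python
import os

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import DirectoryLoader
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings

# Load and chunk documents, mirroring Component #3 above.
raw_documents = DirectoryLoader(os.path.abspath("./uploaded_docs")).load()
splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=100)
documents = splitter.split_documents(raw_documents)

# Embed chunks into an in-memory FAISS index (Components #2 and #3).
document_embedder = NVIDIAEmbeddings(model="NV-Embed-QA", model_type="passage", truncate="END")
vectorstore = FAISS.from_documents(documents, document_embedder)

# Build the same prompt | llm | parser chain as Component #4.
llm = ChatNVIDIA(model="meta/llama3-70b-instruct", temperature=0)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant. Use the provided context to inform your responses."),
    ("human", "{input}"),
])
chain = prompt | llm | StrOutputParser()

# Retrieve context for a query and generate an answer.
query = "Please summarize these documents."
docs = vectorstore.as_retriever().invoke(query)
context = "\n\n".join(doc.page_content for doc in docs)
print(chain.invoke({"input": f"Context: {context}\n\nQuestion: {query}\n"}))
```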
community/5_mins_rag_no_gpu/requirements.txt
Lines changed: 10 additions & 2 deletions

```diff
@@ -1,5 +1,13 @@
-streamlit==1.30.0
+streamlit
 faiss-cpu==1.7.4
-langchain==0.1.20
 unstructured[all-docs]==0.11.2
+langchain
+langchain-community
+langchain-core
 langchain-nvidia-ai-endpoints
+langchain-text-splitters
+nltk==3.8.1
+numpy==1.23.5
+onnx==1.16.1
+onnxruntime==1.15.1
+python-magic
```
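Because several packages are now unpinned, the resolved versions will drift over time. A small helper (hypothetical, not part of the commit) can record what actually got installed, which helps reproduce a working environment later:

```python
from importlib.metadata import PackageNotFoundError, version

# Print the resolved version of each unpinned dependency for reproducibility.
for pkg in ("streamlit", "langchain", "langchain-community", "langchain-core",
            "langchain-nvidia-ai-endpoints", "langchain-text-splitters", "python-magic"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg} is not installed")
```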
community/5_mins_rag_no_gpu/style.css
Lines changed: 73 additions & 0 deletions

```diff
@@ -0,0 +1,73 @@
+/* style.css */
+
+/* custom footer */
+.footer {
+    text-align: center;
+    color: #666;
+    font-size: 14px;
+}
+
+/* NVIDIA green for headers */
+h1, h2, h3, h4, h5 {
+    color: #76b900;
+}
+
+
+/* add line when hovering over link */
+.hover-link {
+    text-decoration: none;
+    color: inherit;
+    position: relative;
+}
+
+.hover-link::after {
+    content: '';
+    position: absolute;
+    width: 100%;
+    height: 1px;
+    bottom: 0;
+    left: 0;
+    background-color: #000;
+    transform: scaleX(0);
+    transition: transform 0.3s ease-in-out;
+}
+
+.hover-link:hover::after {
+    transform: scaleX(1);
+}
+
+/* Remove default formatting for links */
+a {
+    color: #666;
+    text-decoration: none;
+}
+
+/* Remove streamlit bar */
+header {
+    visibility: hidden;
+}
+
+/* custom container */
+
+.custom-image-container img {
+    border-radius: 10px;
+}
+
+.custom-column-container {
+    background-color: #f0f0f0;
+    border-radius: 10px;
+    padding: 20px;
+}
+
+.custom-column-container .stMarkdown {
+    padding-right: 20px;
+}
+
+.streamlit-expanderHeader {
+    background-color: white;
+    color: #76b900;
+}
+.streamlit-expanderContent {
+    background-color: white;
+    color: black;
+}
```
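These rules only take effect once the stylesheet is injected into the page, which main.py does through its `local_css("style.css")` helper. For example, a hypothetical footer using the `.footer` class could then be rendered with:

```python
import streamlit as st

# Render HTML styled by the .footer rule in style.css
# (assumes local_css("style.css") has already injected the stylesheet).
st.markdown('<div class="footer">Generic RAG Chatbot Demo</div>', unsafe_allow_html=True)
```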
