wangtao208208
diff --git a/‎community/README.md‎
Lines changed: 5 additions & 1 deletion b/‎community/README.md‎
diff --git a/‎community/chat-and-rag-glean/README.md‎
Lines changed: 115 additions & 0 deletions b/‎community/chat-and-rag-glean/README.md‎
diff --git a/‎community/chat-and-rag-glean/chat_interface_2.png‎
173 KB b/‎community/chat-and-rag-glean/chat_interface_2.png‎
diff --git a/‎community/chat-and-rag-glean/chat_interfaced_1.png‎
159 KB b/‎community/chat-and-rag-glean/chat_interfaced_1.png‎
diff --git a/‎community/chat-and-rag-glean/glean_example/src/agent.py‎
Lines changed: 125 additions & 0 deletions b/‎community/chat-and-rag-glean/glean_example/src/agent.py‎
diff --git a/‎community/chat-and-rag-glean/glean_example/src/app/app.py‎
Lines changed: 63 additions & 0 deletions b/‎community/chat-and-rag-glean/glean_example/src/app/app.py‎
diff --git a/‎community/chat-and-rag-glean/glean_example/src/app/css.py‎
Lines changed: 42 additions & 0 deletions b/‎community/chat-and-rag-glean/glean_example/src/app/css.py‎
diff --git a/‎community/chat-and-rag-glean/glean_example/src/app/style.css‎
Lines changed: 16 additions & 0 deletions b/‎community/chat-and-rag-glean/glean_example/src/app/style.css‎
diff --git a/‎community/chat-and-rag-glean/glean_example/src/glean_utils/test_glean_search.py‎
Lines changed: 18 additions & 0 deletions b/‎community/chat-and-rag-glean/glean_example/src/glean_utils/test_glean_search.py‎
@@ -66,4 +66,8 @@ Community examples are sample code and deployments for RAG pipelines that are no
 
 * [LLM Prompt Design Helper using NIM](./llm-prompt-design-helper/)
 
-  This tool demonstrates how to utilize a user-friendly interface to interact with NVIDIA NIMs, including those available in the API catalog, self-deployed NIM endpoints, and NIMs hosted on Hugging Face. It also provides settings to integrate RAG pipelines with either local and temporary vector stores or self-hosted search engines. Developers can use this tool to design system prompts, few-shot prompts, and configure LLM settings.
+  This tool demonstrates how to utilize a user-friendly interface to interact with NVIDIA NIMs, including those available in the API catalog, self-deployed NIM endpoints, and NIMs hosted on Hugging Face. It also provides settings to integrate RAG pipelines with either local and temporary vector stores or self-hosted search engines. Developers can use this tool to design system prompts, few-shot prompts, and configure LLM settings.
+
+* [Chatbot with RAG and Glean](./chat-and-rag-glean/)
+
+  This tool shows how to build a chat interface that uses NVIDIA NIMs along with the Glean Search API to enable internal knowledge base search, chat, and retrieval.
@@ -0,0 +1,115 @@
+# Enterprise Knowledge Base Chatbot
+
+This repository includes a demo of a simple chatbot that answers questions based on a company's internal knowledge repository.
+
+![chat_interface_1](./chat_interfaced_1.png)
+
+
+![chat_interface_2](./chat_interface_2.png)
+
+
+The implementation includes:
+
+- Gradio chat interface 
+- LangGraph agent
+- NVIDIA NIM microservices
+- Chroma DB for a lightweight vector DB
+- An internal knowledge base stored in Glean and available over the Glean Search API
+
+This example uses NVIDIA NIMs, which can be hosted completely on-premise. Combined with the Glean on-premise offering, this allows organizations to use LLMs for internal knowledge search, chat, and retrieval without any data leaving their environment.
+
+The example architecture and possible extensions are shown below.
+
+![sample_architecture](./glean_example_architecture.png)
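
The agent in `glean_example/src/agent.py` runs five steps in a fixed order: `create_glean_query`, `call_glean`, `add_embeddings`, `answer_candidates`, and `call_bot`. The sketch below mimics that flow in plain Python so the data movement is easy to follow. It is an illustration only: the LLM calls, the Glean Search API, and Chroma are all replaced by simple stand-ins (word overlap instead of embeddings), and `FAKE_KNOWLEDGE_BASE`, `State`, `words`, and `run_pipeline` are hypothetical names that do not exist in the example code.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# Made-up documents standing in for an internal knowledge base.
FAKE_KNOWLEDGE_BASE = [
    "Sick leave policy: employees do not need to use PTO when they are sick.",
    "Holiday calendar: the office is closed on public holidays.",
]


@dataclass
class State:
    """Simplified stand-in for the agent state that is passed between steps."""
    messages: List[Tuple[str, str]] = field(default_factory=list)
    glean_query: Optional[str] = None
    glean_results: List[str] = field(default_factory=list)
    index: List[Tuple[str, Set[str]]] = field(default_factory=list)
    answer_candidate: Optional[str] = None


def words(text: str) -> Set[str]:
    """Lowercased word set, a crude stand-in for an embedding."""
    return set(re.findall(r"\w+", text.lower()))


def create_glean_query(state: State) -> State:
    # Stand-in for the LLM call: reuse the latest user message as the query.
    state.glean_query = state.messages[-1][1]
    return state


def call_glean(state: State) -> State:
    # Stand-in for the Glean Search API: keep documents sharing a word with the query.
    query_words = words(state.glean_query)
    state.glean_results = [d for d in FAKE_KNOWLEDGE_BASE if query_words & words(d)]
    return state


def add_embeddings(state: State) -> State:
    # Stand-in for the vector DB: index each search result by its word set.
    state.index = [(d, words(d)) for d in state.glean_results]
    return state


def answer_candidates(state: State) -> State:
    # Stand-in for similarity search: pick the document with the largest overlap.
    query_words = words(state.messages[-1][1])
    if state.index:
        best_doc, _ = max(state.index, key=lambda item: len(query_words & item[1]))
        state.answer_candidate = best_doc
    return state


def call_bot(state: State) -> State:
    # Stand-in for the final LLM call: echo the best candidate as the reply.
    reply = state.answer_candidate or "No relevant document found."
    state.messages.append(("agent", reply))
    return state


def run_pipeline(state: State) -> State:
    """Run the five steps in the same order as the graph edges in agent.py."""
    for step in (create_glean_query, call_glean, add_embeddings, answer_candidates, call_bot):
        state = step(state)
    return state


if __name__ == "__main__":
    result = run_pipeline(State(messages=[("user", "do I need to take PTO if I am sick")]))
    print(result.messages[-1][1])
```

In the real agent each step has the same shape (take the state, update one field, return the state), which is what lets LangGraph chain them with simple edges.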
+
+## Pre-requisites 
+
+This example uses hosted NVIDIA NIMs for the foundational LLMs. To use these hosted LLMs, you will need an NVIDIA API key, which is available at https://build.nvidia.com.
+
+```bash
+export NVIDIA_API_KEY="nvapi-YOUR-KEY"
+```
+
+This example also requires a Glean instance and API key. We recommend using a development sandbox for initial testing.
+
+```bash
+export GLEAN_API_KEY="YOUR-GLEAN-API-KEY"
+export GLEAN_API_BASE_URL="https://your-org.glean.com/rest/api/v1"
+```
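
Before launching the app, it can help to confirm that all three variables are set, since a missing key otherwise only surfaces when the first request fails. A minimal sketch, assuming a hypothetical helper `missing_env_vars` that is not part of this example's code:

```python
import os

# Environment variables named in the prerequisites above.
REQUIRED_VARS = ("NVIDIA_API_KEY", "GLEAN_API_KEY", "GLEAN_API_BASE_URL")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ
    and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```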
+
+## Getting Started - Demo Application
+
+-  Clone the repository and navigate to this example.
+
+    ```bash
+    git clone https://github.com/NVIDIA/GenerativeAIExamples
+    cd GenerativeAIExamples/community/chat-and-rag-glean
+    ```
+
+-  Install the necessary dependencies. We recommend using `uv` as the Python installation and package manager.
+
+    ```bash
+    curl -LsSf https://astral.sh/uv/install.sh | sh # install uv
+    uv python install  # install python
+    uv sync  # install the dependencies for this project
+    ```
+
+- Run the chat app
+
+    ```bash
+    uv run glean_example/src/app/app.py
+    ```
+
+After running this command, open a browser window to `http://127.0.0.1:7860`. The web application allows a user to enter a prompt. The log panel in the application shows the main steps taken to answer the prompt, and full logs are displayed in the terminal.
+
+### Customizing the LLMs
+
+The LLM and the embedding model used by the agent are specified in the file `glean_example/src/agent.py`:
+
+```python
+model = ChatNVIDIA(
+    model="meta/llama-3.3-70b-instruct", api_key=os.getenv("NVIDIA_API_KEY")
+)
+embeddings = NVIDIAEmbeddings(
+    model="nvidia/llama-3.2-nv-embedqa-1b-v2",
+    api_key=os.getenv("NVIDIA_API_KEY"),
+    truncate="NONE",
+)
+```
+
+
+The main LLM used is `meta/llama-3.3-70b-instruct`. Update this model name to use a different LLM.
+
+The main embedding model used is `nvidia/llama-3.2-nv-embedqa-1b-v2`. Update this model name to use a different embedding model.
+
+### Using on-prem 
+
+You may want to build an application similar to this demo that is hosted on-premise or in your private cloud so that no internal data leaves your systems.
+
+- Ensure you are using the [Glean "Cloud-prem" option](https://help.glean.com/en/articles/10093412-glean-deployment-options). Update the `GLEAN_API_BASE_URL` to use your on-prem Glean installation. 
+- Follow the appropriate [NVIDIA NIM deployment guide](https://docs.nvidia.com/nim/large-language-models/latest/deployment-guide.html) for your environment. You will need to deploy at least one NVIDIA NIM foundational LLM and one NVIDIA NIM embedding model. The result of following this guide will be two on-premise URL endpoints.
+- Update the file `glean_example/src/agent.py` to use the on-prem endpoints: 
+
+    ```python
+    model = ChatNVIDIA(
+        model="meta/llama-3.3-70b-instruct",
+        base_url="http://localhost:8000/v1", # Update to the on-prem URL where your NVIDIA NIM LLM is running
+        api_key=os.getenv("NVIDIA_API_KEY")
+    )
+    embeddings = NVIDIAEmbeddings(
+        model="nvidia/llama-3.2-nv-embedqa-1b-v2",
+        base_url="http://localhost:8000/v1", # Update to the on-prem URL where your NVIDIA NIM embedding model is running (a separate endpoint from the LLM)
+        api_key=os.getenv("NVIDIA_API_KEY"),
+        truncate="NONE",
+    )
+    ```
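
One way to avoid editing `agent.py` each time you switch between hosted and on-prem endpoints is to read the URLs from the environment. The sketch below is an assumption rather than part of this example: `nim_client_kwargs` is a hypothetical helper, and `NIM_LLM_BASE_URL` and `NIM_EMBEDDING_BASE_URL` are made-up variable names. It only builds the keyword arguments; passing them to `ChatNVIDIA` and `NVIDIAEmbeddings` is left as a comment so the snippet runs without those packages installed.

```python
import os


def nim_client_kwargs(model, base_url_var, env=None):
    """Build keyword arguments for a NIM client.

    `base_url_var` names a hypothetical environment variable holding the
    on-prem URL. When it is unset, no base_url is included, so the client
    is constructed the same way as in the hosted configuration.
    """
    env = os.environ if env is None else env
    kwargs = {"model": model, "api_key": env.get("NVIDIA_API_KEY")}
    base_url = env.get(base_url_var)
    if base_url:
        kwargs["base_url"] = base_url
    return kwargs


# The LLM and the embedding model each get their own endpoint variable.
llm_kwargs = nim_client_kwargs("meta/llama-3.3-70b-instruct", "NIM_LLM_BASE_URL")
embedding_kwargs = nim_client_kwargs(
    "nvidia/llama-3.2-nv-embedqa-1b-v2", "NIM_EMBEDDING_BASE_URL"
)
embedding_kwargs["truncate"] = "NONE"

# In agent.py these would be used as:
#   model = ChatNVIDIA(**llm_kwargs)
#   embeddings = NVIDIAEmbeddings(**embedding_kwargs)
```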
+
+
+
+## Getting Started - Jupyter Notebook
+
+Further details about the code, and an example that calls the chatbot without a web application, are available in the Jupyter Notebook `nvidia_nim_langgraph_glean_example.ipynb`.
+
+```bash
+uv run jupyter lab
+```
@@ -0,0 +1,125 @@
+import os
+from langchain_chroma import Chroma
+from typing import List, Tuple, Optional, Any
+from langgraph.graph import StateGraph, START, END
+from pydantic import BaseModel
+from glean_example.src.glean_utils.utils import (
+    glean_search,
+    documents_from_glean_response,
+)
+from glean_example.src.prompts import PROMPT_GLEAN_QUERY, PROMPT_ANSWER
+from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings
+import logging
+
+model = ChatNVIDIA(
+    model="meta/llama-3.3-70b-instruct", api_key=os.getenv("NVIDIA_API_KEY")
+)
+embeddings = NVIDIAEmbeddings(
+    model="nvidia/llama-3.2-nv-embedqa-1b-v2",
+    api_key=os.getenv("NVIDIA_API_KEY"),
+    truncate="NONE",
+)
+
+glean_api_key = os.getenv("GLEAN_API_KEY")
+base_url = os.getenv("GLEAN_API_BASE_URL") 
+chroma_db_path = "."
+
+logger = logging.getLogger("gradio_log")
+
+
+class InfoBotState(BaseModel):
+    messages: Optional[List[Tuple[str, str]]] = None
+    glean_query: Optional[str] = None
+    glean_results: Optional[List[str]] = None
+    db: Optional[Any] = None
+    answer_candidate: Optional[str] = None
+
+
+def call_glean(state: InfoBotState):
+    """Call the Glean Search API with the generated query and store the returned documents in the state"""
+    logger.info("Calling Glean")
+    response = glean_search(
+        query=state.glean_query, api_key=glean_api_key, base_url=base_url
+    )
+    state.glean_results = documents_from_glean_response(response)
+    return state
+
+
+def add_embeddings(state: InfoBotState):
+    """Update the vector DB with glean search results"""
+    logger.info("Adding Embeddings")
+    db = Chroma.from_texts(
+        state.glean_results, embedding=embeddings, persist_directory=chroma_db_path
+    )
+    state.db = db
+    return state
+
+
+def answer_candidates(state: InfoBotState):
+    """Use RAG to get most likely answer"""
+    logger.info("RAG on Embeddings")
+    most_recent_message: Tuple[str, str] = state.messages[-1]
+    role, query = most_recent_message
+    retriever = state.db.as_retriever(search_kwargs={"k": 1})
+    docs = retriever.invoke(query)
+    # Guard against an empty retrieval result to avoid an IndexError
+    state.answer_candidate = docs[0].page_content if docs else None
+    return state
+
+
+def create_glean_query(state: InfoBotState):
+    """parses the user message and creates an appropriate glean query"""
+    logger.info("Glean Query from User Message")
+    most_recent_message: Tuple[str, str] = state.messages[-1]
+    role, query = most_recent_message
+
+    llm = PROMPT_GLEAN_QUERY | model
+    response = llm.invoke({"query": query})
+
+    state.glean_query = response.content
+
+    return state
+
+
+def call_bot(state: InfoBotState):
+    """the main agent responsible for taking all the context and answering the question"""
+    logger.info("Generate final answer")
+
+    llm = PROMPT_ANSWER | model
+
+    response = llm.invoke(
+        {
+            "messages": state.messages,
+            "glean_query": state.glean_query,
+            "glean_search_result_documents": state.glean_results,
+            "answer_candidate": state.answer_candidate,
+        }
+    )
+    state.messages.append(("agent", response.content))
+    return state
+
+
+# Define the graph
+
+graph = StateGraph(InfoBotState)
+graph.add_node("call_bot", call_bot)
+graph.add_node("call_glean", call_glean)
+graph.add_node("answer_candidates", answer_candidates)
+graph.add_node("create_glean_query", create_glean_query)
+graph.add_node("add_embeddings", add_embeddings)
+
+graph.add_edge(START, "create_glean_query")
+graph.add_edge("create_glean_query", "call_glean")
+graph.add_edge("call_glean", "add_embeddings")
+graph.add_edge("add_embeddings", "answer_candidates")
+graph.add_edge("answer_candidates", "call_bot")
+graph.add_edge("call_bot", END)
+agent = graph.compile()
+
+
+if __name__ == "__main__":
+    # Configure logging so the INFO-level messages are actually printed
+    logging.basicConfig(level=logging.INFO)
+    msg = "do I need to take PTO if I am sick"
+    history = [("user", msg)]
+    response = agent.invoke({"messages": history})
+    logger.info(response["messages"][-1][1])
@@ -0,0 +1,63 @@
+import logging
+
+import gradio as gr
+from glean_example.src.app.css import css, theme
+from glean_example.src.agent import agent
+from typing import List
+from pathlib import Path
+from gradio_log import Log
+
+log_file = "/tmp/gradio_log.txt"
+Path(log_file).touch()
+
+ch = logging.FileHandler(log_file)
+ch.setLevel(logging.DEBUG)
+
+
+logger = logging.getLogger("gradio_log")
+logger.setLevel(logging.DEBUG)
+# Iterate over a copy, since removeHandler mutates logger.handlers
+for handler in list(logger.handlers):
+    logger.removeHandler(handler)
+logger.addHandler(ch)
+
+
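+def convert_to_langchain_history(history: List):
+    """Convert Gradio chat history into (role, text) tuples for the agent.
+
+    Assumes Gradio's pair format, where each entry is [user_message, bot_message].
+    """
+    langchain_history = []
+    for user_msg, bot_msg in history:
+        if user_msg:
+            langchain_history.append(("user", user_msg))
+        if bot_msg:
+            # Earlier bot replies keep the "system" role used for non-user messages
+            langchain_history.append(("system", bot_msg))
+
+    return langchain_history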
+
+
+def agent_predict(msg: str, history: List) -> str:
+    history = convert_to_langchain_history(history)
+
+    history.append(("user", msg))
+    response = agent.invoke(input={"messages": history})
+    return response["messages"][-1][1]
+
+
+chatbot = gr.Chatbot(label="NVBot Lite", elem_id="chatbot", show_copy_button=True)
+
+with gr.Blocks(theme=theme, css=css) as chat:
+    chat_interface = gr.ChatInterface(
+        fn=agent_predict,
+        chatbot=chatbot,
+        title="NVIDIA Information Demo",
+        autofocus=True,
+        fill_height=True,
+    )
+
+    Log(log_file=log_file)
+
+# chat_interface.render()
+
+if __name__ == "__main__":
+    chat.queue().launch(share=False)
@@ -0,0 +1,42 @@
+import os
+import pathlib
+import gradio as gr
+
+bot_title = os.getenv("BOT_TITLE", "NVIDIA Inference Microservice")
+
+header = f"""
+<span style="color:#76B900;font-weight:600;font-size:28px">
+{bot_title}
+</span>
+"""
+
+styles = pathlib.Path(__file__).parent.joinpath("style.css").resolve()
+with open(styles, "r") as file:
+    css = file.read()
+
+theme = gr.themes.Monochrome(
+    primary_hue="emerald", secondary_hue="green", font=["sans-serif"]
+).set(
+    button_primary_background_fill="#76B900",
+    button_primary_background_fill_dark="#76B900",
+    button_primary_background_fill_hover="#569700",
+    button_primary_background_fill_hover_dark="#569700",
+    button_primary_text_color="#000000",
+    button_primary_text_color_dark="#ffffff",
+    button_secondary_background_fill="#76B900",
+    button_secondary_background_fill_dark="#76B900",
+    button_secondary_background_fill_hover="#569700",
+    button_secondary_background_fill_hover_dark="#569700",
+    button_secondary_text_color="#000000",
+    button_secondary_text_color_dark="#ffffff",
+    slider_color="#76B900",
+    color_accent="#76B900",
+    color_accent_soft="#76B900",
+    body_text_color="#000000",
+    body_text_color_dark="#ffffff",
+    color_accent_soft_dark="#76B900",
+    border_color_accent="#ededed",
+    border_color_accent_dark="#3d3c3d",
+    block_title_text_color="#000000",
+    block_title_text_color_dark="#ffffff",
+)
@@ -0,0 +1,16 @@
+.header {
+    padding: 60px;
+    text-align: center;
+    color: #76b900;
+    font-size: 30px;
+}
+
+#chatbot {
+    flex-grow: 2;
+    overflow: auto;
+}
+
+footer {
+    visibility: hidden;
+}
@@ -0,0 +1,18 @@
+import os
+from glean_example.src.glean_utils.utils import (
+    glean_search,
+    documents_from_glean_response,
+)
+
+api_key = os.getenv("GLEAN_API_KEY")
+# Read the base URL from the environment, as agent.py does, instead of hardcoding it
+base_url = os.getenv("GLEAN_API_BASE_URL")
+
+response = glean_search(
+    query="us holidays",
+    api_key=api_key,
+    base_url=base_url,
+)
+
+documents = documents_from_glean_response(response)
+
+print(documents)