1 | 1 | # GraphRAG4OpenWebUI |
2 | | -GraphRAG4OpenWebUI integrates Microsoft's GraphRAG technology into Open WebUI, providing a versatile information retrieval API. It combines local, global, and web searches for advanced Q&A systems and search engines. This tool simplifies graph-based retrieval integration in open web environments. |
| 2 | + |
| 3 | +GraphRAG4OpenWebUI 是一个专为 Open WebUI 设计的 API 接口,旨在集成微软研究院的 GraphRAG(Graph-based Retrieval-Augmented Generation)技术。该项目提供了一个强大的信息检索系统,支持多种搜索模型,特别适合在开放式 Web 用户界面中使用。 |
| 4 | + |
| 5 | +## 项目概述 |
| 6 | + |
| 7 | +本项目的主要目标是为 Open WebUI 提供一个便捷的接口,以利用 GraphRAG 的强大功能。它集成了三种主要的检索方法,并提供了一个综合搜索选项,使用户能够获得全面而精确的搜索结果。 |
| 8 | + |
| 9 | +### 主要检索功能 |
| 10 | + |
| 11 | +1. **本地搜索(Local Search)** |
| 12 | + - 利用 GraphRAG 技术在本地知识库中进行高效检索 |
| 13 | + - 适用于快速访问预先定义的结构化信息 |
| 14 | + - 利用图结构提高检索的准确性和相关性 |
| 15 | + |
| 16 | +2. **全局搜索(Global Search)** |
| 17 | + - 在更广泛的范围内搜索信息,超越本地知识库的限制 |
| 18 | + - 适用于需要更全面信息的查询 |
| 19 | + - 利用 GraphRAG 的全局上下文理解能力,提供更丰富的搜索结果 |
| 20 | + |
| 21 | +3. **Tavily 搜索** |
| 22 | + - 集成外部 Tavily 搜索 API |
| 23 | + - 提供额外的互联网搜索能力,扩展信息源 |
| 24 | + - 适用于需要最新或广泛网络信息的查询 |
| 25 | + |
| 26 | +4. **全模型搜索(Full Model Search)** |
| 27 | + - 综合上述三种搜索方法 |
| 28 | + - 提供最全面的搜索结果,满足复杂的信息需求 |
| 29 | + - 自动整合和排序来自不同来源的信息 |
| 30 | + |
| 31 | +## 安装 |
| 32 | + |
| 33 | +确保你的系统中已安装 Python 3.8 或更高版本。然后,按照以下步骤安装: |
| 34 | + |
| 35 | +1. 克隆仓库: |
| 36 | + |
| 37 | + ``` |
| 38 | + git clone https://github.com/your-username/GraphRAG4OpenWebUI.git |
| 39 | + cd GraphRAG4OpenWebUI |
| 40 | + ``` |
| 41 | + |
| 42 | +2. 创建并激活虚拟环境: |
| 43 | + |
| 44 | + ``` |
| 45 | + python -m venv venv |
| 46 | + source venv/bin/activate # 在 Windows 上使用 venv\Scripts\activate |
| 47 | + ``` |
| 48 | + |
| 49 | +3. 安装依赖: |
| 50 | + |
| 51 | + ``` |
| 52 | + pip install fastapi uvicorn pandas tiktoken graphrag tavily-python pydantic python-dotenv aiohttp numpy scikit-learn matplotlib seaborn nltk spacy transformers torch torchvision torchaudio
| 53 | + ``` |
| 54 | + |
| 55 | + 注意:`graphrag` 包可能需要从特定的源安装。如果上述命令无法安装 `graphrag`,请参考微软研究院的具体说明或联系维护者获取正确的安装方法。 |
| 56 | + |
| 57 | +## 配置 |
| 58 | + |
| 59 | +在运行 API 之前,需要设置以下环境变量。你可以通过创建 `.env` 文件或直接在终端中导出这些变量: |
| 60 | + |
| 61 | +```bash |
| 62 | +export GRAPHRAG_API_KEY="your_graphrag_api_key" |
| 63 | +export TAVILY_API_KEY="your_tavily_api_key" |
| 64 | +export GRAPHRAG_LLM_MODEL="gpt-3.5-turbo" |
| 65 | +export GRAPHRAG_EMBEDDING_MODEL="text-embedding-3-small" |
| 66 | +export INPUT_DIR="/path/to/your/input/directory" |
| 67 | +``` |
| 68 | + |
| 69 | +确保将上述命令中的占位符替换为实际的 API 密钥和路径。 |
| 70 | + |
| 71 | +## 使用方法 |
| 72 | + |
| 73 | +1. 启动服务器: |
| 74 | + |
| 75 | + ``` |
| 76 | + python main.py |
| 77 | + ``` |
| 78 | + |
| 79 | + 服务器将在 `http://localhost:8012` 上运行。 |
| 80 | + |
| 81 | +2. API 端点: |
| 82 | + |
| 83 | + - `/v1/chat/completions`: POST 请求,用于执行搜索 |
| 84 | + - `/v1/models`: GET 请求,获取可用模型列表 |
| 85 | + |
| 86 | +3. 在 Open WebUI 中集成: |
| 87 | + |
| 88 | + 在 Open WebUI 的配置中,将 API 端点设置为 `http://localhost:8012/v1/chat/completions`。这将允许 Open WebUI 使用 GraphRAG4OpenWebUI 的搜索功能。 |
| 89 | + |
| 90 | +4. 发送搜索请求示例: |
| 91 | + |
| 92 | + ```python |
| 93 | + import requests |
| 94 | + import json |
| 95 | + |
| 96 | + url = "http://localhost:8012/v1/chat/completions" |
| 97 | + headers = {"Content-Type": "application/json"} |
| 98 | + data = { |
| 99 | + "model": "full-model:latest", |
| 100 | + "messages": [{"role": "user", "content": "你的搜索查询"}], |
| 101 | + "temperature": 0.7 |
| 102 | + } |
| 103 | + |
| 104 | + response = requests.post(url, headers=headers, data=json.dumps(data)) |
| 105 | + print(response.json()) |
| 106 | + ``` |
| 107 | + |
| 108 | +## 可用模型 |
| 109 | + |
| 110 | +- `graphrag-local-search:latest`: 本地搜索 |
| 111 | +- `graphrag-global-search:latest`: 全局搜索 |
| 112 | +- `tavily-search:latest`: Tavily 搜索 |
| 113 | +- `full-model:latest`: 综合搜索(包含上述所有搜索方法) |
| 114 | + |
| 115 | +## 注意事项 |
| 116 | + |
| 117 | +- 确保在 `INPUT_DIR` 目录中有正确的输入文件(如 Parquet 文件)。 |
| 118 | +- API 使用异步编程,确保你的环境支持异步操作。 |
| 119 | +- 对于大规模部署,考虑使用生产级的 ASGI 服务器。 |
| 120 | +- 本项目专为 Open WebUI 设计,可以轻松集成到各种基于 Web 的应用中。 |
| 121 | + |
| 122 | +## 贡献 |
| 123 | + |
| 124 | +欢迎提交 Pull Requests 来改进这个项目。对于重大变更,请先开 issue 讨论你想要改变的内容。 |
| 125 | + |
| 126 | +## 许可证 |
| 127 | + |
| 128 | +[MIT License](LICENSE) |
| 129 | + |
| 130 | +--- |
| 131 | + |
| 132 | +# GraphRAG4OpenWebUI |
| 133 | + |
| 134 | +GraphRAG4OpenWebUI is an API layer for Open WebUI that integrates Microsoft Research's GraphRAG (Graph-based Retrieval-Augmented Generation). It exposes several search models behind a single information retrieval API, so they can be selected and queried directly from Open WebUI.
| 135 | + |
| 136 | +## Project Overview |
| 137 | + |
| 138 | +The project gives Open WebUI a convenient interface to GraphRAG. It offers three retrieval methods plus a combined option that queries all three, so users can trade speed for coverage depending on the question.
| 139 | + |
| 140 | +### Key Retrieval Features |
| 141 | + |
| 142 | +1. **Local Search** |
| 143 | + - Utilizes GraphRAG technology for efficient retrieval in local knowledge bases |
| 144 | + - Suitable for quick access to pre-defined structured information |
| 145 | + - Leverages graph structures to improve retrieval accuracy and relevance |
| 146 | + |
| 147 | +2. **Global Search** |
| 148 | + - Searches for information in a broader scope, beyond local knowledge bases |
| 149 | + - Suitable for queries requiring more comprehensive information |
| 150 | + - Utilizes GraphRAG's global context understanding capabilities to provide richer search results |
| 151 | + |
| 152 | +3. **Tavily Search** |
| 153 | + - Integrates external Tavily search API |
| 154 | + - Provides additional internet search capabilities, expanding information sources |
| 155 | + - Suitable for queries requiring the latest or extensive web information |
| 156 | + |
| 157 | +4. **Full Model Search** |
| 158 | + - Combines all three search methods above |
| 159 | + - Provides the most comprehensive search results, meeting complex information needs |
| 160 | + - Automatically integrates and ranks information from different sources |
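The combined mode described above can be pictured as a concurrent fan-out over the three backends. The following is a minimal sketch under stated assumptions, not the project's actual implementation: `local_search`, `global_search`, `tavily_search`, and `full_model_search` are hypothetical stand-ins, and the ranking step is left out because the README does not describe it.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# Hypothetical stand-ins for the three backends. The project's real
# functions, signatures, and return types may differ.
async def local_search(query: str) -> str:
    return f"[local] {query}"

async def global_search(query: str) -> str:
    return f"[global] {query}"

async def tavily_search(query: str) -> str:
    return f"[tavily] {query}"

SEARCHERS: Dict[str, Callable[[str], Awaitable[str]]] = {
    "local": local_search,
    "global": global_search,
    "tavily": tavily_search,
}

async def full_model_search(query: str) -> Dict[str, str]:
    """Run every backend concurrently and collect results by backend name."""
    names = list(SEARCHERS)
    results = await asyncio.gather(
        *(SEARCHERS[name](query) for name in names),
        return_exceptions=True,  # one failing backend should not sink the rest
    )
    merged: Dict[str, str] = {}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            merged[name] = f"error: {result}"
        else:
            merged[name] = result
    return merged

if __name__ == "__main__":
    print(asyncio.run(full_model_search("example query")))
```

Using `return_exceptions=True` means a failure in one source (for example, a missing Tavily key) still lets the other two results through.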
| 161 | + |
| 162 | +## Installation |
| 163 | + |
| 164 | +Make sure Python 3.8 or higher is installed on your system; note that recent `graphrag` releases on PyPI require Python 3.10 or newer. Then follow these steps:
| 165 | + |
| 166 | +1. Clone the repository: |
| 167 | + |
| 168 | + ``` |
| 169 | + git clone https://github.com/your-username/GraphRAG4OpenWebUI.git |
| 170 | + cd GraphRAG4OpenWebUI |
| 171 | + ``` |
| 172 | + |
| 173 | +2. Create and activate a virtual environment: |
| 174 | + |
| 175 | + ``` |
| 176 | + python -m venv venv |
| 177 | + source venv/bin/activate # On Windows use venv\Scripts\activate |
| 178 | + ``` |
| 179 | + |
| 180 | +3. Install dependencies: |
| 181 | + |
| 182 | + ``` |
| 183 | + pip install fastapi uvicorn pandas tiktoken graphrag tavily-python pydantic python-dotenv aiohttp numpy scikit-learn matplotlib seaborn nltk spacy transformers torch torchvision torchaudio
| 184 | + ``` |
| 185 | + |
| 186 | + Note: The `graphrag` package might need to be installed from a specific source. If the above command fails to install `graphrag`, please refer to Microsoft Research's specific instructions or contact the maintainer for the correct installation method. |
| 187 | + |
| 188 | +## Configuration |
| 189 | + |
| 190 | +Before running the API, you need to set the following environment variables. You can do this by creating a `.env` file or exporting them directly in your terminal: |
| 191 | + |
| 192 | +```bash |
| 193 | +export GRAPHRAG_API_KEY="your_graphrag_api_key" |
| 194 | +export TAVILY_API_KEY="your_tavily_api_key" |
| 195 | +export GRAPHRAG_LLM_MODEL="gpt-3.5-turbo" |
| 196 | +export GRAPHRAG_EMBEDDING_MODEL="text-embedding-3-small" |
| 197 | +export INPUT_DIR="/path/to/your/input/directory" |
| 198 | +``` |
| 199 | + |
| 200 | +Make sure to replace the placeholders in the above commands with your actual API keys and paths. |
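A missing variable is easier to diagnose at startup than in the middle of a request. The sketch below reads the variables listed above and fails early if one is absent. It is illustrative only: the `Settings` class and `load_settings` function are hypothetical names, and treating the two model variables as optional (defaulting to the example values shown above) is an assumption rather than documented behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumption: these three have no sensible default and must be provided.
REQUIRED = ("GRAPHRAG_API_KEY", "TAVILY_API_KEY", "INPUT_DIR")

# Assumption: fall back to the example values from the README.
DEFAULTS = {
    "GRAPHRAG_LLM_MODEL": "gpt-3.5-turbo",
    "GRAPHRAG_EMBEDDING_MODEL": "text-embedding-3-small",
}

@dataclass(frozen=True)
class Settings:
    graphrag_api_key: str
    tavily_api_key: str
    llm_model: str
    embedding_model: str
    input_dir: str

def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, raising if a required variable is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        graphrag_api_key=env["GRAPHRAG_API_KEY"],
        tavily_api_key=env["TAVILY_API_KEY"],
        llm_model=env.get("GRAPHRAG_LLM_MODEL") or DEFAULTS["GRAPHRAG_LLM_MODEL"],
        embedding_model=env.get("GRAPHRAG_EMBEDDING_MODEL")
        or DEFAULTS["GRAPHRAG_EMBEDDING_MODEL"],
        input_dir=env["INPUT_DIR"],
    )
```

If you keep the values in a `.env` file, calling `load_dotenv()` from `python-dotenv` before `load_settings()` copies them into `os.environ` first.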
| 201 | + |
| 202 | +## Usage |
| 203 | + |
| 204 | +1. Start the server: |
| 205 | + |
| 206 | + ``` |
| 207 | + python main.py |
| 208 | + ``` |
| 209 | + |
| 210 | + The server will run on `http://localhost:8012`. |
| 211 | + |
| 212 | +2. API Endpoints: |
| 213 | + |
| 214 | + - `/v1/chat/completions`: POST request for performing searches |
| 215 | + - `/v1/models`: GET request to retrieve the list of available models |
| 216 | + |
| 217 | +3. Integration with Open WebUI: |
| 218 | + |
| 219 | + In the Open WebUI connection settings, add an OpenAI-compatible API connection with the base URL `http://localhost:8012/v1`. Open WebUI appends `/chat/completions` and `/models` to this base URL itself, so the models served by GraphRAG4OpenWebUI then appear in its model selector.
| 220 | + |
| 221 | +4. Example search request: |
| 222 | + |
| 223 | + ```python |
| 224 | + import requests
| 225 | +
| 226 | + url = "http://localhost:8012/v1/chat/completions"
| 227 | + payload = {
| 228 | +     "model": "full-model:latest",  # any ID listed under Available Models
| 229 | +     "messages": [{"role": "user", "content": "Your search query"}],
| 230 | +     "temperature": 0.7,
| 231 | + }
| 232 | +
| 233 | + # json= serializes the payload and sets the Content-Type header for us
| 234 | + response = requests.post(url, json=payload, timeout=120)
| 235 | + response.raise_for_status()  # fail loudly on HTTP errors
| 236 | + print(response.json())
| 237 | + ``` |
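Because the two endpoints mirror the OpenAI paths, a client will usually want small helpers for building requests and reading responses. The sketch below assumes the server returns OpenAI-style JSON, that is `choices[0].message.content` for chat completions and a `data` list of objects with an `id` for `/v1/models`. The README does not state the response schema, so treat that shape, and the helper names, as assumptions.

```python
from typing import Any, Dict, List

def build_chat_request(model: str, query: str, temperature: float = 0.7) -> Dict[str, Any]:
    """Build the JSON body for POST /v1/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": query}],
        "temperature": temperature,
    }

def extract_answer(body: Dict[str, Any]) -> str:
    """Return the assistant text, assuming an OpenAI-style response body."""
    choices = body.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]

def extract_model_ids(body: Dict[str, Any]) -> List[str]:
    """Return model IDs from GET /v1/models, assuming an OpenAI-style list."""
    return [item["id"] for item in body.get("data", [])]
```

With `requests`, usage would look like `extract_answer(requests.post(url, json=build_chat_request("full-model:latest", "Your search query")).json())`.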
| 238 | + |
| 239 | +## Available Models |
| 240 | + |
| 241 | +- `graphrag-local-search:latest`: Local search |
| 242 | +- `graphrag-global-search:latest`: Global search |
| 243 | +- `tavily-search:latest`: Tavily search |
| 244 | +- `full-model:latest`: Comprehensive search (includes all search methods above) |
| 245 | + |
| 246 | +## Notes |
| 247 | + |
| 248 | +- Ensure that you have the correct input files (such as Parquet files) in the `INPUT_DIR` directory. |
| 249 | +- The API is built on asynchronous code, so make sure your environment supports async operations.
| 250 | +- For large-scale deployment, consider using a production-grade ASGI server. |
| 251 | +- This project is designed for Open WebUI, but because it is exposed as an HTTP API it can also be integrated into other web-based applications.
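The first note above is easy to verify before starting the server. The sketch below lists the Parquet files directly inside `INPUT_DIR`. The function name `find_parquet_files` is hypothetical, and looking only at the top level of the directory (no recursion) is an assumption, since the expected layout is not spelled out here.

```python
from pathlib import Path
from typing import List

def find_parquet_files(input_dir: str) -> List[Path]:
    """Return the *.parquet files directly inside input_dir, sorted by name."""
    root = Path(input_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"INPUT_DIR is not a directory: {root}")
    return sorted(root.glob("*.parquet"))
```

An empty result is a good reason to stop and check the indexing output before launching the API.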
| 252 | + |
| 253 | +## Contributing |
| 254 | + |
| 255 | +Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change. |
| 256 | + |
| 257 | +## License |
| 258 | + |
| 259 | +[MIT License](LICENSE) |