
Commit abc8101

Update README.md
1 parent ec334d6 commit abc8101


optillm/plugins/proxy/README.md

Lines changed: 79 additions & 11 deletions
@@ -11,9 +11,19 @@ A sophisticated load balancing and failover plugin for OptiLLM that distributes
 - 📊 **Performance Tracking**: Monitor latency and errors per provider
 - 🗺️ **Model Mapping**: Map model names to provider-specific deployments
 
+## Installation
+
+```bash
+# Install OptiLLM via pip
+pip install optillm
+
+# Verify installation
+optillm --version
+```
+
 ## Quick Start
 
-### 1. Basic Setup
+### 1. Create Configuration
 
 Create `~/.optillm/proxy_config.yaml`:
 
@@ -23,27 +33,69 @@ providers:
     base_url: https://api.openai.com/v1
     api_key: ${OPENAI_API_KEY}
     weight: 2
+    model_map:
+      gpt-4: gpt-4-turbo-preview  # Optional: map model names
 
   - name: backup
     base_url: https://api.openai.com/v1
     api_key: ${OPENAI_API_KEY_BACKUP}
     weight: 1
 
 routing:
-  strategy: weighted
+  strategy: weighted  # Options: weighted, round_robin, failover
+```
+
+### 2. Start OptiLLM Server
+
+```bash
+# Option A: Use proxy as default for ALL requests (recommended)
+optillm --approach proxy
+
+# Option B: Start server normally (requires model prefix or extra_body)
+optillm
+
+# With custom port
+optillm --approach proxy --port 8000
 ```
 
-### 2. Usage Examples
+### 3. Usage Examples
+
+#### When using `--approach proxy` (Recommended)
+```bash
+# No need for "proxy-" prefix! The proxy handles all requests automatically
+curl -X POST http://localhost:8000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-4",
+    "messages": [{"role": "user", "content": "Hello"}]
+  }'
+
+# The proxy will:
+# 1. Route to one of your configured providers
+# 2. Apply model mapping if configured
+# 3. Handle failover automatically
+```
 
-#### Standalone Proxy
+#### Without `--approach proxy` flag
 ```bash
-# Route requests through proxy
+# Method 1: Use model prefix
 curl -X POST http://localhost:8000/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{
     "model": "proxy-gpt-4",
     "messages": [{"role": "user", "content": "Hello"}]
   }'
+
+# Method 2: Use extra_body
+curl -X POST http://localhost:8000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-4",
+    "messages": [{"role": "user", "content": "Hello"}],
+    "extra_body": {
+      "optillm_approach": "proxy"
+    }
+  }'
 ```
 
 #### Proxy with Approach/Plugin
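For intuition about the `weighted` strategy configured above: with weights 2 and 1, roughly two thirds of requests should land on the `primary` provider. The sketch below shows weighted selection in a few lines of Python; it is an illustration of the idea, not the plugin's actual implementation.

```python
import random

# Weights taken from the proxy_config.yaml example above.
PROVIDERS = [("primary", 2), ("backup", 1)]

def pick_provider() -> str:
    """Weighted random choice: 'primary' is picked ~2/3 of the time."""
    names = [name for name, _ in PROVIDERS]
    weights = [weight for _, weight in PROVIDERS]
    return random.choices(names, weights=weights, k=1)[0]

# Rough check of the distribution over many picks:
picks = [pick_provider() for _ in range(9000)]
print(round(picks.count("primary") / len(picks), 2))  # ~0.67
```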
@@ -151,20 +203,29 @@ providers:
 
 ### Model-Specific Routing
 
-Different providers may use different model names:
+When using `--approach proxy`, the proxy automatically maps model names to provider-specific deployments:
 
 ```yaml
 providers:
   - name: azure
     base_url: ${AZURE_ENDPOINT}
     api_key: ${AZURE_KEY}
     model_map:
-      # Request -> Provider mapping
+      # Request model -> Provider deployment name
       gpt-4: gpt-4-deployment-001
       gpt-4-turbo: gpt-4-turbo-latest
       gpt-3.5-turbo: gpt-35-turbo-deployment
+
+  - name: openai
+    base_url: https://api.openai.com/v1
+    api_key: ${OPENAI_API_KEY}
+    # No model_map needed - uses model names as-is
 ```
 
+With this configuration and `optillm --approach proxy`:
+- Request for "gpt-4" → Azure uses "gpt-4-deployment-001", OpenAI uses "gpt-4"
+- Request for "gpt-3.5-turbo" → Azure uses "gpt-35-turbo-deployment", OpenAI uses "gpt-3.5-turbo"
+
 ### Failover Configuration
 
 Set up primary and backup providers:
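Taken together, the two behaviors in this hunk (per-provider model mapping, then primary/backup failover) amount to: translate the requested model name for whichever provider is tried, and move down the list on error. A minimal sketch under those assumptions, with hypothetical helper names rather than the plugin's real internals:

```python
# Providers in priority order, mirroring the YAML example above.
PROVIDERS = [
    {"name": "azure", "model_map": {
        "gpt-4": "gpt-4-deployment-001",
        "gpt-4-turbo": "gpt-4-turbo-latest",
        "gpt-3.5-turbo": "gpt-35-turbo-deployment",
    }},
    {"name": "openai", "model_map": {}},  # no map: names pass through as-is
]

def resolve_model(provider: dict, requested: str) -> str:
    """Return this provider's deployment name for the requested model."""
    return provider["model_map"].get(requested, requested)

def complete_with_failover(requested_model: str, send):
    """Try providers in order, translating the model name for each one."""
    last_error = None
    for provider in PROVIDERS:
        try:
            return send(provider["name"], resolve_model(provider, requested_model))
        except Exception as err:
            last_error = err  # provider failed; fall through to the next one
    raise RuntimeError("all providers failed") from last_error

# resolve_model(PROVIDERS[0], "gpt-4")  -> "gpt-4-deployment-001"
# resolve_model(PROVIDERS[1], "gpt-4")  -> "gpt-4"
```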
@@ -294,16 +355,22 @@ from openai import OpenAI
 
 client = OpenAI(
     base_url="http://localhost:8000/v1",
-    api_key="dummy"
+    api_key="dummy"  # Can be any string when using proxy
+)
+
+# If server started with --approach proxy:
+response = client.chat.completions.create(
+    model="gpt-4",  # No "proxy-" prefix needed!
+    messages=[{"role": "user", "content": "Hello"}]
 )
 
-# Proxy wrapping MOA approach
+# Or explicitly use proxy with another approach:
 response = client.chat.completions.create(
     model="gpt-4",
     messages=[{"role": "user", "content": "Hello"}],
     extra_body={
         "optillm_approach": "proxy",
-        "proxy_wrap": "moa"
+        "proxy_wrap": "moa"  # Proxy will route MOA's requests
     }
 )
 ```
@@ -312,9 +379,10 @@ response = client.chat.completions.create(
 ```python
 from langchain.llms import OpenAI
 
+# If server started with --approach proxy:
 llm = OpenAI(
     openai_api_base="http://localhost:8000/v1",
-    model_name="proxy-gpt-4"
+    model_name="gpt-4"  # Proxy handles routing automatically
 )
 
 response = llm("What is the meaning of life?")
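Finally, a quick way to confirm the running server is reachable before wiring it into an application; this assumes the standard `/v1/models` listing is proxied through, and any chat completion call works as a probe if it is not:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

# Print whatever models the configured providers expose.
for model in client.models.list().data:
    print(model.id)
```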
