How to deploy a model trained with regression #3786


Closed
uRENu opened this issue Apr 7, 2025 · 7 comments
Labels
bug Something isn't working

Comments

@uRENu

uRENu commented Apr 7, 2025

Deployment following the example below fails with the error requests.exceptions.HTTPError: The model does not support Chat Completions API:
https://github.com/modelscope/ms-swift/blob/07bd7b463f1588362388c389dc6bdb4570bea04e/examples/train/seq_cls/regression/deploy.sh

@Jintao-Huang
Collaborator

Please pull the main branch; this issue has already been fixed.

@uRENu
Author

uRENu commented Apr 8, 2025

> Please pull the main branch; this issue has already been fixed.

I pulled the latest swift main branch and still hit the same error, requests.exceptions.HTTPError: The model does not support Chat Completions API. The calling code is below:

from swift.llm import InferClient, InferRequest

openai_api_key = ""
openai_api_base = "http://XXXXX:8080/v1"
engine = InferClient(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

models = engine.models
print(f'models={models}')

content = '...'  # the input text to send to the model
messages = [{
    'role': 'user',
    'content': content,
}]
resp_list = engine.infer([InferRequest(messages=messages)])
print(f'response: {resp_list}')
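Not part of the original comment: when a client call like the one above fails, it can help to look at the request the client actually sends. A minimal sketch, assuming the server speaks the standard OpenAI-compatible protocol (the model name here is a placeholder, not from the thread):

```python
import json

def build_chat_payload(model: str, content: str) -> dict:
    # Standard OpenAI-compatible Chat Completions request body; an
    # InferClient call like the one above posts an equivalent payload
    # to the server's /v1/chat/completions route.
    return {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }

payload = build_chat_payload("my-regression-model", "text to score")
print(json.dumps(payload, indent=2))

# Posting this payload by hand (e.g. with requests.post to
# http://XXXXX:8080/v1/chat/completions) shows the server's raw
# error body, which is useful when the client only surfaces
# "The model does not support Chat Completions API".
```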

@Jintao-Huang
Collaborator

I don't quite follow; is there an error message?

@uRENu
Author

uRENu commented Apr 9, 2025

> I don't quite follow; is there an error message?

engine.infer raises requests.exceptions.HTTPError: The model does not support Chat Completions API

@Jintao-Huang
Collaborator

Please share a screenshot of the error.

@uRENu
Author

uRENu commented Apr 10, 2025

> Please share a screenshot of the error.

[error screenshot attached]

@Jintao-Huang added the bug (Something isn't working) label Apr 10, 2025
@uRENu
Author

uRENu commented Apr 27, 2025

> Please share a screenshot of the error.

[error screenshot attached]

I looked into vLLM; the model can be deployed and called as follows:

1. Deploy:

python -m vllm.entrypoints.openai.api_server --model /data/mymodel --port 8080

2. Invoke:

import requests

def post_http_request(prompt: dict, api_url: str) -> requests.Response:
    headers = {"User-Agent": "Test Client"}
    response = requests.post(api_url, headers=headers, json=prompt)
    return response

def infer_instance(content):
    api_url = "http://myhost:8080/pooling"
    # Input like Completions API; the model name matches the path
    # the server was started with.
    prompt = {"model": "/data/mymodel", "input": content}
    pooling_response = post_http_request(prompt=prompt, api_url=api_url)
    print(pooling_response.json()['data'][0]['data'])
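Not in the original comment, but a small helper may clarify the response shape the snippet above indexes into: the /pooling endpoint returns JSON of the form {"data": [{"data": [...]}]}, where the inner list holds the pooled head output (a single float for one-dimensional regression). A sketch against a mocked response:

```python
def extract_pooled_output(response_json: dict) -> list:
    # Navigate {"data": [{"data": [...]}]} down to the pooled values,
    # matching the indexing used in infer_instance above.
    return response_json["data"][0]["data"]

# Mocked response with an illustrative value, not real model output.
mock = {"data": [{"data": [0.87]}]}
print(extract_pooled_output(mock))
```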

@uRENu closed this as completed Apr 27, 2025