Feature:Add support for Pooling Model Embedding and provide an OpenAI-compatible API. #4344
+860
−45
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary
本次 PR 新增了对 Pooling Embedding 模型 的支持,并提供 与 OpenAI
/v1/embeddings
完全兼容的接口实现。该功能旨在满足用户对高性能语义嵌入(Sentence Embedding)的需求,为搜索、聚类、推荐等下游任务提供更优质的嵌入表示。
主要更新内容
1. 新增:Pooling Model Embedding 支持
pooling
模型 的底层接口支持。该功能允许服务将输入序列(如文本的词向量)通过聚合操作(如平均池化)转换为固定维度的语义嵌入向量。
2. 新增:OpenAI 兼容接口实现
New Feature: 实现了 与 OpenAI
/v1/embeddings
标准完全兼容的 API 接口。兼容请求格式: 接口支持两种主流的请求类型,确保与现有 OpenAI 客户端无缝对接:
EmbeddingCompletionRequest
— 接收input
字符串或字符串列表。EmbeddingChatRequest
— 接收messages
列表,用于聊天类上下文嵌入。测试方式 (cURL 示例)
A. EmbeddingCompletionRequest 示例(标准文本输入)
B. EmbeddingChatRequest 示例(消息序列输入)
响应参数说明
以下为标准的接口响应格式,兼容 OpenAI 的
/v1/embeddings
输出规范,同时支持多样化的 embedding 数据结构:字段说明:
id
:请求唯一标识(带前缀pool-
)object
:响应对象类型,固定为"list"
created
:请求创建时间(Unix 时间戳)model
:使用的嵌入模型名称data
:嵌入结果数组,包含一个或多个 embedding 对象index
:输入序列对应的索引embedding
:嵌入向量(支持一维或二维结构)usage
:请求的 Token 使用统计