modelscope · Jintao-Huang · Apr 22, 2025 · Apr 17, 2025
diff --git a/MANIFEST.in b/MANIFEST.in
@@ -2,4 +2,4 @@ recursive-include swift/utils *.py
 recursive-include swift/llm/dataset/data *.*
 recursive-include swift/llm/ds_config *.json
 recursive-include requirements *.txt
-recursive-include swift/plugin/agent *.json
+recursive-include swift/plugin/loss_scale/config *.json
diff --git a/README.md b/README.md
@@ -126,7 +126,7 @@ Running Environment:
 | peft | >=0.11,<0.16 | ||
 | trl | >=0.13,<0.17 | 0.16 |RLHF|
 | deepspeed    | >=0.14       | 0.14.5 | Training                                  |
-| vllm         | >=0.5.1      | 0.7.3/0.8.3       | Inference/Deployment/Evaluation           |
+| vllm         | >=0.5.1      | 0.7.3/0.8.4       | Inference/Deployment/Evaluation           |
 | lmdeploy     | >=0.5        | 0.7.2.post1       | Inference/Deployment/Evaluation           |
 | evalscope | >=0.11       |  | Evaluation |
 

diff --git a/README_CN.md b/README_CN.md
@@ -121,7 +121,7 @@ pip install -e .
 | peft | >=0.11,<0.16 | ||
 | trl | >=0.13,<0.17 | 0.16 |RLHF|
 | deepspeed | >=0.14       | 0.14.5 |训练|
-| vllm | >=0.5.1      | 0.7.3/0.8.3 |推理/部署/评测|
+| vllm | >=0.5.1      | 0.7.3/0.8.4 |推理/部署/评测|
 | lmdeploy | >=0.5        | 0.7.2.post1 |推理/部署/评测|
 | evalscope | >=0.11       | |评测|
 

diff --git a/docs/resources/image-20240201122323540.png b/docs/resources/image-20240201122323540.png
diff --git a/docs/resources/image-20240201122441874.png b/docs/resources/image-20240201122441874.png
diff --git a/docs/resources/image-20240201122625473.png b/docs/resources/image-20240201122625473.png
diff --git a/docs/resources/image-20240201122725477.png b/docs/resources/image-20240201122725477.png
diff --git a/docs/resources/image-20240201131811038.png b/docs/resources/image-20240201131811038.png
diff --git a/docs/resources/image-20240201132124061.png b/docs/resources/image-20240201132124061.png
diff --git a/docs/resources/image-20240201132139698.png b/docs/resources/image-20240201132139698.png
diff --git a/docs/resources/image-20240201132308260.png b/docs/resources/image-20240201132308260.png
diff --git a/docs/resources/image-20240201132421298.png b/docs/resources/image-20240201132421298.png
diff --git a/docs/resources/image-20240201132454465.png b/docs/resources/image-20240201132454465.png
diff --git a/docs/source/Customization/插件化.md b/docs/source/Customization/插件化.md
@@ -41,7 +41,7 @@ def loss_scale_func(outputs, labels, loss_scale=None, num_items_in_batch=None) -
 
 ## 定制化loss_scale
 
-example在[这里](https://github.com/modelscope/swift/blob/main/swift/plugin/loss_scale.py).
+example在[这里](https://github.com/modelscope/swift/blob/main/swift/plugin/loss_scale/loss_scale.py).
 
 loss_scale机制在SWIFT中是非常重要的机制之一。在pt和sft任务中，可训练token的loss是均匀的，即每个token平等的进行bp。但在某些情况下，某些token的权重比较大，需要被额外关注，
 在这种情况下就需要更高的权重。loss_scale可以让开发者自由地定义自己的token权重。
@@ -84,6 +84,7 @@ def get_metric(metric: str):
 ## 定制化optimizer
 
 example在[这里](https://github.com/modelscope/swift/blob/main/swift/plugin/optimizer.py).
+- 对模型不同部分采用不同的学习率，例如：ViT和LLM分别使用不同的学习率，参考[这里](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/lora_llm_full_vit/custom_plugin.py)。
 
 用户可以在这里增加自己的optimizer和lr_scheduler实现：
 ```python
@@ -103,42 +104,15 @@ optimizers_map = {
 ```
 就可以实际调用了。
 
-## 定制化tools
+## 定制化agent template
 
-example在[这里](https://github.com/modelscope/swift/blob/main/swift/plugin/tools.py).
-
-可以在这里定义Agent训练的tools格式。tools格式是指训练和推理时如何将工具枚举在system字段中，例如glm4就有其独特的tools格式：
-```python
-def format_glm4(tool_names, tool_descs):
-    GLM4_PROMPT = """你是一个名为 ChatGLM 的人工智能助手。你是基于智谱AI训练的语言模型 GLM-4 模型开发的，你的任务是针对用户的问题和要求提供适当的答复和支持。
-
-# 可用工具
-
-{tool_list}"""
-    tool_descs = [json.dumps(t) if not isinstance(t, str) else t for t in tool_descs]
-    tool_list = ''
-    for name, tool in zip(tool_names, tool_descs):
-        tool_list += f'## {name}\n\n{tool}\n\n'
-    return GLM4_PROMPT.format(tool_list=tool_list)
-```
-system中的完整格式类似于这样：
-```text
-你是一个名为 ChatGLM 的人工智能助手。你是基于智谱AI训练的语言模型 GLM-4 模型开发的，你的任务是针对用户的问题和要求提供适当的答复和支持。
-
-# 可用工具
-
-## 查看天气
-
-...
-
-## 搜索网络
-
-...
-```
+example在[这里](https://github.com/modelscope/swift/blob/main/swift/plugin/agent_template).
 
 ## 定制化tuner
 
 example在[这里](https://github.com/modelscope/swift/blob/main/swift/plugin/tuner.py).
+- 多模态模型对ViT部分使用全参数训练，LLM部分使用LoRA训练，参考[这里](https://github.com/modelscope/ms-swift/tree/main/examples/train/multimodal/lora_llm_full_vit)。
+- Phi4-multimodal，直接对其已有LoRA进行训练而不额外附加LoRA，参考[这里](https://github.com/modelscope/ms-swift/blob/main/examples/train/plugins/tuner_phi4_mm.sh)。
 
 tuner定制也是swift中有特色的能力之一，开发者可以无视复杂的tuner初始化流程和代码整合成本，将新的tuner注册在这里：
 ```python

diff --git a/docs/source/Customization/自定义数据集.md b/docs/source/Customization/自定义数据集.md
@@ -163,9 +163,18 @@ query-response格式：
 ```
 
 ### Agent格式
-
-Agent格式请参考[Agent文档](../Instruction/智能体的支持.md)。
-
+这里分别提供了纯文本Agent和多模态Agent的示例数据样本：
+```jsonl
+{"tools": ["{\"type\": \"function\", \"function\": {\"name\": \"realtime_aqi\", \"description\": \"天气预报。获取实时空气质量。当前空气质量，PM2.5，PM10信息\", \"parameters\": {\"type\": \"object\", \"properties\": {\"city\": {\"type\": \"string\", \"description\": \"城市名，例如：上海\"}}, \"required\": [\"city\"]}}}"], "messages": [{"role": "user", "content": "北京和上海今天的天气情况"}, {"role": "tool_call", "content": "{\"name\": \"realtime_aqi\", \"arguments\": {\"city\": \"北京\"}}"}, {"role": "tool_call", "content": "{\"name\": \"realtime_aqi\", \"arguments\": {\"city\": \"上海\"}}"}, {"role": "tool_response", "content": "{\"city\": \"北京\", \"aqi\": \"10\", \"unit\": \"celsius\"}"}, {"role": "tool_response", "content": "{\"city\": \"上海\", \"aqi\": \"72\", \"unit\": \"fahrenheit\"}"}, {"role": "assistant", "content": "根据天气预报工具，北京今天的空气质量指数为10，属于良好水平；上海今天的空气质量指数为72，属于轻度污染水平。"}]}
+{"tools": ["{\"type\": \"function\", \"function\": {\"name\": \"click\", \"description\": \"点击屏幕中的某个位置\", \"parameters\": {\"type\": \"object\", \"properties\": {\"x\": {\"type\": \"integer\", \"description\": \"横坐标，表示屏幕上的水平位置\"}, \"y\": {\"type\": \"integer\", \"description\": \"纵坐标，表示屏幕上的垂直位置\"}}, \"required\": [\"x\", \"y\"]}}}"], "messages": [{"role": "user", "content": "<image>现在几点了？"}, {"role": "assistant", "content": "<think>\n我可以通过打开日历App来获取当前时间。\n</think>\n"}, {"role": "tool_call", "content": "{\"name\": \"click\", \"arguments\": {\"x\": 105, \"y\": 132}}"}, {"role": "tool_response", "content": "{\"images\": \"<image>\", \"status\": \"success\"}"}, {"role": "assistant", "content": "成功打开日历App，现在的时间为中午11点"}], "images": ["desktop.png", "calendar.png"]}
+```
+- agent_template为"react_en", "hermes"等情况下，该格式适配所有模型Agent训练，可以轻松在不同模型间切换。
+- 其中tools是一个`List[str]`，其中每一个tool需要是json字符串，messages中role为'tool_call'和'tool_response/tool'的content部分都需要是json字符串。
+- tools字段将在训练/推理时和`{"role": "system", ...}`部分组合，根据agent_template组成完整的system部分。
+- `{"role": "tool_call", ...}`部分将根据agent_template自动转成对应格式的`{"role": "assistant", ...}`，多条连续的`{"role": "assistant", ...}`将拼接在一起组成完整的assistant_content。
+- `{"role": "tool_response", ...}`也可以写成`{"role": "tool", ...}`，这两种写法是等价的。该部分也将根据`agent_template`自动转换格式。该部分在训练时将不进行损失的计算，角色类似于`{"role": "user", ...}`。
+- 该格式支持并行调用工具，例子参考第一条数据样本。多模态Agent数据样本中`<image>`标签数量应与"images"长度相同，其标签位置代表图像特征的插入位置。当然也支持其他模态，例如audios, videos。
+- 更多请参考[Agent文档](../Instruction/Agent支持.md)。
 
 ## dataset_info.json
 

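Note on the Agent format added to 自定义数据集.md above (this note is not part of the patch). The new bullets state three checkable rules: `tools` is a `List[str]` of JSON strings, the content of `tool_call` and `tool_response`/`tool` messages must be a JSON string, and the number of `<image>` tags should equal the length of `images`. A minimal sketch of a pre-training sanity check follows. `validate_agent_sample` and `JSON_CONTENT_ROLES` are hypothetical names chosen for illustration and are not ms-swift APIs.

```python
import json
from typing import Any, Dict, List

# Roles whose content must itself be a JSON string, per the bullets above.
JSON_CONTENT_ROLES = {"tool_call", "tool_response", "tool"}


def _is_json(text: Any) -> bool:
    """True if `text` is a str that parses as JSON."""
    if not isinstance(text, str):
        return False
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def validate_agent_sample(sample: Dict[str, Any]) -> List[str]:
    """Hypothetical checker: returns a list of problems (empty if none found)."""
    problems: List[str] = []
    for i, tool in enumerate(sample.get("tools", [])):
        if not _is_json(tool):
            problems.append(f"tools[{i}] is not a JSON string")

    image_tags = 0
    for i, message in enumerate(sample.get("messages", [])):
        content = message.get("content")
        if message.get("role") in JSON_CONTENT_ROLES and not _is_json(content):
            problems.append(f"messages[{i}].content is not a JSON string")
        if isinstance(content, str):
            image_tags += content.count("<image>")

    num_images = len(sample.get("images", []))
    if image_tags != num_images:
        problems.append(f"{image_tags} <image> tags but {num_images} images")
    return problems


# Tools are often authored as dicts; serialize each one to a JSON string.
click_tool = {
    "type": "function",
    "function": {
        "name": "click",
        "description": "Click a position on the screen",
        "parameters": {
            "type": "object",
            "properties": {"x": {"type": "integer"}, "y": {"type": "integer"}},
            "required": ["x", "y"],
        },
    },
}
sample = {
    "tools": [json.dumps(click_tool, ensure_ascii=False)],
    "messages": [
        {"role": "user", "content": "<image>What time is it?"},
        {"role": "tool_call",
         "content": json.dumps({"name": "click", "arguments": {"x": 105, "y": 132}})},
        {"role": "tool_response",
         "content": json.dumps({"images": "<image>", "status": "success"})},
        {"role": "assistant", "content": "The calendar app shows 11 a.m."},
    ],
    "images": ["desktop.png", "calendar.png"],
}
problems = validate_agent_sample(sample)
print(problems)
```

`ensure_ascii=False` keeps non-ASCII tool descriptions readable in the JSONL file, matching the samples in the patch. The check covers only the rules quoted above and says nothing about how a given agent_template renders the sample.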
diff --git a/docs/source/GetStarted/SWIFT安装.md b/docs/source/GetStarted/SWIFT安装.md
@@ -72,7 +72,7 @@ modelscope-registry.us-west-1.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu2
 | peft | >=0.11,<0.16 | ||
 | trl | >=0.13,<0.17 | 0.16 |RLHF|
 | deepspeed | >=0.14       | 0.14.5 |训练|
-| vllm | >=0.5.1      | 0.7.3/0.8.3 |推理/部署/评测|
+| vllm | >=0.5.1      | 0.7.3/0.8.4 |推理/部署/评测|
 | lmdeploy | >=0.5        | 0.7.2.post1 |推理/部署/评测|
 | evalscope | >=0.11       | |评测|
 

diff --git a/docs/source/Instruction/Agent支持.md b/docs/source/Instruction/Agent支持.md
@@ -0,0 +1,194 @@
+# Agent支持
+
+## 数据集格式
+
+纯文本Agent和多模态Agent的示例数据样本如下：
+```jsonl
+{"tools": ["{\"type\": \"function\", \"function\": {\"name\": \"realtime_aqi\", \"description\": \"天气预报。获取实时空气质量。当前空气质量，PM2.5，PM10信息\", \"parameters\": {\"type\": \"object\", \"properties\": {\"city\": {\"type\": \"string\", \"description\": \"城市名，例如：上海\"}}, \"required\": [\"city\"]}}}"], "messages": [{"role": "user", "content": "北京和上海今天的天气情况"}, {"role": "tool_call", "content": "{\"name\": \"realtime_aqi\", \"arguments\": {\"city\": \"北京\"}}"}, {"role": "tool_call", "content": "{\"name\": \"realtime_aqi\", \"arguments\": {\"city\": \"上海\"}}"}, {"role": "tool_response", "content": "{\"city\": \"北京\", \"aqi\": \"10\", \"unit\": \"celsius\"}"}, {"role": "tool_response", "content": "{\"city\": \"上海\", \"aqi\": \"72\", \"unit\": \"fahrenheit\"}"}, {"role": "assistant", "content": "根据天气预报工具，北京今天的空气质量指数为10，属于良好水平；上海今天的空气质量指数为72，属于轻度污染水平。"}]}
+{"tools": ["{\"type\": \"function\", \"function\": {\"name\": \"click\", \"description\": \"点击屏幕中的某个位置\", \"parameters\": {\"type\": \"object\", \"properties\": {\"x\": {\"type\": \"integer\", \"description\": \"横坐标，表示屏幕上的水平位置\"}, \"y\": {\"type\": \"integer\", \"description\": \"纵坐标，表示屏幕上的垂直位置\"}}, \"required\": [\"x\", \"y\"]}}}"], "messages": [{"role": "user", "content": "<image>现在几点了？"}, {"role": "assistant", "content": "<think>\n我可以通过打开日历App来获取当前时间。\n</think>\n"}, {"role": "tool_call", "content": "{\"name\": \"click\", \"arguments\": {\"x\": 105, \"y\": 132}}"}, {"role": "tool_response", "content": "{\"images\": \"<image>\", \"status\": \"success\"}"}, {"role": "assistant", "content": "成功打开日历App，现在的时间为中午11点"}], "images": ["desktop.png", "calendar.png"]}
+```
+- agent_template为"react_en", "hermes"等情况下，该格式适配所有模型Agent训练，可以轻松在不同模型间切换。
+- 其中tools是一个`List[str]`，其中每一个tool需要是json字符串，messages中role为'tool_call'和'tool_response/tool'的content部分都需要是json字符串。
+- tools字段将在训练/推理时和`{"role": "system", ...}`部分组合，根据agent_template组成完整的system部分。
+- `{"role": "tool_call", ...}`部分将根据agent_template自动转成对应格式的`{"role": "assistant", ...}`，多条连续的`{"role": "assistant", ...}`将拼接在一起组成完整的assistant_content。
+- `{"role": "tool_response", ...}`也可以写成`{"role": "tool", ...}`，这两种写法是等价的。该部分也将根据`agent_template`自动转换格式。该部分在训练时将不进行损失的计算，角色类似于`{"role": "user", ...}`。
+- 该格式支持并行调用工具，例子参考第一条数据样本。多模态Agent数据样本中`<image>`标签数量应与"images"长度相同，其标签位置代表图像特征的插入位置。当然也支持其他模态，例如audios, videos。
+
+以下为上述两条数据样本由qwen2_5和qwen2_5_vl的template进行encode后的input_ids和labels，选择的agent_template为**hermes**：
+
+样本一：
+```text
+[INPUT_IDS] <|im_start|>system
+You are Qwen, created by Alibaba Cloud. You are a helpful assistant.
+
+# Tools
+
+You may call one or more functions to assist with the user query.
+
+You are provided with function signatures within <tools></tools> XML tags:
+<tools>
+{"type": "function", "function": {"name": "realtime_aqi", "description": "天气预报。获取实时空气质量。当前空气质量，PM2.5，PM10信息", "parameters": {"type": "object", "properties": {"city": {"type": "string", "description": "城市名，例如：上海"}}, "required": ["city"]}}}
+</tools>
+
+For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
+<tool_call>
+{"name": <function-name>, "arguments": <args-json-object>}
+</tool_call><|im_end|>
+<|im_start|>user
+北京和上海今天的天气情况<|im_end|>
+<|im_start|>assistant
+<tool_call>
+{"name": "realtime_aqi", "arguments": {"city": "北京"}}
+</tool_call>
+<tool_call>
+{"name": "realtime_aqi", "arguments": {"city": "上海"}}
+</tool_call><|im_end|>
+<|im_start|>user
+<tool_response>
+{"city": "北京", "aqi": "10", "unit": "celsius"}
+</tool_response>
+<tool_response>
+{"city": "上海", "aqi": "72", "unit": "fahrenheit"}
+</tool_response><|im_end|>
+<|im_start|>assistant
+根据天气预报工具，北京今天的空气质量指数为10，属于良好水平；上海今天的空气质量指数为72，属于轻度污染水平。<|im_end|>
+
+[LABELS] [-100 * 195]<tool_call>
+{"name": "realtime_aqi", "arguments": {"city": "北京"}}
+</tool_call>
+<tool_call>
+{"name": "realtime_aqi", "arguments": {"city": "上海"}}
+</tool_call><|im_end|>[-100 * 67]根据天气预报工具，北京今天的空气质量指数为10，属于良好水平；上海今天的空气质量指数为72，属于轻度污染水平。<|im_end|>
+```
+
+样本二：
+```text
+[INPUT_IDS] <|im_start|>system
+You are a helpful assistant.
+
+# Tools
+
+You may call one or more functions to assist with the user query.
+
+You are provided with function signatures within <tools></tools> XML tags:
+<tools>
+{"type": "function", "function": {"name": "click", "description": "点击屏幕中的某个位置", "parameters": {"type": "object", "properties": {"x": {"type": "integer", "description": "横坐标，表示屏幕上的水平位置"}, "y": {"type": "integer", "description": "纵坐标，表示屏幕上的垂直位置"}}, "required": ["x", "y"]}}}
+</tools>
+
+For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
+<tool_call>
+{"name": <function-name>, "arguments": <args-json-object>}
+</tool_call><|im_end|>
+<|im_start|>user
+<|vision_start|>[151655 * 729]<|vision_end|>现在几点了？<|im_end|>
+<|im_start|>assistant
+<think>
+我可以通过打开日历App来获取当前时间。
+</think>
+<tool_call>
+{"name": "click", "arguments": {"x": 105, "y": 132}}
+</tool_call><|im_end|>
+<|im_start|>user
+<tool_response>
+{"images": "<|vision_start|>[151655 * 729]<|vision_end|>", "status": "success"}
+</tool_response><|im_end|>
+<|im_start|>assistant
+成功打开日历App，现在的时间为中午11点<|im_end|>
+
+[LABELS] [-100 * 924]<think>
+我可以通过打开日历App来获取当前时间。
+</think>
+<tool_call>
+{"name": "click", "arguments": {"x": 105, "y": 132}}
+</tool_call><|im_end|>[-100 * 759]成功打开日历App，现在的时间为中午11点<|im_end|>
+```
+
+**react_en**也是最常使用的agent template格式，以下为样本一由qwen2_5使用`agent_template='react_en'`进行encode后的input_ids和labels：
+
+```text
+[INPUT_IDS] <|im_start|>system
+Answer the following questions as best you can. You have access to the following tools:
+
+realtime_aqi: Call this tool to interact with the realtime_aqi API. What is the realtime_aqi API useful for? 天气预报。获取实时空气质量。当前空气质量，PM2.5，PM10信息 Parameters: {"type": "object", "properties": {"city": {"type": "string", "description": "城市名，例如：上海"}}, "required": ["city"]} Format the arguments as a JSON object.
+
+Use the following format:
+
+Question: the input question you must answer
+Thought: you should always think about what to do
+Action: the action to take, should be one of [realtime_aqi]
+Action Input: the input to the action
+Observation: the result of the action
+... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
+Thought: I now know the final answer
+Final Answer: the final answer to the original input question
+
+Begin!
+<|im_end|>
+<|im_start|>user
+北京和上海今天的天气情况<|im_end|>
+<|im_start|>assistant
+Action: realtime_aqi
+Action Input: {'city': '北京'}
+Action: realtime_aqi
+Action Input: {'city': '上海'}
+Observation:{"city": "北京", "aqi": "10", "unit": "celsius"}
+Observation:{"city": "上海", "aqi": "72", "unit": "fahrenheit"}
+根据天气预报工具，北京今天的空气质量指数为10，属于良好水平；上海今天的空气质量指数为72，属于轻度污染水平。<|im_end|>
+
+[LABELS] [-100 * 233]Action: realtime_aqi
+Action Input: {'city': '北京'}
+Action: realtime_aqi
+Action Input: {'city': '上海'}
+Observation:[-100 * 45]根据天气预报工具，北京今天的空气质量指数为10，属于良好水平；上海今天的空气质量指数为72，属于轻度污染水平。<|im_end|>
+```
+
+更多的agent template可选值参考[这里](https://github.com/modelscope/swift/blob/main/swift/plugin/agent_template/__init__.py).
+
+
+## tools格式
+tools字段提供了模型可以调用的API信息。你需要提供tools的名字，描述和参数，示例如下：
+
+```python
+tools = [{
+    'type': 'function',
+    'function': {
+        'name': 'get_current_weather',
+        'description': 'Get the current weather in a given location',
+        'parameters': {
+            'type': 'object',
+            'properties': {
+                'location': {
+                    'type': 'string',
+                    'description': 'The city and state, e.g. San Francisco, CA'
+                },
+                'unit': {
+                    'type': 'string',
+                    'enum': ['celsius', 'fahrenheit']
+                }
+            },
+            'required': ['location']
+        }
+    }
+}]
+```
+
+## loss_scale的使用
+
+loss_scale可以对模型输出部分的训练权重进行调节。例如在ReAct格式中，可以设置`--loss_scale react`（loss_scale配置文件书写在[这里](https://github.com/modelscope/swift/blob/main/swift/plugin/loss_scale/config/default_loss_scale_config.json)），该参数起到的作用是：
+
+'Thought:'和'Final Answer:'部分权重为1，'Action:'和'Action Input:'部分权重为2，'Observation:'字段本身权重为2，'Observation:'后面的工具调用结果权重为0。
+
+具体的loss_scale插件设计，请参考[插件化](../Customization/插件化.md)文档.
+
+
+## 训练
+参考[这里](https://github.com/modelscope/ms-swift/tree/main/examples/train/agent)，支持不同模型的丝滑切换。
+
+## 推理
+
+- 原始模型或者全参数训练参考[这里](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo_agent.py)。
+- LoRA训练参考[这里](https://github.com/modelscope/ms-swift/tree/main/examples/train/agent/loss_scale/infer.md)。
+
+## 部署
+
+参考[这里](https://github.com/modelscope/ms-swift/blob/main/examples/deploy/agent)。
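
Note on the loss_scale section of Agent支持.md above (this note is not part of the patch). The section gives the weights applied by `--loss_scale react`: 1 for the 'Thought:' and 'Final Answer:' parts, 2 for the 'Action:' and 'Action Input:' parts, 2 for the 'Observation:' keyword itself, and 0 for the tool result that follows it. The sketch below shows one way such a rule could split a response into weighted segments. `FIELD_WEIGHTS` and `split_react_response` are hypothetical names, and this is not the ms-swift implementation. It assumes every field is introduced by its keyword, so text after 'Observation:' stays at weight 0 until the next keyword appears.

```python
import re
from typing import List, Tuple

# keyword -> (weight of the keyword itself, weight of the text that follows it).
# Values follow the description of `--loss_scale react`; the table is illustrative.
FIELD_WEIGHTS = {
    "Thought:": (1.0, 1.0),
    "Final Answer:": (1.0, 1.0),
    "Action:": (2.0, 2.0),
    "Action Input:": (2.0, 2.0),
    "Observation:": (2.0, 0.0),
}
_PATTERN = re.compile("|".join(re.escape(keyword) for keyword in FIELD_WEIGHTS))


def split_react_response(text: str, default_weight: float = 1.0) -> List[Tuple[str, float]]:
    """Split a ReAct-style response into (segment, weight) pairs."""
    segments: List[Tuple[str, float]] = []
    cursor = 0
    body_weight = default_weight  # applies to text before the first keyword
    for match in _PATTERN.finditer(text):
        if match.start() > cursor:
            segments.append((text[cursor:match.start()], body_weight))
        keyword_weight, body_weight = FIELD_WEIGHTS[match.group()]
        segments.append((match.group(), keyword_weight))
        cursor = match.end()
    if cursor < len(text):
        segments.append((text[cursor:], body_weight))
    return segments


example = (
    "Thought: check the air quality\n"
    "Action: realtime_aqi\n"
    'Action Input: {"city": "Beijing"}\n'
    'Observation: {"aqi": "10"}\n'
)
segments = split_react_response(example)
for segment, weight in segments:
    print(repr(segment), weight)
```

In training, each segment's weight would multiply the per-token loss of the tokens it covers; mapping segments to token positions depends on the tokenizer and is left out here.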