Refactor Agent Template #3918

Merged
merged 57 commits into from
Apr 22, 2025
475611b
update
Jintao-Huang Apr 17, 2025
146003f
Merge branch 'main' into support_more_tool_template
Jintao-Huang Apr 17, 2025
c9e603c
update
Jintao-Huang Apr 17, 2025
055052d
update
Jintao-Huang Apr 17, 2025
dcc88fc
update
Jintao-Huang Apr 17, 2025
e8c5a15
Merge remote-tracking branch 'refs/remotes/origin/support_more_tool_t…
Jintao-Huang Apr 17, 2025
0246ffd
update
Jintao-Huang Apr 18, 2025
3f6c32d
update
Jintao-Huang Apr 18, 2025
665bc02
update
Jintao-Huang Apr 18, 2025
c08414c
update
Jintao-Huang Apr 18, 2025
9ab114e
update
Jintao-Huang Apr 19, 2025
216076f
update
Jintao-Huang Apr 19, 2025
74269dd
update
Jintao-Huang Apr 19, 2025
5bbae0b
update
Jintao-Huang Apr 19, 2025
c624a37
update
Jintao-Huang Apr 19, 2025
d795a3e
update
Jintao-Huang Apr 19, 2025
3dba386
update
Jintao-Huang Apr 19, 2025
a8e5b9f
update
Jintao-Huang Apr 19, 2025
0d51398
update
Jintao-Huang Apr 20, 2025
ed5d5df
update
Jintao-Huang Apr 20, 2025
8bca3c0
update
Jintao-Huang Apr 20, 2025
2febdd4
update
Jintao-Huang Apr 20, 2025
7b9b7c2
update
Jintao-Huang Apr 20, 2025
36bb090
update
Jintao-Huang Apr 20, 2025
eccf401
update
Jintao-Huang Apr 20, 2025
b859c57
Merge branch 'main' into support_more_tool_template
Jintao-Huang Apr 21, 2025
b3b03e2
update
Jintao-Huang Apr 21, 2025
ce55b19
update
Jintao-Huang Apr 21, 2025
1e072f8
update
Jintao-Huang Apr 21, 2025
4e363e5
update shell
Jintao-Huang Apr 21, 2025
f754dc5
update
Jintao-Huang Apr 21, 2025
2198ce0
update
Jintao-Huang Apr 21, 2025
b68fcce
update
Jintao-Huang Apr 21, 2025
61080e3
update
Jintao-Huang Apr 21, 2025
cc69b1e
update
Jintao-Huang Apr 21, 2025
ecab5fc
fix
Jintao-Huang Apr 22, 2025
a537e9e
Merge remote-tracking branch 'refs/remotes/origin/support_more_tool_t…
Jintao-Huang Apr 22, 2025
d4dbbef
update
Jintao-Huang Apr 22, 2025
b1db801
update
Jintao-Huang Apr 22, 2025
cca17fa
update
Jintao-Huang Apr 22, 2025
249f414
update
Jintao-Huang Apr 22, 2025
aa546b7
update
Jintao-Huang Apr 22, 2025
66d5afe
update
Jintao-Huang Apr 22, 2025
6b3e8ac
update
Jintao-Huang Apr 22, 2025
469e083
Merge remote-tracking branch 'refs/remotes/origin/support_more_tool_t…
Jintao-Huang Apr 22, 2025
ddf8735
update
Jintao-Huang Apr 22, 2025
695c032
update docs
Jintao-Huang Apr 22, 2025
4d05199
update
Jintao-Huang Apr 22, 2025
5a4cb1e
update
Jintao-Huang Apr 22, 2025
85823fa
update
Jintao-Huang Apr 22, 2025
9ba9561
update
Jintao-Huang Apr 22, 2025
6013571
update
Jintao-Huang Apr 22, 2025
01b4756
update
Jintao-Huang Apr 22, 2025
e63eeaf
update
Jintao-Huang Apr 22, 2025
2e05535
fix
Jintao-Huang Apr 22, 2025
89a0c26
update
Jintao-Huang Apr 22, 2025
6d84a2c
Merge branch 'main' into support_more_tool_template
Jintao-Huang Apr 22, 2025
2 changes: 1 addition & 1 deletion MANIFEST.in
@@ -2,4 +2,4 @@ recursive-include swift/utils *.py
recursive-include swift/llm/dataset/data *.*
recursive-include swift/llm/ds_config *.json
recursive-include requirements *.txt
recursive-include swift/plugin/agent *.json
recursive-include swift/plugin/loss_scale/config *.json
2 changes: 1 addition & 1 deletion README.md
@@ -126,7 +126,7 @@ Running Environment:
| peft | >=0.11,<0.16 | ||
| trl | >=0.13,<0.17 | 0.16 |RLHF|
| deepspeed | >=0.14 | 0.14.5 | Training |
| vllm | >=0.5.1 | 0.7.3/0.8.3 | Inference/Deployment/Evaluation |
| vllm | >=0.5.1 | 0.7.3/0.8.4 | Inference/Deployment/Evaluation |
| lmdeploy | >=0.5 | 0.7.2.post1 | Inference/Deployment/Evaluation |
| evalscope | >=0.11 | | Evaluation |

2 changes: 1 addition & 1 deletion README_CN.md
@@ -121,7 +121,7 @@ pip install -e .
| peft | >=0.11,<0.16 | ||
| trl | >=0.13,<0.17 | 0.16 |RLHF|
| deepspeed | >=0.14 | 0.14.5 |Training|
| vllm | >=0.5.1 | 0.7.3/0.8.3 |Inference/Deployment/Evaluation|
| vllm | >=0.5.1 | 0.7.3/0.8.4 |Inference/Deployment/Evaluation|
| lmdeploy | >=0.5 | 0.7.2.post1 |Inference/Deployment/Evaluation|
| evalscope | >=0.11 | |Evaluation|

Binary file removed docs/resources/image-20240201122323540.png
Binary file not shown.
Binary file removed docs/resources/image-20240201122441874.png
Binary file not shown.
Binary file removed docs/resources/image-20240201122625473.png
Binary file not shown.
Binary file removed docs/resources/image-20240201122725477.png
Binary file not shown.
Binary file removed docs/resources/image-20240201131811038.png
Binary file not shown.
Binary file removed docs/resources/image-20240201132124061.png
Binary file not shown.
Binary file removed docs/resources/image-20240201132139698.png
Binary file not shown.
Binary file removed docs/resources/image-20240201132308260.png
Binary file not shown.
Binary file removed docs/resources/image-20240201132421298.png
Binary file not shown.
Binary file removed docs/resources/image-20240201132454465.png
Binary file not shown.
38 changes: 6 additions & 32 deletions docs/source/Customization/插件化.md
@@ -41,7 +41,7 @@ def loss_scale_func(outputs, labels, loss_scale=None, num_items_in_batch=None) -

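## Customizing loss_scale

An example is [here](https://github.com/modelscope/swift/blob/main/swift/plugin/loss_scale.py).
An example is [here](https://github.com/modelscope/swift/blob/main/swift/plugin/loss_scale/loss_scale.py).

The loss_scale mechanism is one of the most important mechanisms in SWIFT. In pt and sft tasks, the loss over trainable tokens is uniform, i.e. every token contributes equally to backpropagation. In some cases, however, certain tokens matter more and need extra attention,
so they should be given a higher weight. loss_scale lets developers freely define their own token weights.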
@@ -84,6 +84,7 @@ def get_metric(metric: str):
## Customizing the optimizer

An example is [here](https://github.com/modelscope/swift/blob/main/swift/plugin/optimizer.py).
- Use different learning rates for different parts of the model, e.g. separate learning rates for the ViT and the LLM; see [here](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/lora_llm_full_vit/custom_plugin.py).

Users can add their own optimizer and lr_scheduler implementations here:
```python
@@ -103,42 +104,15 @@ optimizers_map = {
```
After that, it can be called directly.

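## Customizing tools
## Customizing the agent template

An example is [here](https://github.com/modelscope/swift/blob/main/swift/plugin/tools.py).

Here you can define the tools format for Agent training. The tools format specifies how tools are listed in the system field during training and inference. For example, glm4 has its own tools format: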
```python
import json


def format_glm4(tool_names, tool_descs):
    GLM4_PROMPT = """你是一个名为 ChatGLM 的人工智能助手。你是基于智谱AI训练的语言模型 GLM-4 模型开发的,你的任务是针对用户的问题和要求提供适当的答复和支持。

# 可用工具

{tool_list}"""
    # Serialize any non-string tool description to a JSON string.
    tool_descs = [json.dumps(t) if not isinstance(t, str) else t for t in tool_descs]
    tool_list = ''
    for name, tool in zip(tool_names, tool_descs):
        tool_list += f'## {name}\n\n{tool}\n\n'
    return GLM4_PROMPT.format(tool_list=tool_list)
```
The complete format in the system field looks like this:
```text
你是一个名为 ChatGLM 的人工智能助手。你是基于智谱AI训练的语言模型 GLM-4 模型开发的,你的任务是针对用户的问题和要求提供适当的答复和支持。

# 可用工具

## 查看天气

...

## 搜索网络

...
```
An example is [here](https://github.com/modelscope/swift/blob/main/swift/plugin/agent_template).

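## Customizing the tuner

An example is [here](https://github.com/modelscope/swift/blob/main/swift/plugin/tuner.py).
- For multimodal models, train the ViT with full parameters and the LLM with LoRA; see [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/multimodal/lora_llm_full_vit).
- For Phi4-multimodal, train its existing LoRA directly without attaching an additional LoRA; see [here](https://github.com/modelscope/ms-swift/blob/main/examples/train/plugins/tuner_phi4_mm.sh).

Tuner customization is another distinctive capability of swift: developers can bypass the complex tuner initialization flow and the cost of code integration by registering a new tuner here: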
```python
15 changes: 12 additions & 3 deletions docs/source/Customization/自定义数据集.md
@@ -163,9 +163,18 @@ query-response format:
```

### Agent format

For the Agent format, see the [Agent documentation](../Instruction/智能体的支持.md).

Example data samples for a text-only Agent and a multimodal Agent are given below:
```jsonl
{"tools": ["{\"type\": \"function\", \"function\": {\"name\": \"realtime_aqi\", \"description\": \"天气预报。获取实时空气质量。当前空气质量,PM2.5,PM10信息\", \"parameters\": {\"type\": \"object\", \"properties\": {\"city\": {\"type\": \"string\", \"description\": \"城市名,例如:上海\"}}, \"required\": [\"city\"]}}}"], "messages": [{"role": "user", "content": "北京和上海今天的天气情况"}, {"role": "tool_call", "content": "{\"name\": \"realtime_aqi\", \"arguments\": {\"city\": \"北京\"}}"}, {"role": "tool_call", "content": "{\"name\": \"realtime_aqi\", \"arguments\": {\"city\": \"上海\"}}"}, {"role": "tool_response", "content": "{\"city\": \"北京\", \"aqi\": \"10\", \"unit\": \"celsius\"}"}, {"role": "tool_response", "content": "{\"city\": \"上海\", \"aqi\": \"72\", \"unit\": \"fahrenheit\"}"}, {"role": "assistant", "content": "根据天气预报工具,北京今天的空气质量指数为10,属于良好水平;上海今天的空气质量指数为72,属于轻度污染水平。"}]}
{"tools": ["{\"type\": \"function\", \"function\": {\"name\": \"click\", \"description\": \"点击屏幕中的某个位置\", \"parameters\": {\"type\": \"object\", \"properties\": {\"x\": {\"type\": \"integer\", \"description\": \"横坐标,表示屏幕上的水平位置\"}, \"y\": {\"type\": \"integer\", \"description\": \"纵坐标,表示屏幕上的垂直位置\"}}, \"required\": [\"x\", \"y\"]}}}"], "messages": [{"role": "user", "content": "<image>现在几点了?"}, {"role": "assistant", "content": "<think>\n我可以通过打开日历App来获取当前时间。\n</think>\n"}, {"role": "tool_call", "content": "{\"name\": \"click\", \"arguments\": {\"x\": 105, \"y\": 132}}"}, {"role": "tool_response", "content": "{\"images\": \"<image>\", \"status\": \"success\"}"}, {"role": "assistant", "content": "成功打开日历App,现在的时间为中午11点"}], "images": ["desktop.png", "calendar.png"]}
```
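- When agent_template is "react_en", "hermes", etc., this format works for Agent training with all models, making it easy to switch between models.
- `tools` is a `List[str]` in which each tool must be a JSON string; the content of messages whose role is 'tool_call' or 'tool_response/tool' must also be a JSON string.
- During training/inference, the tools field is combined with the `{"role": "system", ...}` part and assembled into the complete system section according to agent_template.
- The `{"role": "tool_call", ...}` part is automatically converted into a `{"role": "assistant", ...}` message in the format dictated by agent_template; consecutive `{"role": "assistant", ...}` messages are concatenated to form the complete assistant_content.
- `{"role": "tool_response", ...}` can also be written as `{"role": "tool", ...}`; the two are equivalent. This part is likewise converted automatically according to `agent_template`. It is excluded from the loss computation during training and plays a role similar to `{"role": "user", ...}`.
- This format supports parallel tool calls; see the first data sample for an example. In multimodal Agent samples, the number of `<image>` tags must equal the length of "images", and each tag's position marks where the image features are inserted. Other modalities such as audios and videos are supported as well.
- For more, see the [Agent documentation](../Instruction/Agent支持.md).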
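
As a rough illustration of the string-encoding rules in the list above, the sketch below builds one JSONL line with the standard `json` module. The helper name `make_agent_sample` and the shortened tool schema are hypothetical and not part of ms-swift; only the field names (`tools`, `messages`, `tool_call`, `tool_response`) are taken from the samples above.

```python
import json


def make_agent_sample(tools, messages):
    """Return one dataset record with tools and tool messages encoded as JSON strings."""
    json_roles = {"tool_call", "tool_response", "tool"}
    encoded = []
    for msg in messages:
        content = msg["content"]
        # Tool calls and tool responses must carry a JSON string, not a dict.
        if msg["role"] in json_roles and not isinstance(content, str):
            content = json.dumps(content, ensure_ascii=False)
        encoded.append({"role": msg["role"], "content": content})
    return {
        "tools": [t if isinstance(t, str) else json.dumps(t, ensure_ascii=False) for t in tools],
        "messages": encoded,
    }


# Shortened, hypothetical tool schema used only for this illustration.
aqi_tool = {
    "type": "function",
    "function": {
        "name": "realtime_aqi",
        "description": "Get real-time air quality.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

sample = make_agent_sample(
    [aqi_tool],
    [
        {"role": "user", "content": "Air quality in Beijing today?"},
        {"role": "tool_call", "content": {"name": "realtime_aqi", "arguments": {"city": "Beijing"}}},
        {"role": "tool_response", "content": {"city": "Beijing", "aqi": "10"}},
        {"role": "assistant", "content": "The AQI in Beijing is 10."},
    ],
)
line = json.dumps(sample, ensure_ascii=False)  # one line of the JSONL file
```

Passing `ensure_ascii=False` keeps non-ASCII text (such as the Chinese descriptions in the samples above) readable in the output file instead of escaping it.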

## dataset_info.json

2 changes: 1 addition & 1 deletion docs/source/GetStarted/SWIFT安装.md
@@ -72,7 +72,7 @@ modelscope-registry.us-west-1.cr.aliyuncs.com/modelscope-repo/modelscope:ubuntu2
| peft | >=0.11,<0.16 | ||
| trl | >=0.13,<0.17 | 0.16 |RLHF|
| deepspeed | >=0.14 | 0.14.5 |Training|
| vllm | >=0.5.1 | 0.7.3/0.8.3 |Inference/Deployment/Evaluation|
| vllm | >=0.5.1 | 0.7.3/0.8.4 |Inference/Deployment/Evaluation|
| lmdeploy | >=0.5 | 0.7.2.post1 |Inference/Deployment/Evaluation|
| evalscope | >=0.11 | |Evaluation|

194 changes: 194 additions & 0 deletions docs/source/Instruction/Agent支持.md
@@ -0,0 +1,194 @@
# Agent Support

## Dataset format

Example data samples for a text-only Agent and a multimodal Agent are as follows:
```jsonl
{"tools": ["{\"type\": \"function\", \"function\": {\"name\": \"realtime_aqi\", \"description\": \"天气预报。获取实时空气质量。当前空气质量,PM2.5,PM10信息\", \"parameters\": {\"type\": \"object\", \"properties\": {\"city\": {\"type\": \"string\", \"description\": \"城市名,例如:上海\"}}, \"required\": [\"city\"]}}}"], "messages": [{"role": "user", "content": "北京和上海今天的天气情况"}, {"role": "tool_call", "content": "{\"name\": \"realtime_aqi\", \"arguments\": {\"city\": \"北京\"}}"}, {"role": "tool_call", "content": "{\"name\": \"realtime_aqi\", \"arguments\": {\"city\": \"上海\"}}"}, {"role": "tool_response", "content": "{\"city\": \"北京\", \"aqi\": \"10\", \"unit\": \"celsius\"}"}, {"role": "tool_response", "content": "{\"city\": \"上海\", \"aqi\": \"72\", \"unit\": \"fahrenheit\"}"}, {"role": "assistant", "content": "根据天气预报工具,北京今天的空气质量指数为10,属于良好水平;上海今天的空气质量指数为72,属于轻度污染水平。"}]}
{"tools": ["{\"type\": \"function\", \"function\": {\"name\": \"click\", \"description\": \"点击屏幕中的某个位置\", \"parameters\": {\"type\": \"object\", \"properties\": {\"x\": {\"type\": \"integer\", \"description\": \"横坐标,表示屏幕上的水平位置\"}, \"y\": {\"type\": \"integer\", \"description\": \"纵坐标,表示屏幕上的垂直位置\"}}, \"required\": [\"x\", \"y\"]}}}"], "messages": [{"role": "user", "content": "<image>现在几点了?"}, {"role": "assistant", "content": "<think>\n我可以通过打开日历App来获取当前时间。\n</think>\n"}, {"role": "tool_call", "content": "{\"name\": \"click\", \"arguments\": {\"x\": 105, \"y\": 132}}"}, {"role": "tool_response", "content": "{\"images\": \"<image>\", \"status\": \"success\"}"}, {"role": "assistant", "content": "成功打开日历App,现在的时间为中午11点"}], "images": ["desktop.png", "calendar.png"]}
```
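- When agent_template is "react_en", "hermes", etc., this format works for Agent training with all models, making it easy to switch between models.
- `tools` is a `List[str]` in which each tool must be a JSON string; the content of messages whose role is 'tool_call' or 'tool_response/tool' must also be a JSON string.
- During training/inference, the tools field is combined with the `{"role": "system", ...}` part and assembled into the complete system section according to agent_template.
- The `{"role": "tool_call", ...}` part is automatically converted into a `{"role": "assistant", ...}` message in the format dictated by agent_template; consecutive `{"role": "assistant", ...}` messages are concatenated to form the complete assistant_content.
- `{"role": "tool_response", ...}` can also be written as `{"role": "tool", ...}`; the two are equivalent. This part is likewise converted automatically according to `agent_template`. It is excluded from the loss computation during training and plays a role similar to `{"role": "user", ...}`.
- This format supports parallel tool calls; see the first data sample for an example. In multimodal Agent samples, the number of `<image>` tags must equal the length of "images", and each tag's position marks where the image features are inserted. Other modalities such as audios and videos are supported as well.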
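
A quick sanity check along the lines of the rules above might look like the following sketch. `check_agent_sample` is a hypothetical helper, not an ms-swift API, and it only covers the `<image>` modality; it verifies that tools and tool messages are JSON strings and that the `<image>` tag count matches the length of `images`.

```python
import json

JSON_ROLES = ("tool_call", "tool_response", "tool")


def _is_json_string(value):
    """True if value is a str that parses as JSON."""
    if not isinstance(value, str):
        return False
    try:
        json.loads(value)
    except ValueError:
        return False
    return True


def check_agent_sample(sample):
    """Return a list of problems found in one agent-format sample (empty if none)."""
    problems = []
    for i, tool in enumerate(sample.get("tools", [])):
        if not _is_json_string(tool):
            problems.append(f"tools[{i}] is not a JSON string")
    n_tags = 0
    for i, msg in enumerate(sample["messages"]):
        content = msg["content"]
        if isinstance(content, str):
            n_tags += content.count("<image>")
        if msg["role"] in JSON_ROLES and not _is_json_string(content):
            problems.append(f"messages[{i}] ({msg['role']}) content is not a JSON string")
    n_images = len(sample.get("images", []))
    if n_tags != n_images:
        problems.append(f"{n_tags} <image> tags but {n_images} images")
    return problems


good = {
    "tools": [json.dumps({"type": "function", "function": {"name": "click"}})],
    "messages": [
        {"role": "user", "content": "<image>What time is it?"},
        {"role": "tool_call", "content": json.dumps({"name": "click", "arguments": {"x": 105, "y": 132}})},
        {"role": "tool_response", "content": json.dumps({"images": "<image>", "status": "success"})},
        {"role": "assistant", "content": "It is 11 am."},
    ],
    "images": ["desktop.png", "calendar.png"],
}
bad = dict(good, images=["desktop.png"])  # one image missing

good_problems = check_agent_sample(good)
bad_problems = check_agent_sample(bad)
```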

The following shows the input_ids and labels produced by encoding the two JSONL data samples above with the qwen2_5 and qwen2_5_vl templates, with **hermes** selected as the agent_template:

Sample 1:
```text
[INPUT_IDS] <|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.

# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "realtime_aqi", "description": "天气预报。获取实时空气质量。当前空气质量,PM2.5,PM10信息", "parameters": {"type": "object", "properties": {"city": {"type": "string", "description": "城市名,例如:上海"}}, "required": ["city"]}}}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call><|im_end|>
<|im_start|>user
北京和上海今天的天气情况<|im_end|>
<|im_start|>assistant
<tool_call>
{"name": "realtime_aqi", "arguments": {"city": "北京"}}
</tool_call>
<tool_call>
{"name": "realtime_aqi", "arguments": {"city": "上海"}}
</tool_call><|im_end|>
<|im_start|>user
<tool_response>
{"city": "北京", "aqi": "10", "unit": "celsius"}
</tool_response>
<tool_response>
{"city": "上海", "aqi": "72", "unit": "fahrenheit"}
</tool_response><|im_end|>
<|im_start|>assistant
根据天气预报工具,北京今天的空气质量指数为10,属于良好水平;上海今天的空气质量指数为72,属于轻度污染水平。<|im_end|>

[LABELS] [-100 * 195]<tool_call>
{"name": "realtime_aqi", "arguments": {"city": "北京"}}
</tool_call>
<tool_call>
{"name": "realtime_aqi", "arguments": {"city": "上海"}}
</tool_call><|im_end|>[-100 * 67]根据天气预报工具,北京今天的空气质量指数为10,属于良好水平;上海今天的空气质量指数为72,属于轻度污染水平。<|im_end|>
```
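
To make the grouping visible in the dump above easier to follow, here is a minimal sketch of how `tool_call` and `tool_response` messages could be wrapped and merged into chat turns. It is an illustration inferred from the encoded text, not the ms-swift implementation: the function name `render_hermes_turns` is hypothetical, and the sketch leaves out the system prompt and the `<|im_start|>`/`<|im_end|>` tokens.

```python
def render_hermes_turns(messages):
    """Wrap tool messages in hermes-style tags and merge them into chat turns."""
    turns = []        # list of [role, text]
    prev_kind = None  # normalized role of the previous message
    for msg in messages:
        kind = "tool_response" if msg["role"] == "tool" else msg["role"]
        text = msg["content"]
        if kind == "tool_call":
            role, text = "assistant", f"<tool_call>\n{text}\n</tool_call>"
        elif kind == "tool_response":
            role, text = "user", f"<tool_response>\n{text}\n</tool_response>"
        else:
            role = kind
        if turns and kind in ("tool_call", "tool_response") and kind == prev_kind:
            # Parallel calls/responses share one turn, separated by a newline.
            turns[-1][1] += "\n" + text
        elif turns and role == "assistant" and turns[-1][0] == "assistant":
            # Consecutive assistant content is concatenated as-is.
            turns[-1][1] += text
        else:
            turns.append([role, text])
        prev_kind = kind
    return turns


messages = [
    {"role": "user", "content": "Air quality in Beijing and Shanghai today?"},
    {"role": "tool_call", "content": '{"name": "realtime_aqi", "arguments": {"city": "Beijing"}}'},
    {"role": "tool_call", "content": '{"name": "realtime_aqi", "arguments": {"city": "Shanghai"}}'},
    {"role": "tool_response", "content": '{"city": "Beijing", "aqi": "10"}'},
    {"role": "tool", "content": '{"city": "Shanghai", "aqi": "72"}'},
    {"role": "assistant", "content": "Beijing: AQI 10. Shanghai: AQI 72."},
]
turns = render_hermes_turns(messages)
for role, text in turns:
    print(f"[{role}]\n{text}\n")
```

With these six messages the sketch yields four turns (user, assistant, user, assistant), mirroring the turn structure of the encoded sample.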

Sample 2:
```text
[INPUT_IDS] <|im_start|>system
You are a helpful assistant.

# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "click", "description": "点击屏幕中的某个位置", "parameters": {"type": "object", "properties": {"x": {"type": "integer", "description": "横坐标,表示屏幕上的水平位置"}, "y": {"type": "integer", "description": "纵坐标,表示屏幕上的垂直位置"}}, "required": ["x", "y"]}}}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call><|im_end|>
<|im_start|>user
<|vision_start|>[151655 * 729]<|vision_end|>现在几点了?<|im_end|>
<|im_start|>assistant
<think>
我可以通过打开日历App来获取当前时间。
</think>
<tool_call>
{"name": "click", "arguments": {"x": 105, "y": 132}}
</tool_call><|im_end|>
<|im_start|>user
<tool_response>
{"images": "<|vision_start|>[151655 * 729]<|vision_end|>", "status": "success"}
</tool_response><|im_end|>
<|im_start|>assistant
成功打开日历App,现在的时间为中午11点<|im_end|>

[LABELS] [-100 * 924]<think>
我可以通过打开日历App来获取当前时间。
</think>
<tool_call>
{"name": "click", "arguments": {"x": 105, "y": 132}}
</tool_call><|im_end|>[-100 * 759]成功打开日历App,现在的时间为中午11点<|im_end|>
```
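
In the `[LABELS]` dumps above, system, user and tool-response tokens show up as runs of `-100`, while assistant tokens keep their ids. The sketch below shows one way such labels could be assembled. The token ids and the `build_labels` helper are made up for illustration; `-100` is the default `ignore_index` of PyTorch's `CrossEntropyLoss`, so those positions contribute nothing to the loss.

```python
IGNORE_INDEX = -100  # positions with this label are skipped by the loss


def build_labels(segments):
    """Build (input_ids, labels) from ordered (token_ids, trainable) segments."""
    input_ids, labels = [], []
    for token_ids, trainable in segments:
        input_ids.extend(token_ids)
        # Only trainable (assistant) segments keep their token ids as labels.
        labels.extend(token_ids if trainable else [IGNORE_INDEX] * len(token_ids))
    return input_ids, labels


# Hypothetical token ids: prompt, assistant tool call, tool response, final answer.
segments = [
    ([11, 12, 13], False),
    ([21, 22], True),
    ([31, 32, 33, 34], False),
    ([41], True),
]
input_ids, labels = build_labels(segments)
print(labels)
```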

**react_en** is also one of the most commonly used agent template formats. The following shows the input_ids and labels produced by encoding sample 1 with qwen2_5 and `agent_template='react_en'`:

```text
[INPUT_IDS] <|im_start|>system
Answer the following questions as best you can. You have access to the following tools:

realtime_aqi: Call this tool to interact with the realtime_aqi API. What is the realtime_aqi API useful for? 天气预报。获取实时空气质量。当前空气质量,PM2.5,PM10信息 Parameters: {"type": "object", "properties": {"city": {"type": "string", "description": "城市名,例如:上海"}}, "required": ["city"]} Format the arguments as a JSON object.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [realtime_aqi]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!
<|im_end|>
<|im_start|>user
北京和上海今天的天气情况<|im_end|>
<|im_start|>assistant
Action: realtime_aqi
Action Input: {'city': '北京'}
Action: realtime_aqi
Action Input: {'city': '上海'}
Observation:{"city": "北京", "aqi": "10", "unit": "celsius"}
Observation:{"city": "上海", "aqi": "72", "unit": "fahrenheit"}
根据天气预报工具,北京今天的空气质量指数为10,属于良好水平;上海今天的空气质量指数为72,属于轻度污染水平。<|im_end|>

[LABELS] [-100 * 233]Action: realtime_aqi
Action Input: {'city': '北京'}
Action: realtime_aqi
Action Input: {'city': '上海'}
Observation:[-100 * 45]根据天气预报工具,北京今天的空气质量指数为10,属于良好水平;上海今天的空气质量指数为72,属于轻度污染水平。<|im_end|>
```
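
At inference time, text in this `Action:` / `Action Input:` layout has to be turned back into structured tool calls. The following is a minimal parsing sketch, assuming the layout shown in the dump above; `parse_react_actions` is a hypothetical helper and not the parser ms-swift uses. Because the dump shows Python-style single quotes in `Action Input`, the sketch falls back to `ast.literal_eval` when `json.loads` fails.

```python
import ast
import json
import re

# "Action: <name>" on one line, followed by "Action Input: <args>" on the next.
ACTION_RE = re.compile(r"Action:\s*(?P<name>.+?)\s*\nAction Input:\s*(?P<args>.+)")


def parse_react_actions(text):
    """Extract tool calls from ReAct-style text as a list of dicts."""
    calls = []
    for match in ACTION_RE.finditer(text):
        raw = match.group("args").strip()
        try:
            args = json.loads(raw)
        except ValueError:
            # Single-quoted dicts are valid Python literals but not valid JSON.
            args = ast.literal_eval(raw)
        calls.append({"name": match.group("name"), "arguments": args})
    return calls


output = (
    "Action: realtime_aqi\n"
    "Action Input: {'city': 'Beijing'}\n"
    "Action: realtime_aqi\n"
    "Action Input: {'city': 'Shanghai'}\n"
)
calls = parse_react_actions(output)
print(calls)
```

A production parser would also need to handle malformed arguments, which this sketch lets propagate as exceptions.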

For more agent template options, see [here](https://github.com/modelscope/swift/blob/main/swift/plugin/agent_template/__init__.py).


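## tools format
The tools field provides information about the APIs the model can call. You need to provide each tool's name, description, and parameters, as in the following example: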

```python
tools = [{
    'type': 'function',
    'function': {
        'name': 'get_current_weather',
        'description': 'Get the current weather in a given location',
        'parameters': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': 'The city and state, e.g. San Francisco, CA'
                },
                'unit': {
                    'type': 'string',
                    'enum': ['celsius', 'fahrenheit']
                }
            },
            'required': ['location']
        }
    }
}]
```
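
Since each tool declares its parameters in a JSON-Schema-like structure, the arguments of a generated tool call can be checked against it before the tool is executed. The sketch below is one simple way to do that under stated assumptions: `check_arguments` is a hypothetical helper, and it only looks at `required`, unknown names and `enum`, not at full JSON Schema validation.

```python
def check_arguments(tool, arguments):
    """Return a list of problems with `arguments` for the given tool definition."""
    params = tool["function"]["parameters"]
    props = params.get("properties", {})
    errors = [
        f"missing required argument: {name}"
        for name in params.get("required", [])
        if name not in arguments
    ]
    for name, value in arguments.items():
        if name not in props:
            errors.append(f"unknown argument: {name}")
        elif "enum" in props[name] and value not in props[name]["enum"]:
            errors.append(f"invalid value for {name}: {value!r}")
    return errors


# Same structure as the tools example above.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    },
}

ok = check_arguments(weather_tool, {"location": "San Francisco, CA", "unit": "celsius"})
errors = check_arguments(weather_tool, {"unit": "kelvin"})
print(ok, errors)
```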

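## Using loss_scale

loss_scale adjusts the training weight of different parts of the model output. For example, with the ReAct format you can set `--loss_scale react` (the loss_scale config file is [here](https://github.com/modelscope/swift/blob/main/swift/plugin/loss_scale/config/default_loss_scale_config.json)). This setting has the following effect:

The 'Thought:' and 'Final Answer:' parts have weight 1, the 'Action:' and 'Action Input:' parts have weight 2, the 'Observation:' field itself has weight 2, and the tool-call result that follows 'Observation:' has weight 0.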
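
The weighting described above can be pictured as splitting the response at each keyword and attaching a weight to every piece. The sketch below does this with a regular expression. It is an illustration only: the `(keyword weight, following-text weight)` tuple layout and the `split_with_weights` helper are assumptions made for this example, and the actual configuration lives in the JSON file linked above.

```python
import re

# keyword -> (weight of the keyword itself, weight of the text that follows it)
REACT_WEIGHTS = {
    "Thought:": (1.0, 1.0),
    "Final Answer:": (1.0, 1.0),
    "Action:": (2.0, 2.0),
    "Action Input:": (2.0, 2.0),
    "Observation:": (2.0, 0.0),
}


def split_with_weights(text, weights=REACT_WEIGHTS, default=1.0):
    """Split text at the keywords and return (segment, weight) pairs in order."""
    keys = sorted(weights, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    segments, pos, follow = [], 0, default
    for match in pattern.finditer(text):
        if match.start() > pos:
            # Text between keywords inherits the weight set by the last keyword.
            segments.append((text[pos:match.start()], follow))
        key_weight, follow = weights[match.group()]
        segments.append((match.group(), key_weight))
        pos = match.end()
    if pos < len(text):
        segments.append((text[pos:], follow))
    return segments


text = "Action: realtime_aqi\nObservation: 10\nFinal Answer: good"
segments = split_with_weights(text)
for segment, weight in segments:
    print(repr(segment), weight)
```

In a training loop, per-token losses would then be multiplied by the weight of the segment each token falls in, so the tool result after `Observation:` contributes nothing.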

For the detailed design of the loss_scale plugin, see the [Pluginization](../Customization/插件化.md) documentation.


## Training
See [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/agent); switching between different models is seamless.

## Inference

- For the original model or a model trained with full parameters, see [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo_agent.py).
- For LoRA training, see [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/agent/loss_scale/infer.md).

## Deployment

See [here](https://github.com/modelscope/ms-swift/blob/main/examples/deploy/agent).