When training a reward model with qwen2.5-vl, how do I set the parameters needed by RewardConfig/RewardTrainer on the command line? #3916

Rexwangchao · 2025-04-17T09:47:28Z

Describe the bug
RewardTrainer supports setting the coeffi parameter.

But how can this parameter be passed through the swift rlhf --rlhf_type rm ... command?
If it is not set, the args.json file shows:
{ ..., "training_args": "RewardConfig(...,center_rewards_coefficient=None,...)" }
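
To check whether the value actually reached the trainer, one option is to read it back from args.json. The following is only a sketch, assuming that `training_args` is stored as the string form of `RewardConfig(...)` as in the excerpt above. The helper name `read_center_rewards_coefficient` and the sample fields `output_dir` and `seed` are hypothetical and used only for illustration.

```python
import json
import re


def read_center_rewards_coefficient(args_json_text):
    """Return the center_rewards_coefficient recorded in args.json text, or None.

    Assumes "training_args" holds a string such as
    "RewardConfig(..., center_rewards_coefficient=None, ...)".
    """
    args = json.loads(args_json_text)
    training_args = args.get("training_args", "")
    # Capture everything after "center_rewards_coefficient=" up to the next
    # comma, closing parenthesis, or whitespace.
    match = re.search(r"center_rewards_coefficient=([^,)\s]+)", training_args)
    if match is None or match.group(1) == "None":
        return None
    return float(match.group(1))


# Hypothetical sample contents; a real args.json has many more fields.
sample_unset = json.dumps(
    {"training_args": "RewardConfig(output_dir='out',center_rewards_coefficient=None,seed=42)"}
)
sample_set = json.dumps(
    {"training_args": "RewardConfig(output_dir='out',center_rewards_coefficient=0.01,seed=42)"}
)

print(read_center_rewards_coefficient(sample_unset))
print(read_center_rewards_coefficient(sample_set))
```

If the function returns None after a run that was supposed to set the coefficient, the argument was most likely not forwarded by the installed version.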


hjh0119 · 2025-04-17T12:28:29Z

#3917 You can now pass it using --center_rewards_coefficient.
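
As a rough illustration of the reply above, the command line can be assembled like this once the installed version includes #3917. This is a sketch under assumptions: the model ID, the dataset path, and the value 0.01 are placeholders, the helper `build_rm_command` is hypothetical, and `--model` and `--dataset` are assumed to be the usual ms-swift arguments. Only `swift rlhf --rlhf_type rm` and `--center_rewards_coefficient` come from this thread.

```python
import shlex


def build_rm_command(model, dataset, center_rewards_coefficient):
    """Build the argv list for a reward-model training run (not executed here)."""
    return [
        "swift", "rlhf",
        "--rlhf_type", "rm",
        "--model", model,          # assumed ms-swift argument
        "--dataset", dataset,      # assumed ms-swift argument
        "--center_rewards_coefficient", str(center_rewards_coefficient),
    ]


# Placeholder values for illustration only.
cmd = build_rm_command(
    "Qwen/Qwen2.5-VL-7B-Instruct",
    "path/to/preference_data.jsonl",
    0.01,
)

# Print a shell-safe command line that can be copied into a terminal.
print(shlex.join(cmd))
```

The list form can also be passed directly to `subprocess.run(cmd, check=True)` if launching the run from Python is preferred.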

Rexwangchao · 2025-04-18T02:15:10Z

It still doesn't work.

P.S. The version I'm using is

Rexwangchao · 2025-04-18T02:19:40Z

Sorry, I didn't notice that you had already fixed it. I'll update my version and try again. @hjh0119

hjh0119 mentioned this issue Apr 17, 2025

add rm center_rewards_coefficient argument #3917

Merged

4 tasks

hjh0119 closed this as completed Apr 17, 2025
