Asynchronous GRPO training in multi-node, multi-GPU setups #3817

KB-Ding · 2025-04-09T11:37:04Z

Describe the feature
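Currently, asynchronous GRPO training only comes with a single-node example script. For the multi-node case, I would like answers to the following questions: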

Thanks.

Kyrie666 · 2025-04-11T00:49:05Z

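Same question here. I am currently training a 32B multimodal model, with the GRPO part running distributed under DeepSpeed. I would like to deploy the model on one dedicated node and run training on the other nodes. Could you provide an example? Thanks @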

kangyishuai · 2025-04-11T03:34:17Z

+1

hjh0119 added the enhancement (New feature or request) label Apr 11, 2025

hjh0119 mentioned this issue Apr 22, 2025

Decouple vLLM engine and GRPOTrainer. #3911

Merged

4 tasks

hjh0119 closed this as completed Apr 22, 2025
