
Asynchronous GRPO training in multi-node, multi-GPU setups #3817

Closed
KB-Ding opened this issue Apr 9, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@KB-Ding

KB-Ding commented Apr 9, 2025

Describe the feature
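Currently, only a single-node example script is provided for asynchronous GRPO training. For multi-node setups, I would appreciate answers to the following questions:

  1. Can asynchronous training be enabled in a multi-node setup, and is there an example script?
  2. In a multi-node setup, must vLLM inference share nodes with training? Is it possible to run only vLLM on node1 and only training on node2? I did not see any code that handles the communication for this.

Thanks.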

@Kyrie666

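Same question. I am currently training a 32B multimodal model, using DeepSpeed for distributed training in the GRPO stage. I would like to deploy the model on a dedicated node and run training on the other nodes. Could you provide an example? Thanks @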

@kangyishuai

+1

@hjh0119 hjh0119 added the enhancement New feature or request label Apr 11, 2025
@hjh0119 hjh0119 closed this as completed Apr 22, 2025

4 participants