
Can the async mode in GRPO support tensor_parallel_size > 1? #3712

Closed
kangyishuai opened this issue Mar 28, 2025 · 0 comments
Labels
enhancement New feature or request

Comments

@kangyishuai

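For example:
Visible GPUs = 8
Training GPUs = 6
Inference GPUs = 2
tensor_parallel_size = 2
With this split, a model with a large parameter count, or one that processes long token sequences, would have enough GPU memory for inference.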
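
To make the requested layout concrete, here is a minimal sketch of the device arithmetic in plain Python. It is not the project's API: `GpuPlan` and `plan_gpus` are hypothetical names introduced only for illustration, and the sketch assumes that training takes the lowest device indices and that inference devices are grouped into contiguous tensor-parallel groups.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GpuPlan:
    """Hypothetical container describing how visible GPUs are divided."""
    train_devices: list[int]
    inference_groups: list[list[int]]  # one inner list per tensor-parallel group


def plan_gpus(num_visible: int, num_train: int, num_infer: int,
              tensor_parallel_size: int) -> GpuPlan:
    """Split visible GPUs into training devices and tensor-parallel inference groups.

    Assumption: training uses the first `num_train` device indices and
    inference uses the next `num_infer` indices.
    """
    if num_train + num_infer > num_visible:
        raise ValueError("training + inference GPUs exceed the visible GPUs")
    if tensor_parallel_size < 1 or num_infer % tensor_parallel_size != 0:
        raise ValueError("inference GPUs must be a multiple of tensor_parallel_size")

    devices = list(range(num_visible))
    train = devices[:num_train]
    infer = devices[num_train:num_train + num_infer]
    # Each group of `tensor_parallel_size` devices would hold one sharded model replica.
    groups = [infer[i:i + tensor_parallel_size]
              for i in range(0, len(infer), tensor_parallel_size)]
    return GpuPlan(train_devices=train, inference_groups=groups)


# The configuration from the example: 8 visible, 6 training, 2 inference, TP = 2.
plan = plan_gpus(num_visible=8, num_train=6, num_infer=2, tensor_parallel_size=2)
print(plan.train_devices)     # devices reserved for training
print(plan.inference_groups)  # a single tensor-parallel group spanning two GPUs
```

Under these assumptions the two inference GPUs form one tensor-parallel group, so the model weights are sharded across both devices instead of having to fit on a single one.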

@hjh0119 hjh0119 added the enhancement New feature or request label Mar 28, 2025
@hjh0119 hjh0119 closed this as completed Apr 22, 2025

2 participants