Has sequence_parallel been dropped in version 3.4? #4043

Closed
leileilin opened this issue Apr 29, 2025 · 4 comments

Comments

@leileilin

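In version 3.4, how can I run SFT on a large model with very long text without running out of memory (OOM)? Are there sample parameters I can refer to?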

@Jintao-Huang
Collaborator

@leileilin
Author

> https://github.com/modelscope/ms-swift/blob/main/examples/train/long_text/zero3.sh

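Thank you very much for your reply. So the sequence_parallel_size parameter has been dropped in the latest version of the swift framework, and use_liger_kernel is used instead? Will using the Liger kernel slow down training?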

@Jintao-Huang
Collaborator

Yes, it has been dropped. No, it will not slow down training.

@leileilin
Author

> Yes, it has been dropped. No, it will not slow down training.

It's great that it won't slow down training! So Liger can be added to all everyday SFT and RLHF training to reduce GPU memory usage, right?
