On version 3.4, how can I SFT a large model on very long texts without running OOM? Are there sample parameters I can reference?
https://github.com/modelscope/ms-swift/blob/main/examples/train/long_text/zero3.sh
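For orientation, a long-text SFT launch along these lines is what such a script typically contains: ZeRO-3 shards parameter, gradient, and optimizer state across GPUs, while the Liger kernel fuses memory-heavy ops to cut activation memory on long sequences. This is a minimal sketch, not the maintained example; the model and dataset values are placeholders, and flag names such as `--use_liger_kernel` and `--deepspeed zero3` should be verified against the linked zero3.sh.

```shell
# Sketch of a long-text SFT run under ZeRO-3 with the Liger kernel enabled.
# Assumes 8 GPUs; model/dataset are placeholders, flags per the linked example.
NPROC_PER_NODE=8 \
swift sft \
    --model Qwen/Qwen2.5-7B-Instruct \
    --train_type full \
    --dataset <your-long-text-dataset> \
    --torch_dtype bfloat16 \
    --max_length 65536 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --gradient_checkpointing true \
    --deepspeed zero3 \
    --use_liger_kernel true
```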
Thank you very much for the reply. So the sequence_parallel_size parameter has been dropped in the latest version of the swift framework, replaced by use_liger_kernel? Does using the Liger kernel slow down training?
Yes, it has been dropped. It does not slow down training.
No slowdown is great news! So all day-to-day SFT and RLHF training can add Liger to reduce GPU memory usage, right?
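If the flag does carry over to RLHF, one would expect usage like the following. This is a hypothetical sketch: whether `swift rlhf` accepts `--use_liger_kernel` the same way `swift sft` does should be confirmed against the ms-swift documentation, and the placeholders are not real values.

```shell
# Hypothetical: appending the same Liger flag to an RLHF (DPO) run,
# assuming swift rlhf shares training arguments with swift sft.
swift rlhf \
    --rlhf_type dpo \
    --model <your-model> \
    --dataset <your-preference-dataset> \
    --deepspeed zero3 \
    --use_liger_kernel true
```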