Strange out-of-memory error #3964
Comments
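I've run into the same problem. Did you manage to solve it?
No, it's not solved.
@Jintao-Huang Could you please take a look? I found that the host-memory OOM only happens during packing.
Is it a multimodal model? Check whether you have --streaming true set.
It's not a multimodal model, and I'm not using streaming loading. The OOM is raised when packing is almost finished.
GPU memory or host memory (RAM)?
Host memory OOM.
Add --streaming true.
@Jintao-Huang When I checked the training run this morning, it had failed again after a little over 200 steps. The problem is still not solved.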
CUDA_VISIBLE_DEVICES=0,1,2
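The code is unchanged. Training runs normally with a smaller dataset, but with a larger dataset it fails with an out-of-memory error. The two datasets are identical apart from their size. Why?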
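
For readers wondering why dataset size alone could matter here: if packing is done over a fully materialized dataset, host RAM use grows with the number of samples, whereas a streaming approach only holds the pack currently being built. The following is a minimal, generic sketch of that idea and is not the project's actual packing code; `pack_streaming`, `fake_dataset`, and `max_len` are hypothetical names chosen for illustration, and the greedy concatenation strategy is an assumption.

```python
from typing import Iterable, Iterator, List


def pack_streaming(samples: Iterable[List[int]], max_len: int) -> Iterator[List[int]]:
    """Greedily concatenate token lists into packs of at most max_len tokens.

    Samples are consumed lazily, and only the pack under construction is kept
    in memory, so the packer's own memory use is bounded by max_len rather
    than by the dataset size.
    """
    current: List[int] = []
    for tokens in samples:
        tokens = tokens[:max_len]  # simplification: truncate oversized samples
        if current and len(current) + len(tokens) > max_len:
            yield current
            current = []  # start a fresh list; the yielded pack is not mutated
        current.extend(tokens)
    if current:
        yield current


def fake_dataset(n: int) -> Iterator[List[int]]:
    """Hypothetical stand-in for a tokenized dataset read lazily from disk."""
    for i in range(n):
        yield [i] * (i % 5 + 1)


# Consuming the generator one pack at a time keeps memory flat; calling
# list(...) on it, as done here for a tiny demo, would materialize everything.
packs = list(pack_streaming(fake_dataset(10), max_len=8))
```

In a real training loop the packs would be consumed one by one rather than collected into a list, which is the behaviour that a streaming option is generally meant to provide.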