[Question]: Uneven GPU memory usage in UIE-X distributed training #6604

Closed
datuizhuang opened this issue Aug 3, 2023 · 1 comment
Labels: question (Further information is requested), triage

Comments

@datuizhuang
Please describe your question

Hi, when running distributed training of UIE-X as described in the README, the GPU memory usage of the two cards is inconsistent: GPU 0 uses far more memory than GPU 1.
With batch_size = 2 and max_seq_len = 512, GPU 0 already uses about 8700 MB during the pretrained-model loading stage, while GPU 1 uses only about 4000 MB at that point.
During training, GPU 0 occupies about 13000 MB and GPU 1 about 8800 MB.
How can this problem be resolved?
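A common cause of this pattern (not confirmed in this thread) is that every worker process initializes its CUDA context, or loads checkpoint tensors, on GPU 0 before being bound to its own device. A minimal sketch of the usual remedy, assuming a launcher such as `paddle.distributed.launch` that exports a `LOCAL_RANK` environment variable per process: derive the target device from the rank before building the model, so no worker touches GPU 0 by default.

```python
import os

def select_device(local_rank_env: str = "LOCAL_RANK", default: str = "0") -> str:
    """Map each distributed worker to its own GPU index.

    Distributed launchers typically export LOCAL_RANK per process;
    binding the device from it *before* creating the model keeps
    extra workers from allocating memory on GPU 0.
    """
    local_rank = int(os.environ.get(local_rank_env, default))
    return f"gpu:{local_rank}"

# Example: the worker launched with LOCAL_RANK=1 should target gpu:1,
# e.g. via paddle.set_device(select_device()) before model creation.
os.environ["LOCAL_RANK"] = "1"
print(select_device())  # -> gpu:1
```

The device string and the `LOCAL_RANK` variable name above are assumptions about the launch setup; checking `nvidia-smi` for duplicate processes on GPU 0 is a quick way to confirm whether this is the actual cause.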

@datuizhuang datuizhuang added the question Further information is requested label Aug 3, 2023
@github-actions github-actions bot added the triage label Aug 3, 2023
@w5688414
Contributor

w5688414 commented May 7, 2024

Could you share your paddle and paddlenlp environment (versions), and which GPU model you are using?

@paddle-bot paddle-bot bot closed this as completed May 13, 2025
3 participants