
model.cuda() cannot be called inside Template _encode #4158


Open
felixfuu opened this issue May 9, 2025 · 1 comment
Labels
bug Something isn't working

Comments

felixfuu commented May 9, 2025

model.cuda() cannot be called inside the Dataloader; it raises an error.
For example: https://github.com/modelscope/ms-swift/blob/main/swift/llm/template/template/emu3.py#L51
Error: RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' method

So this script is broken:
https://github.com/modelscope/ms-swift/blob/main/examples/train/all_to_all/train.sh
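The root cause is not specific to ms-swift: on Linux, PyTorch DataLoader workers are created with the fork start method by default, and a forked child inherits the parent's already-initialized CUDA context, which CUDA forbids re-initializing. A minimal stdlib sketch of the inheritance behavior (using a plain boolean as an illustrative stand-in for the CUDA context, so it runs without a GPU):

```python
import multiprocessing as mp

# Illustrative stand-in for a CUDA context. A forked worker inherits
# the parent's already-initialized state, which is exactly what CUDA
# refuses to work with; a 'spawn' worker would start a fresh
# interpreter and see False instead.
_context_initialized = False

def init_context():
    global _context_initialized
    _context_initialized = True

def _report(queue):
    # Runs in the child process: report whether the parent's state
    # was inherited.
    queue.put(_context_initialized)

def child_inherits_context(start_method="fork"):
    ctx = mp.get_context(start_method)
    queue = ctx.Queue()
    proc = ctx.Process(target=_report, args=(queue,))
    proc.start()
    proc.join()
    return queue.get()
```

After `init_context()`, `child_inherits_context("fork")` returns True: the forked DataLoader worker shares the parent's state, so any model.cuda() call inside _encode running in that worker hits the RuntimeError above.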

Jintao-Huang added the bug label May 10, 2025
felixfuu (Author)
@Jintao-Huang Moving self.processor.vision_tokenizer.encode into _post_encode should fix it.
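The suggested fix follows a general pattern: keep _encode (which runs inside forked DataLoader workers) CPU-only, and defer any GPU-dependent step to _post_encode, which runs in the main process where CUDA is already initialized. A hypothetical sketch of that split; the _encode/_post_encode names mirror the issue, but the class shape, signatures, and vision_tokenizer are illustrative stand-ins, not ms-swift's actual API:

```python
# Hypothetical sketch of the proposed fix, not ms-swift's real Template.
class Template:
    def __init__(self, vision_tokenizer):
        # Illustrative stand-in for self.processor.vision_tokenizer.encode
        self.vision_tokenizer = vision_tokenizer

    def _encode(self, sample):
        # Runs inside forked DataLoader workers: keep it CPU-only.
        # Do NOT call model.cuda() or run GPU inference here.
        return {"raw_image": sample["image"]}

    def _post_encode(self, batch):
        # Runs in the main process after the DataLoader returns the
        # batch, so using CUDA here is safe.
        batch["image_tokens"] = self.vision_tokenizer(batch["raw_image"])
        return batch
```

With this split the worker only ships raw data across process boundaries, and the tokenizer's GPU work happens once, in the process that owns the CUDA context.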

felixfuu reopened this May 12, 2025