
model.cuda() cannot be called inside Template _encode #4158


Open
felixfuu opened this issue May 9, 2025 · 1 comment
Labels
bug Something isn't working

Comments

felixfuu commented May 9, 2025

model.cuda() cannot be called inside the Dataloader; it raises an error.
For example: https://github.com/modelscope/ms-swift/blob/main/swift/llm/template/template/emu3.py#L51
Error: RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' method

So this script is broken:
https://github.com/modelscope/ms-swift/blob/main/examples/train/all_to_all/train.sh
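The root cause is not specific to ms-swift: on Linux, PyTorch DataLoader workers are created with the fork start method by default, and a forked child inherits the parent's already-initialized CUDA context, which CUDA forbids re-initializing. A minimal stdlib sketch of the inheritance behavior (using a plain boolean as an illustrative stand-in for the CUDA context, so it runs without a GPU):

```python
import multiprocessing as mp

# Illustrative stand-in for a CUDA context. A forked worker inherits
# the parent's already-initialized state, which is exactly what CUDA
# refuses to work with; a 'spawn' worker would start a fresh
# interpreter and see False instead.
_context_initialized = False

def init_context():
    global _context_initialized
    _context_initialized = True

def _report(queue):
    # Runs in the child process: report whether the parent's state
    # was inherited.
    queue.put(_context_initialized)

def child_inherits_context(start_method="fork"):
    ctx = mp.get_context(start_method)
    queue = ctx.Queue()
    proc = ctx.Process(target=_report, args=(queue,))
    proc.start()
    proc.join()
    return queue.get()
```

After `init_context()`, `child_inherits_context("fork")` returns True: the forked DataLoader worker shares the parent's state, so any model.cuda() call inside _encode running in that worker hits the RuntimeError above.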

Jintao-Huang added the bug label May 10, 2025
felixfuu (Author)
@Jintao-Huang Moving self.processor.vision_tokenizer.encode into _post_encode should fix it.
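The suggested fix follows a general pattern: keep _encode (which runs inside forked DataLoader workers) CPU-only, and defer any GPU-dependent step to _post_encode, which runs in the main process where CUDA is already initialized. A hypothetical sketch of that split; the _encode/_post_encode names mirror the issue, but the class shape, signatures, and vision_tokenizer are illustrative stand-ins, not ms-swift's actual API:

```python
# Hypothetical sketch of the proposed fix, not ms-swift's real Template.
class Template:
    def __init__(self, vision_tokenizer):
        # Illustrative stand-in for self.processor.vision_tokenizer.encode
        self.vision_tokenizer = vision_tokenizer

    def _encode(self, sample):
        # Runs inside forked DataLoader workers: keep it CPU-only.
        # Do NOT call model.cuda() or run GPU inference here.
        return {"raw_image": sample["image"]}

    def _post_encode(self, batch):
        # Runs in the main process after the DataLoader returns the
        # batch, so using CUDA here is safe.
        batch["image_tokens"] = self.vision_tokenizer(batch["raw_image"])
        return batch
```

With this split the worker only ships raw data across process boundaries, and the tokenizer's GPU work happens once, in the process that owns the CUDA context.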

felixfuu reopened this May 12, 2025