Fix device mismatch issue in #1071 #1073

Merged
merged 2 commits into from
Sep 30, 2022
hudeven (Contributor) commented Sep 29, 2022

The mismatch occurs because the image and the model are on the GPU while the target is still on the CPU. Since the device is already determined in main(), we don't have to determine it again in train(). Instead, we can pass device to train() and move the data to the same device as the model.
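To illustrate the idea, here is a minimal sketch of the pattern, not the actual diff from this PR. The tiny `nn.Linear` model, the list-based dummy loader, and the exact `train()` signature are assumptions made so the example runs on its own; only the approach (choose the device once in `main()`, pass it to `train()`, and move both images and target to it) comes from the description above.

```python
import torch
import torch.nn as nn


def train(loader, model, criterion, optimizer, device):
    """Run one pass over loader, keeping every batch on the given device."""
    model.train()
    last_loss = None
    for images, target in loader:
        # Move both the inputs and the labels. Moving only the images
        # leaves the target on the CPU and causes the device mismatch.
        images = images.to(device, non_blocking=True)
        target = target.to(device, non_blocking=True)

        output = model(images)
        loss = criterion(output, target)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        last_loss = loss.item()
    return last_loss


def main():
    # The device is chosen once here and handed to train().
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = nn.Linear(8, 4).to(device)  # hypothetical stand-in for resnet50
    criterion = nn.CrossEntropyLoss().to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    # Dummy batches created on the CPU, as a data loader would yield them.
    loader = [(torch.randn(16, 8), torch.randint(0, 4, (16,))) for _ in range(3)]
    return train(loader, model, criterion, optimizer, device)


if __name__ == "__main__":
    print(f"final loss: {main():.4f}")
```

Passing `device` as an argument keeps the device decision in one place, so `train()` cannot drift out of sync with where the model was placed.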

Test plan:

python main.py -a resnet50 --dist-url tcp://127.0.0.1:1234 --dist-backend nccl --multiprocessing-distributed --world-size 1 --rank 0 --epochs 3 --batch-size 256 -j64 --dummy

netlify bot commented Sep 29, 2022

Deploy Preview for pytorch-examples-preview canceled.

Name Link
🔨 Latest commit c6f817a
🔍 Latest deploy log https://app.netlify.com/sites/pytorch-examples-preview/deploys/633603f95df3d40009472b9e

@hudeven hudeven merged commit f5bb60f into pytorch:main Sep 30, 2022
YinZhengxun pushed a commit to YinZhengxun/mt-exercise-02 that referenced this pull request Mar 30, 2025