
Conversation

bghira (Contributor) commented Mar 23, 2024

What does this PR do?

Fixes #7426

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

bghira force-pushed the bugfix/mps-inference-xl branch from ec67986 to 721dd57 on March 23, 2024 21:28
bghira (Contributor, Author) commented Mar 23, 2024

@sayakpaul I included the other pipelines just to keep them consistent, even though inference does not normally hit this problem. It is probably nice to fix it for the edge cases where it would.

# compute the previous noisy sample x_t -> x_t-1
old_dtype = latents.dtype
latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs, return_dict=False)[0]
if latents.dtype != old_dtype:
Member:
Should this be guarded with torch.backends.mps.is_available() as well? It really seems to be happening only when mps is picked up, no?

Member:

(nit: I'd maybe call it latents_dtype; old sounds almost like it'd be ok if it changes).

Member:

> Should this be guarded with torch.backends.mps.is_available()

In my opinion, it's ok the way it is as the comment already mentions mps, but no strong opinion.

bghira (Contributor, Author):

@sayakpaul @pcuenca It's about the outcome: if any other accelerator behaves this way, would you rather it crash out of the box so that an issue is filed, or should we just invisibly fix it when we find it?

Member:

How about throwing a warning so that the users are at least aware of it?
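
(Purely to illustrate this option, not code from the PR: the check could log instead of fixing things silently. A minimal, self-contained sketch using the standard library logger; the helper name is made up here.)

```python
import logging

import torch

logger = logging.getLogger(__name__)


def warn_if_dtype_changed(latents: torch.Tensor, latents_dtype: torch.dtype) -> torch.Tensor:
    # hypothetical helper: warn when a scheduler step silently changed the latents
    # dtype, then cast back so the rest of the denoising loop keeps working
    if latents.dtype != latents_dtype:
        logger.warning(
            "Scheduler step changed latents dtype from %s to %s; casting back. "
            "This is a known issue on mps due to an upstream pytorch bug.",
            latents_dtype,
            latents.dtype,
        )
        latents = latents.to(latents_dtype)
    return latents
```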

bghira (Contributor, Author):

Or raise a specific error asking them to file a report?

bghira (Contributor, Author):

I modified the code so that it only executes on mps; after further consideration, I figured it would be better to have full visibility into any other platform's dtype issues.

The current error, however, is pretty vague: it says two types are broadcast-incompatible, which isn't very helpful to a new user.

Member:

Yeah raising an error sounds like a better idea. @pcuenca WDYT?
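
(For context, a minimal sketch of the direction settled on here: restore the dtype only on mps, and fail loudly with a clearer message elsewhere so the issue gets reported. This illustrates the discussion, not necessarily the exact code merged; the helper name and the wording of the error are made up, and in the PR the check sits inline in the denoising loop rather than in a separate function.)

```python
import torch


def restore_latents_dtype(latents: torch.Tensor, latents_dtype: torch.dtype) -> torch.Tensor:
    # hypothetical helper mirroring the inline check discussed above
    if latents.dtype == latents_dtype:
        return latents
    if torch.backends.mps.is_available():
        # known upstream pytorch bug on mps: the scheduler step can return latents
        # in a different dtype, so quietly cast them back before the next step
        return latents.to(latents_dtype)
    # on any other accelerator, surface the problem instead of hiding it, with a
    # clearer message than the broadcast error the user would otherwise hit later
    raise RuntimeError(
        f"Scheduler step changed latents dtype from {latents_dtype} to {latents.dtype} "
        "on a non-mps device; please open an issue with your accelerator details."
    )
```

Guarding on torch.backends.mps.is_available() keeps other accelerators' dtype issues visible so they get reported rather than silently patched.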

sayakpaul (Member) left a comment:

To me the changes seem realistic and minimal enough to enable training on a different accelerator. So, I would be in favor of supporting this.

@yiyixuxu @pcuenca WDYT?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

pcuenca (Member) left a comment:

Agree to add these workarounds to unblock use on mps. Thanks @bghira!

bghira force-pushed the bugfix/mps-inference-xl branch from 0a76491 to bc7c7e8 on March 24, 2024 13:38
bghira force-pushed the bugfix/mps-inference-xl branch from bc7c7e8 to bd2a802 on March 24, 2024 13:39
sayakpaul requested a review from yiyixuxu on March 24, 2024 15:11
sayakpaul (Member):
@yiyixuxu could you also give this a look?

yiyixuxu (Collaborator) left a comment:

thanks!

sayakpaul (Member):
@bghira okay if I apply the safe-guarding logic to the rest of the scripts and prep the PR for merging?

bghira (Contributor, Author) commented Mar 26, 2024

yup

sayakpaul (Member):
@bghira give this a final look and I will merge then?

bghira (Contributor, Author) commented Mar 26, 2024

nice, lgtm

sayakpaul (Member):
Thanks for your contributions, @bghira!

sayakpaul merged commit 544710e into huggingface:main on Mar 26, 2024
bghira deleted the bugfix/mps-inference-xl branch on March 26, 2024 16:25
AbhinavGopal pushed a commit to AbhinavGopal/diffusers that referenced this pull request Mar 27, 2024
…hift unexpectedly due to pytorch bugs (huggingface#7446)

* mps: fix XL pipeline inference at training time due to upstream pytorch bug

* Update src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py

Co-authored-by: Sayak Paul <[email protected]>

* apply the safe-guarding logic elsewhere.

---------

Co-authored-by: bghira <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>
sayakpaul added a commit that referenced this pull request Dec 23, 2024


Development

Successfully merging this pull request may close these issues.

[MPS] SDXL pipeline fails inference in fp16 mode

5 participants