Support for manual CLIP loading in StableDiffusionPipeline - txt2img. #3832


Merged: 6 commits into huggingface:main on Jun 28, 2023

Conversation

@WadRex (Contributor) commented Jun 20, 2023:

What does this PR do?

Fixes #3822

This pull request improves how the CLIP model is loaded when used with StableDiffusionPipeline.from_ckpt(); it affects only the txt2img part of that pipeline. Users can now load the CLIP model and tokenizer manually, bypassing the forced loading behavior. Previously, the CLIP model and tokenizer always ended up in the Hugging Face cache, which made a fully portable setup impossible; this PR resolves that.
With this enhancement, users can now specify their desired CLIP model and tokenizer location as follows:

# Users can either choose the official Hugging Face repository `openai/clip-vit-large-patch14`
# or provide a local path to load CLIP from.
from transformers import CLIPTextModel, CLIPTokenizer

from diffusers import StableDiffusionPipeline

clip_text_model = CLIPTextModel.from_pretrained("repo/id/or/path/to/local/folder")
clip_tokenizer = CLIPTokenizer.from_pretrained("repo/id/or/path/to/local/folder")

# These instances are then passed to `StableDiffusionPipeline.from_ckpt`.
pipeline = StableDiffusionPipeline.from_ckpt("path/to/single/safetensors/or/bin/file",
                                             clip_text_model=clip_text_model,
                                             clip_tokenizer=clip_tokenizer)

Note that this pull request does not change behavior when users provide no clip_text_model or clip_tokenizer parameters; in that case, the code behaves exactly as it did before the PR.
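As a rough illustration of the fallback behavior described above, the logic amounts to the sketch below. The helper names (`load_default_clip`, `build_pipeline`) are hypothetical stand-ins, not the actual diffusers internals; real code would call `CLIPTextModel.from_pretrained` / `CLIPTokenizer.from_pretrained` where the stand-in returns a tag string.

```python
DEFAULT_CLIP_REPO = "openai/clip-vit-large-patch14"

def load_default_clip(repo_id):
    # Stand-in for CLIPTextModel/CLIPTokenizer.from_pretrained(repo_id);
    # returns a tag string so the sketch stays runnable offline.
    return f"clip-from:{repo_id}"

def build_pipeline(checkpoint_path, clip_text_model=None, clip_tokenizer=None):
    # Force-load CLIP only when the caller did not supply it, mirroring
    # the PR's rule that omitting both parameters behaves as before.
    if clip_text_model is None:
        clip_text_model = load_default_clip(DEFAULT_CLIP_REPO)
    if clip_tokenizer is None:
        clip_tokenizer = load_default_clip(DEFAULT_CLIP_REPO)
    return {"ckpt": checkpoint_path,
            "text_model": clip_text_model,
            "tokenizer": clip_tokenizer}
```

With no CLIP arguments, the default repository is loaded; with user-supplied instances, they are used as-is.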

Who can review?

@patrickvonplaten
@sayakpaul

@patrickvonplaten (Contributor) left a comment:

Looks great to me! @sayakpaul wdyt?

@HuggingFaceDocBuilderDev commented Jun 21, 2023:

The documentation is not available anymore as the PR was closed or merged.

Comment on lines 1342 to 1345
clip_text_model (`transformers.models.clip.modeling_clip.CLIPTextModel`, *optional*, defaults to `None`):
An instance of `CLIPTextModel` to use. If this parameter is `None`, the function will load a new instance of `CLIPTextModel`, if needed.
clip_tokenizer (`transformers.models.clip.tokenization_clip.CLIPTokenizer`, *optional*, defaults to `None`):
An instance of `CLIPTokenizer` to use. If this parameter is `None`, the function will load a new instance of `CLIPTokenizer`, if needed.
A reviewer (Member) commented:

Could we maybe follow something similar for the docstrings here?
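For illustration only, a docstring following the indented argument style used elsewhere in diffusers might look like the sketch below; the function name is a placeholder, not part of the actual API.

```python
def from_ckpt_sketch(clip_text_model=None, clip_tokenizer=None):
    """Placeholder function used only to illustrate the docstring style.

    Args:
        clip_text_model (`CLIPTextModel`, *optional*, defaults to `None`):
            An instance of `CLIPTextModel` to use. If `None`, a new
            instance is loaded when needed.
        clip_tokenizer (`CLIPTokenizer`, *optional*, defaults to `None`):
            An instance of `CLIPTokenizer` to use. If `None`, a new
            instance is loaded when needed.
    """
    return clip_text_model, clip_tokenizer
```

The key convention is the short type line followed by an indented description, rather than a single long line per argument.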

Comment on lines 1076 to 1079
clip_text_model (`transformers.models.clip.modeling_clip.CLIPTextModel`, *optional*, defaults to `None`):
An instance of `CLIPTextModel` to use. If this parameter is `None`, the function will load a new instance of `CLIPTextModel`, if needed.
clip_tokenizer (`transformers.models.clip.tokenization_clip.CLIPTokenizer`, *optional*, defaults to `None`):
An instance of `CLIPTokenizer` to use. If this parameter is `None`, the function will load a new instance of `CLIPTokenizer`, if needed.
A reviewer (Member) commented:

Same as above.

@sayakpaul (Member) left a comment:

Thanks a lot for this important PR! Left some nits.

@sayakpaul (Member) commented:

Regarding the failing test here, could you run `make style && make quality` from your environment? More info is available at https://github.com/huggingface/diffusers/blob/main/CONTRIBUTING.md

@WadRex (Contributor, author) commented Jun 24, 2023:

@patrickvonplaten
@sayakpaul
Are there any more requirements or changes needed to successfully merge this PR?

@patrickvonplaten patrickvonplaten merged commit 1500130 into huggingface:main Jun 28, 2023
@WadRex WadRex deleted the fix-clip-forceload-txt2img branch July 8, 2023 19:45
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
…huggingface#3832)

* Support for manual CLIP loading in StableDiffusionPipeline - txt2img.

* Update src/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py

* Update variables & according docs to match previous style.

* Updated to match style & quality of 'diffusers'

---------

Co-authored-by: Patrick von Platen <[email protected]>
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
…huggingface#3832)

* Support for manual CLIP loading in StableDiffusionPipeline - txt2img.

* Update src/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py

* Update variables & according docs to match previous style.

* Updated to match style & quality of 'diffusers'

---------

Co-authored-by: Patrick von Platen <[email protected]>
Successfully merging this pull request may close these issues.

StableDiffusionPipeline() and CLIP "cooperation"
4 participants