[SD-XL] Add new pipelines #3859
Conversation
The documentation is not available anymore as the PR was closed or merged.
        if self.config.timestep_spacing == "linspace":
            timesteps = np.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps, dtype=float)[::-1].copy()
        elif self.config.timestep_spacing == "leading":
            step_ratio = self.config.num_train_timesteps // self.num_inference_steps
This new spacing doesn't give drastically better results, but better results nevertheless IMO. It's also needed to get 1-to-1 the same results as the original code.
Does the original code (XL) use this new spacing scheme, though?
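For reference, a minimal sketch of how the two spacing modes differ, using standalone values (`num_train_timesteps=1000`, `num_inference_steps=50`) instead of the scheduler config; the `"leading"` branch shown here follows the usual `arange`-with-stride pattern and omits any possible `steps_offset`:

```python
import numpy as np

num_train_timesteps = 1000
num_inference_steps = 50

# "linspace": spread the inference timesteps evenly over [0, num_train_timesteps - 1], newest first
linspace_ts = np.linspace(0, num_train_timesteps - 1, num_inference_steps, dtype=float)[::-1].copy()

# "leading": walk the training schedule with a fixed integer stride, newest first
step_ratio = num_train_timesteps // num_inference_steps
leading_ts = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(float)

print(linspace_ts[:3])  # [999.       978.612... 958.224...]
print(leading_ts[:3])   # [980. 960. 940.]
```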
        num_transformer_blocks (`int` or `Tuple[int]`, *optional*, defaults to 1):
            The number of transformer blocks of type [`~models.attention.BasicTransformerBlock`]. Only relevant for [`~models.unet_2d_blocks.CrossAttnDownBlock2D`], [`~models.unet_2d_blocks.CrossAttnUpBlock2D`], [`~models.unet_2d_blocks.UNetMidBlock2DCrossAttn`].
So a Transformer block can be a UNet block? I don't find the `num_transformer_blocks` name to be a good one to encompass all the blocks we're supporting here. But I can't think of a better one, either. So, okay to ignore I guess.
Yeah, good point, maybe `transformer_layers_per_block` is better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better for sure.
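For illustration, a sketch of how the renamed argument could be passed when constructing a small UNet; the concrete shapes and dimensions below are made up for the example and are not taken from the PR:

```python
from diffusers import UNet2DConditionModel

# Hypothetical config: deeper transformer stacks in the lower-resolution blocks,
# one entry per down block (the entry is ignored for blocks without cross-attention).
unet = UNet2DConditionModel(
    sample_size=64,
    block_out_channels=(128, 256, 512),
    down_block_types=("DownBlock2D", "CrossAttnDownBlock2D", "CrossAttnDownBlock2D"),
    up_block_types=("CrossAttnUpBlock2D", "CrossAttnUpBlock2D", "UpBlock2D"),
    transformer_layers_per_block=(1, 2, 4),
    cross_attention_dim=1024,
)
```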
def convert_open_clip_checkpoint(checkpoint):
    text_model = CLIPTextModel.from_pretrained("stabilityai/stable-diffusion-2", subfolder="text_encoder")
def convert_open_clip_checkpoint(checkpoint, prefix="cond_stage_model.model."):
    # text_model = CLIPTextModel.from_pretrained("stabilityai/stable-diffusion-2", subfolder="text_encoder")
Are we not affecting the SD 2 conversion process with this one?
Need to double check!
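A sketch of what the new `prefix` parameter enables: filtering and re-keying only the text-encoder weights out of a full checkpoint, so the same routine can serve both SD 2.x and SD-XL checkpoints. The SD-XL prefix below is an assumption for illustration, not taken from this diff:

```python
def _strip_text_encoder_prefix(state_dict, prefix="cond_stage_model.model."):
    # Keep only the keys belonging to the OpenCLIP text encoder and drop the prefix,
    # leaving key names the downstream conversion logic already understands.
    return {k[len(prefix):]: v for k, v in state_dict.items() if k.startswith(prefix)}

# SD 2.x checkpoints use the default "cond_stage_model.model." prefix; an SD-XL checkpoint
# would pass something like prefix="conditioner.embedders.1.model." instead.
```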
    num_train_timesteps = original_config.model.params.timesteps or 1000
    beta_start = original_config.model.params.linear_start or 0.02
    beta_end = original_config.model.params.linear_end or 0.085
Where are these numbers coming from? I'd make a note for our future reference.
Ah, this is hacky for now and shouldn't be this way.
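For future reference, the classic SD configs set `linear_start=0.00085` and `linear_end=0.012`; a sketch of making the fallbacks explicit (not the PR's code, and the values are noted only as the usual defaults):

```python
# Fall back to the standard Stable Diffusion schedule values when the original
# config does not specify them, instead of relying on `or` with unexplained numbers.
params = original_config.model.params
num_train_timesteps = params.timesteps if "timesteps" in params else 1000
beta_start = params.linear_start if "linear_start" in params else 0.00085
beta_end = params.linear_end if "linear_end" in params else 0.012
```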
        text_encoder_lora_scale = (
            cross_attention_kwargs.get("scale", None) if cross_attention_kwargs is not None else None
        )
        (
Four tensors are returned instead of just one. The first two are the normal positive and negative prompt embeddings that are passed into cross-attention. The last two "pooled" embeds are used to additionally condition the time embedding.
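A sketch of the unpacking being described; the names are illustrative rather than the exact signature in this PR:

```python
(
    prompt_embeds,                  # per-token embeddings for the positive prompt -> cross-attention
    negative_prompt_embeds,         # per-token embeddings for the negative prompt -> cross-attention
    pooled_prompt_embeds,           # pooled positive embedding -> added to the time embedding
    negative_pooled_prompt_embeds,  # pooled negative embedding -> added to the time embedding
) = self.encode_prompt(
    prompt,
    device,
    num_images_per_prompt,
    do_classifier_free_guidance,
    negative_prompt,
    lora_scale=text_encoder_lora_scale,
)
```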
Fix embeddings for classic SD models.
@@ -107,6 +107,13 @@ class EulerDiscreteScheduler(SchedulerMixin, ConfigMixin):
            This parameter controls whether to use Karras sigmas (Karras et al. (2022) scheme) for step sizes in the
            noise schedule during the sampling process. If True, the sigmas will be determined according to a sequence
            of noise levels {σi} as defined in Equation (5) of the paper https://arxiv.org/pdf/2206.00364.pdf.
        timestep_spacing (`str`, default `"linspace"`):
These changes should also work well for other schedulers.
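As a usage sketch, the flag would presumably be switched the same way other scheduler options are, via the config; the repo id is the one from this PR's description and may require access to the 0.9 weights:

```python
from diffusers import EulerDiscreteScheduler

# Load the scheduler config shipped with the model and override the spacing mode.
scheduler = EulerDiscreteScheduler.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-0.9",
    subfolder="scheduler",
    timestep_spacing="leading",
)
scheduler.set_timesteps(num_inference_steps=30)
print(scheduler.timesteps[:5])
```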
src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_img2img.py
            A kwargs dictionary that if specified is passed along to the `AttentionProcessor` as defined under
            `self.processor` in
            [diffusers.cross_attention](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/cross_attention.py).
        guidance_rescale (`float`, *optional*, defaults to 0.7):
defaults to 0.0*
Ah yes, we should probably fix this in a follow-up PR! Sorry, just noticed the comment here. Would you like to open a PR here maybe @bghira? :-)
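Whatever the documented default ends up being, the value can be set explicitly per call; a sketch, assuming `pipe` is an SD-XL pipeline as in the usage examples further down:

```python
image = pipe(
    prompt="an astronaut riding a green horse",  # illustrative prompt
    guidance_scale=7.5,
    guidance_rescale=0.7,  # 0.0 disables the rescaling entirely
).images[0]
```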
This reverts commit 491bc9f.
He just merged the feature branch to
thanks
* Add new text encoder * add transformers depth * More * Correct conversion script * Fix more * Fix more * Correct more * correct text encoder * Finish all * proof that in works in run local xl * clean up * Get refiner to work * Add red castle * Fix batch size * Improve pipelines more * Finish text2image tests * Add img2img test * Fix more * fix import * Fix embeddings for classic models (huggingface#3888) Fix embeddings for classic SD models. * Allow multiple prompts to be passed to the refiner (huggingface#3895) * finish more * Apply suggestions from code review * add watermarker * Model offload (huggingface#3889) * Model offload. * Model offload for refiner / img2img * Hardcode encoder offload on img2img vae encode Saves some GPU RAM in img2img / refiner tasks so it remains below 8 GB. --------- Co-authored-by: Patrick von Platen <[email protected]> * correct * fix * clean print * Update install warning for `invisible-watermark` * add: missing docstrings. * fix and simplify the usage example in img2img. * fix setup for watermarking. * Revert "fix setup for watermarking." This reverts commit 491bc9f. * fix: watermarking setup. * fix: op. * run make fix-copies. * make sure tests pass * improve convert * make tests pass * make tests pass * better error message * fiinsh * finish * Fix final test --------- Co-authored-by: Pedro Cuenca <[email protected]> Co-authored-by: Sayak Paul <[email protected]>
Usage for `stabilityai/stable-diffusion-xl-base-0.9`:

In addition, make sure to install `transformers`, `safetensors`, and `accelerate`, as well as the invisible watermark. You can then use the model as follows.
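A minimal sketch of the base-model usage the description refers to (the prompt is just an example; the code assumes the `StableDiffusionXLPipeline` added in this PR):

```python
# First: pip install transformers safetensors accelerate invisible-watermark
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-0.9",
    torch_dtype=torch.float16,
    use_safetensors=True,
)
pipe.to("cuda")

prompt = "An astronaut riding a green horse"
image = pipe(prompt=prompt).images[0]
```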
When using `torch >= 2.0`, you can improve the inference speed by 20-30% with `torch.compile`. Simply wrap the UNet with `torch.compile` before running the pipeline. If you are limited by GPU VRAM, you can enable CPU offloading by calling `pipe.enable_model_cpu_offload` instead of `.to("cuda")`.
Usage for `stabilityai/stable-diffusion-xl-refiner-0.9`:

When using `torch >= 2.0`, you can improve the inference speed by 20-30% with `torch.compile`. Simply wrap the UNet with `torch.compile` before running the pipeline. If you are limited by GPU VRAM, you can enable CPU offloading by calling `pipe.enable_model_cpu_offload` instead of `.to("cuda")`.
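A minimal sketch of running the refiner via the img2img pipeline added in this PR, feeding it the output of the base model; the same `torch.compile` and CPU-offload options shown above apply here too:

```python
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline

refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-0.9",
    torch_dtype=torch.float16,
    use_safetensors=True,
)
refiner.to("cuda")

# `image` is a PIL image, e.g. the output of the base pipeline above.
refined = refiner(prompt=prompt, image=image).images[0]

# Performance options, as for the base pipeline:
# refiner.unet = torch.compile(refiner.unet, mode="reduce-overhead", fullgraph=True)
# refiner.enable_model_cpu_offload()
```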