Commit 84b82a6

Authored by kadirnar, sayakpaul, patrickvonplaten, and stevhliu
✨ [Core] Add FreeU mechanism (huggingface#5164)
* ✨ Added Fourier filter function to upsample blocks
* 🔧 Update Fourier_filter for float16 support
* ✨ Added UNetFreeUConfig to UNet model for FreeU adaptation 🛠️
* move unet to its original form and add fourier_filter to torch_utils.
* implement freeU enable mechanism
* implement disable mechanism
* resolution index.
* correct resolution idx condition.
* fix copies.
* no need to use resolution_idx in vae.
* spell out the kwargs
* proper config property
* fix attribution setting
* place unet hasattr properly.
* fix: attribute access.
* proper disable
* remove validation method.
* debug
* debug
* debug
* debug
* debug
* debug
* potential fix.
* add: doc.
* fix copies
* add: tests.
* add: support freeU in SDXL.
* set default value of resolution idx.
* set default values for resolution_idx.
* fix copies
* fix rest.
* fix copies
* address PR comments.
* run fix-copies
* move apply_free_u to utils and other minors.
* introduce support for video (unet3D)
* minor ups
* consistent fix-copies.
* consistent stuff
* fix-copies
* add: rest
* add: docs.
* fix: tests
* fix: doc path
* Apply suggestions from code review
  Co-authored-by: Steven Liu <[email protected]>
* style up
* move to techniques.
* add: slow test for sd freeu.
* add: slow test for sd freeu.
* add: slow test for sd freeu.
* add: slow test for sd freeu.
* add: slow test for sd freeu.
* add: slow test for sd freeu.
* add: slow test for video with freeu
* add: slow test for video with freeu
* add: slow test for video with freeu
* style

---------

Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
1 parent e46ec5f commit 84b82a6

File tree

14 files changed

+637
-17
lines changed

docs/source/en/_toctree.yml

Lines changed: 2 additions & 0 deletions
@@ -58,6 +58,8 @@
 title: Control image brightness
 - local: using-diffusers/weighted_prompts
 title: Prompt weighting
+- local: using-diffusers/freeu
+title: Improve generation quality with FreeU
 title: Techniques
 - sections:
 - local: using-diffusers/pipeline_overview
Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
# Improve generation quality with FreeU

[[open-in-colab]]

The UNet is responsible for denoising during the reverse diffusion process, and there are two distinct features in its architecture:

1. Backbone features primarily contribute to the denoising process
2. Skip features mainly introduce high-frequency features into the decoder module and can make the network overlook the semantics in the backbone features

However, the skip connection can sometimes introduce unnatural image details. [FreeU](https://hf.co/papers/2309.11497) is a technique for improving image quality by rebalancing the contributions from the UNet’s skip connections and backbone feature maps.

FreeU is applied during inference and it does not require any additional training. The technique works for different tasks such as text-to-image, image-to-image, and text-to-video.

In this guide, you will apply FreeU to the [`StableDiffusionPipeline`], [`StableDiffusionXLPipeline`], and [`TextToVideoSDPipeline`].

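
To make the rebalancing idea more concrete, here is a minimal NumPy sketch. It is not the Diffusers implementation. It assumes that the backbone factor is applied to the first half of the backbone channels and that the skip factor damps the lowest spatial frequencies of the skip features in the Fourier domain (the commit message mentions a Fourier filter). The function names `scale_low_frequencies` and `freeu_rebalance`, the `threshold` parameter, and the toy array shapes are all hypothetical and chosen only for illustration.

```python
import numpy as np


def scale_low_frequencies(skip, threshold, scale):
    """Multiply the lowest spatial frequencies of `skip` (shape [C, H, W]) by `scale`."""
    h, w = skip.shape[-2:]
    # Go to the frequency domain and move the zero frequency to the center.
    freq = np.fft.fftshift(np.fft.fft2(skip, axes=(-2, -1)), axes=(-2, -1))
    # The mask is 1 everywhere except a small square around the center.
    mask = np.ones((h, w))
    crow, ccol = h // 2, w // 2
    mask[crow - threshold : crow + threshold, ccol - threshold : ccol + threshold] = scale
    freq = freq * mask
    # Undo the shift and return to the spatial domain.
    out = np.fft.ifft2(np.fft.ifftshift(freq, axes=(-2, -1)), axes=(-2, -1))
    return out.real


def freeu_rebalance(backbone, skip, b, s, threshold=1):
    """Scale half of the backbone channels by `b`, damp the skip features by `s`, then concatenate."""
    backbone = backbone.copy()  # leave the caller's array untouched
    half = backbone.shape[0] // 2
    backbone[:half] *= b
    skip = scale_low_frequencies(skip, threshold, s)
    return np.concatenate([backbone, skip], axis=0)


# Toy feature maps standing in for one decoder stage (assumed shapes).
rng = np.random.default_rng(0)
backbone = rng.standard_normal((4, 8, 8))
skip = rng.standard_normal((4, 8, 8))
merged = freeu_rebalance(backbone, skip, b=1.2, s=0.9)
print(merged.shape)  # (8, 8, 8)
```

With `b=1.0` and `s=1.0` the sketch reduces to a plain concatenation, which is one way to see that the factors only rebalance the two inputs and add nothing that needs training.
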
## StableDiffusionPipeline

Load the pipeline:

```py
from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, safety_checker=None
).to("cuda")
```

Then enable the FreeU mechanism with the FreeU-specific hyperparameters. These values are scaling factors: `b1` and `b2` scale the backbone features, and `s1` and `s2` scale the skip features.

```py
pipeline.enable_freeu(s1=0.9, s2=0.2, b1=1.2, b2=1.4)
```

The values above are from the official FreeU [code repository](https://github.com/ChenyangSi/FreeU) where you can also find [reference hyperparameters](https://github.com/ChenyangSi/FreeU#range-for-more-parameters) for different models.

<Tip>

Disable the FreeU mechanism by calling `disable_freeu()` on a pipeline.

</Tip>

And then run inference:

```py
prompt = "A squirrel eating a burger"
seed = 2023
image = pipeline(prompt, generator=torch.manual_seed(seed)).images[0]
```

The figure below compares the results without and with FreeU for the same `prompt` and `seed` used above:

![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/freeu/sdv1_5_freeu.jpg)


Let's see how Stable Diffusion 2 results are impacted:

```py
from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16, safety_checker=None
).to("cuda")

prompt = "A squirrel eating a burger"
seed = 2023

pipeline.enable_freeu(s1=0.9, s2=0.2, b1=1.1, b2=1.2)
image = pipeline(prompt, generator=torch.manual_seed(seed)).images[0]
```


![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/freeu/sdv2_1_freeu.jpg)

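
A side-by-side comparison like the ones above needs the same prompt and seed run once without FreeU and once with it. The sketch below wraps that pattern using only the `enable_freeu()` and `disable_freeu()` pipeline methods shown in this guide. The helper names `freeu` and `compare_freeu` and the `make_generator` callback are hypothetical, and the helper assumes FreeU is not already enabled on the pipeline when it is called.

```python
from contextlib import contextmanager


@contextmanager
def freeu(pipeline, **freeu_kwargs):
    """Enable FreeU on `pipeline` for the duration of a `with` block (hypothetical helper)."""
    pipeline.enable_freeu(**freeu_kwargs)
    try:
        yield pipeline
    finally:
        # Always restore the baseline behavior, even if generation fails.
        pipeline.disable_freeu()


def compare_freeu(pipeline, make_generator, prompt, **freeu_kwargs):
    """Return (baseline_image, freeu_image) generated from identically seeded generators."""
    # `make_generator` is called once per run so both runs start from the same seed.
    baseline = pipeline(prompt, generator=make_generator()).images[0]
    with freeu(pipeline, **freeu_kwargs):
        with_freeu = pipeline(prompt, generator=make_generator()).images[0]
    return baseline, with_freeu
```

With the pipeline from this section, a call could look like `compare_freeu(pipeline, lambda: torch.manual_seed(2023), "A squirrel eating a burger", s1=0.9, s2=0.2, b1=1.1, b2=1.2)`.
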
## Stable Diffusion XL

Next, let's take a look at how FreeU affects Stable Diffusion XL results:

```py
from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16,
).to("cuda")

prompt = "A squirrel eating a burger"
seed = 2023

# Comes from
# https://wandb.ai/nasirk24/UNET-FreeU-SDXL/reports/FreeU-SDXL-Optimal-Parameters--Vmlldzo1NDg4NTUw
pipeline.enable_freeu(s1=0.6, s2=0.4, b1=1.1, b2=1.2)
image = pipeline(prompt, generator=torch.manual_seed(seed)).images[0]
```


![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/freeu/sdxl_freeu.jpg)

## Text-to-video generation

FreeU can also be used to improve video quality:

```python
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video
import torch

model_id = "cerspense/zeroscope_v2_576w"
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

prompt = "an astronaut riding a horse on mars"
seed = 2023

# The values come from
# https://github.com/lyn-rgb/FreeU_Diffusers#video-pipelines
pipe.enable_freeu(b1=1.2, b2=1.4, s1=0.9, s2=0.2)
video_frames = pipe(prompt, height=320, width=576, num_frames=30, generator=torch.manual_seed(seed)).frames
export_to_video(video_frames, "astronaut_rides_horse.mp4")
```
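
The scaling factors differ from checkpoint to checkpoint, so it can help to keep them in one place. The sketch below collects only the values used in this guide; the `FREEU_SETTINGS` table and the `freeu_settings_for` helper are hypothetical names, not part of Diffusers. The returned dictionary can be unpacked into `enable_freeu(**settings)`.

```python
# FreeU settings used in this guide, keyed by the checkpoint they were paired with.
FREEU_SETTINGS = {
    "runwayml/stable-diffusion-v1-5": dict(s1=0.9, s2=0.2, b1=1.2, b2=1.4),
    "stabilityai/stable-diffusion-2-1": dict(s1=0.9, s2=0.2, b1=1.1, b2=1.2),
    "stabilityai/stable-diffusion-xl-base-1.0": dict(s1=0.6, s2=0.4, b1=1.1, b2=1.2),
    "cerspense/zeroscope_v2_576w": dict(s1=0.9, s2=0.2, b1=1.2, b2=1.4),
}


def freeu_settings_for(model_id):
    """Return a copy of the guide's FreeU settings for `model_id` (hypothetical helper)."""
    try:
        # Copy so callers can tweak the values without changing the table.
        return dict(FREEU_SETTINGS[model_id])
    except KeyError:
        raise KeyError(f"No FreeU settings recorded for {model_id!r}") from None


print(freeu_settings_for("stabilityai/stable-diffusion-xl-base-1.0"))
# {'s1': 0.6, 's2': 0.4, 'b1': 1.1, 'b2': 1.2}
```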

Thanks to [kadirnar](https://github.com/kadirnar/) for helping to integrate the feature, and to [justindujardin](https://github.com/justindujardin) for the helpful discussions.
