Commit a4d5b59

Authored by patrickvonplaten, Nathan Lambert, pcuenca, anton-l, and patil-suraj
Refactor Pipelines / Community pipelines and add better explanations. (huggingface#257)
* [Examples readme]
* Improve
* more
* save
* save
* save more
* up
* up
* Apply suggestions from code review
  Co-authored-by: Nathan Lambert <[email protected]>
  Co-authored-by: Pedro Cuenca <[email protected]>
* up
* make deterministic
* up
* better
* up
* add generator to img2img pipe
* save
* make pipelines deterministic
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py
  Co-authored-by: Anton Lozhkov <[email protected]>
* apply all changes
* more correctnios
* finish
* improve table
* more fixes
* up
* Apply suggestions from code review
  Co-authored-by: Suraj Patil <[email protected]>
  Co-authored-by: Pedro Cuenca <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Suraj Patil <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Suraj Patil <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Pedro Cuenca <[email protected]>
  Co-authored-by: Suraj Patil <[email protected]>
  Co-authored-by: Anton Lozhkov <[email protected]>
* Update src/diffusers/pipelines/README.md
  Co-authored-by: Suraj Patil <[email protected]>
* add better links
* fix more
* finish

Co-authored-by: Nathan Lambert <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Anton Lozhkov <[email protected]>
Co-authored-by: Suraj Patil <[email protected]>
1 parent 5e84353 commit a4d5b59

File tree

19 files changed: +807 −514 lines

README.md

Lines changed: 95 additions & 9 deletions
@@ -20,29 +20,30 @@ as a modular toolbox for inference and training of diffusion models.
 More precisely, 🤗 Diffusers offers:
 
-- State-of-the-art diffusion pipelines that can be run in inference with just a couple of lines of code (see [src/diffusers/pipelines](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines)).
+- State-of-the-art diffusion pipelines that can be run in inference with just a couple of lines of code (see [src/diffusers/pipelines](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines)). Check [this overview](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines/README.md#pipelines-summary) to see all supported pipelines and their corresponding official papers.
 - Various noise schedulers that can be used interchangeably for the preferred speed vs. quality trade-off in inference (see [src/diffusers/schedulers](https://github.com/huggingface/diffusers/tree/main/src/diffusers/schedulers)).
 - Multiple types of models, such as UNet, can be used as building blocks in an end-to-end diffusion system (see [src/diffusers/models](https://github.com/huggingface/diffusers/tree/main/src/diffusers/models)).
 - Training examples to show how to train the most popular diffusion models (see [examples/training](https://github.com/huggingface/diffusers/tree/main/examples/training)).
 - Inference examples to show how to create custom pipelines for advanced tasks such as image2image and in-painting (see [examples/inference](https://github.com/huggingface/diffusers/tree/main/examples/inference)).
+
 ## Quickstart
 
 In order to get started, we recommend taking a look at two notebooks:
 
 - The [Getting started with Diffusers](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/diffusers_intro.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/diffusers_intro.ipynb) notebook, which showcases an end-to-end example of usage for diffusion models, schedulers and pipelines.
 Take a look at this notebook to learn how to use the pipeline abstraction, which takes care of everything (model, scheduler, noise handling) for you, and also to understand each independent building block in the library.
-- The [Training a diffusers model](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb) notebook summarizes diffuser model training methods. This notebook takes a step-by-step approach to training your
-diffuser model on an image dataset, with explanatory graphics.
+- The [Training a diffusers model](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb) notebook summarizes diffusion model training methods. This notebook takes a step-by-step approach to training your
+diffusion model on an image dataset, with explanatory graphics.
 
-## **New 🎨🎨🎨** Stable Diffusion is now fully compatible with `diffusers`!
-
-Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/) and [LAION](https://laion.ai/). It's trained on 512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/) database. This model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM.
+## **New 🎨🎨🎨** Stable Diffusion is now fully compatible with `diffusers`!
+
+Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/) and [LAION](https://laion.ai/). It's trained on 512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/) database. This model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM.
 See the [model card](https://huggingface.co/CompVis/stable-diffusion) for more information.
 
 You need to accept the model license before downloading or using the Stable Diffusion weights. Please, visit the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-3), read the license and tick the checkbox if you agree. You have to be a registered user in 🤗 Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section](https://huggingface.co/docs/hub/security-tokens) of the documentation.
 
-```py
+### Text-to-Image generation with Stable Diffusion
+
+```python
 # make sure you're logged in with `huggingface-cli login`
+import torch
 from torch import autocast
 from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler
@@ -54,10 +55,13 @@ lms = LMSDiscreteScheduler(
 )
 
 pipe = StableDiffusionPipeline.from_pretrained(
-    "CompVis/stable-diffusion-v1-3",
+    "CompVis/stable-diffusion-v1-4",
+    revision="fp16",
+    torch_dtype=torch.float16,
     scheduler=lms,
     use_auth_token=True
-).to("cuda")
+)
+pipe = pipe.to("cuda")
 
 prompt = "a photo of an astronaut riding a horse on mars"
 with autocast("cuda"):
@@ -66,6 +70,88 @@ with autocast("cuda"):
     image.save("astronaut_rides_horse.png")
 ```
 
+### Image-to-Image text-guided generation with Stable Diffusion
+
+The `StableDiffusionImg2ImgPipeline` lets you pass a text prompt and an initial image to condition the generation of new images.
+
+```python
+import torch
+from torch import autocast
+import requests
+from PIL import Image
+from io import BytesIO
+
+from diffusers import StableDiffusionImg2ImgPipeline
+
+# load the pipeline
+device = "cuda"
+pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
+    "CompVis/stable-diffusion-v1-4",
+    revision="fp16",
+    torch_dtype=torch.float16,
+    use_auth_token=True
+)
+pipe = pipe.to(device)
+
+# let's download an initial image
+url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
+
+response = requests.get(url)
+init_image = Image.open(BytesIO(response.content)).convert("RGB")
+init_image = init_image.resize((768, 512))
+
+prompt = "A fantasy landscape, trending on artstation"
+
+with autocast("cuda"):
+    images = pipe(prompt=prompt, init_image=init_image, strength=0.75, guidance_scale=7.5)["sample"]
+
+images[0].save("fantasy_landscape.png")
+```
+You can also run this example on colab [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/Notebooks/blob/master/image_2_image_using_diffusers.ipynb)
+
+### In-painting using Stable Diffusion
+
+The `StableDiffusionInpaintPipeline` lets you edit specific parts of an image by providing a mask and a text prompt.
+
+```python
+import torch
+from io import BytesIO
+
+from torch import autocast
+import requests
+import PIL
+
+from diffusers import StableDiffusionInpaintPipeline
+
+def download_image(url):
+    response = requests.get(url)
+    return PIL.Image.open(BytesIO(response.content)).convert("RGB")
+
+img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
+mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"
+
+init_image = download_image(img_url).resize((512, 512))
+mask_image = download_image(mask_url).resize((512, 512))
+
+device = "cuda"
+pipe = StableDiffusionInpaintPipeline.from_pretrained(
+    "CompVis/stable-diffusion-v1-4",
+    revision="fp16",
+    torch_dtype=torch.float16,
+    use_auth_token=True
+)
+pipe = pipe.to(device)
+
+prompt = "a cat sitting on a bench"
+with autocast("cuda"):
+    images = pipe(prompt=prompt, init_image=init_image, mask_image=mask_image, strength=0.75)["sample"]
+
+images[0].save("cat_on_bench.png")
+```
+
+### Tweak prompts reusing seeds and latents
+
+You can generate your own latents to reproduce results, or tweak your prompt on a specific result you liked. [This notebook](stable-diffusion-seeds.ipynb) shows how to do it step by step. You can also run it in Google Colab [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pcuenca/diffusers-examples/blob/main/notebooks/stable-diffusion-seeds.ipynb).
+
 For more details, check out [the Stable Diffusion notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/stable_diffusion.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/stable_diffusion.ipynb)
 and have a look into the [release notes](https://github.com/huggingface/diffusers/releases/tag/v0.2.0).
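The seeds-and-latents workflow above hinges on one fact: a seeded `torch.Generator` reproduces the same initial noise tensor every time, so a pipeline fed that noise produces a reproducible image you can re-run with a tweaked prompt. A minimal torch-only sketch of that idea (the `(1, 4, 64, 64)` latent shape and the `make_latents` helper are illustrative assumptions, matching Stable Diffusion's default latent size for 512x512 images):

```python
import torch

def make_latents(seed: int, shape=(1, 4, 64, 64)) -> torch.Tensor:
    # A seeded CPU generator makes the initial noise fully reproducible.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    return torch.randn(shape, generator=generator)

# The same seed yields bit-identical latents; reusing them (e.g. via a
# pipeline's latents argument) reproduces the same image for a given prompt.
a = make_latents(1024)
b = make_latents(1024)
assert torch.equal(a, b)

# A different seed gives different noise, hence a different image.
c = make_latents(1025)
assert not torch.equal(a, c)
```

This is why saving just the integer seed is enough to revisit a result: the latents can always be regenerated from it.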

examples/README.md

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+<!---
+Copyright 2022 The HuggingFace Team. All rights reserved.
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# 🧨 Diffusers Examples
+
+Diffusers examples are a collection of scripts to demonstrate how to effectively use the `diffusers` library
+for a variety of use cases.
+
+**Note**: If you are looking for **official** examples on how to use `diffusers` for inference,
+please have a look at [src/diffusers/pipelines](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines).
+
+Our examples aspire to be **self-contained**, **easy-to-tweak**, **beginner-friendly** and **one-purpose-only**.
+More specifically, this means:
+
+- **Self-contained**: An example script shall only depend on "pip-install-able" Python packages that can be found in a `requirements.txt` file. Example scripts shall **not** depend on any local files. This means that one can simply download an example script, *e.g.* [train_unconditional.py](https://github.com/huggingface/diffusers/blob/main/examples/unconditional_image_generation/train_unconditional.py), install the required dependencies, *e.g.* [requirements.txt](https://github.com/huggingface/diffusers/blob/main/examples/unconditional_image_generation/requirements.txt), and execute the example script.
+- **Easy-to-tweak**: While we strive to present as many use cases as possible, the example scripts are just that - examples. It is expected that they won't work out-of-the-box on your specific problem and that you will be required to change a few lines of code to adapt them to your needs. To help you with that, most of the examples fully expose the preprocessing of the data and the training loop to allow you to tweak and edit them as required.
+- **Beginner-friendly**: We do not aim to provide state-of-the-art training scripts for the newest models, but rather examples that can be used as a way to better understand diffusion models and how to use them with the `diffusers` library. We often purposefully leave out certain state-of-the-art methods if we consider them too complex for beginners.
+- **One-purpose-only**: Examples should show one task and one task only. Even if a task is very similar from a modeling point of view (*e.g.* image super-resolution and image modification tend to use the same model and training method), we want examples to showcase only one task to keep them as readable and easy-to-understand as possible.
+
+We provide **official** examples that cover the most popular tasks of diffusion models.
+*Official* examples are **actively** maintained by the `diffusers` maintainers and we try to rigorously follow our example philosophy as defined above.
+If you feel like another important example should exist, we are more than happy to welcome a [Feature Request](https://github.com/huggingface/diffusers/issues/new?assignees=&labels=&template=feature_request.md&title=) or directly a [Pull Request](https://github.com/huggingface/diffusers/compare) from you!
+
+Training examples show how to pretrain or fine-tune diffusion models for a variety of tasks. Currently we support:
+
+| Task | 🤗 Accelerate | 🤗 Datasets | Colab |
+|---|---|:---:|:---:|
+| [**Unconditional Image Generation**](https://github.com/huggingface/transformers/tree/main/examples/training/train_unconditional.py) | ✅ | ✅ | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb) |
+
+## Community
+
+In addition, we provide **community** examples, which are examples added and maintained by our community.
+Community examples can consist of both *training* examples and *inference* pipelines.
+For such examples, we are more lenient regarding the philosophy defined above and also cannot guarantee to provide maintenance for every issue.
+Examples that are useful for the community, but are either not yet deemed popular or not yet following our above philosophy, should go into the [community examples](https://github.com/huggingface/diffusers/tree/main/examples/community) folder. The community folder therefore includes training examples and inference pipelines.
+**Note**: Community examples can be a [great first contribution](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) to show the community how you like to use `diffusers` 🪄.
+
+## Important note
+
+To make sure you can successfully run the latest versions of the example scripts, you have to **install the library from source** and install some example-specific requirements. To do this, execute the following steps in a new virtual environment:
+```bash
+git clone https://github.com/huggingface/diffusers
+cd diffusers
+pip install .
+```
+Then cd into the example folder of your choice and run
+```bash
+pip install -r requirements.txt
+```
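Put together, the install-from-source step and the per-example requirements step described above can be sketched as one session. This is only a sketch: the virtual-environment commands and the `unconditional_image_generation` folder name (taken from the example links earlier in this README) are assumptions that may differ for your setup and chosen example.

```shell
# Sketch: fresh virtual environment, diffusers installed from source,
# then the requirements of one (assumed) example folder.
python3 -m venv .venv
. .venv/bin/activate
git clone https://github.com/huggingface/diffusers
cd diffusers
pip install .
cd examples/unconditional_image_generation
pip install -r requirements.txt
```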

examples/community/README.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+# Community Examples
+
+**Community** examples consist of both inference and training examples that have been added by the community.
+
+| Example | Description | Author | |
+|:----------|:-------------|:-------------|------:|

examples/inference/README.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+# Inference Examples
+
+**The inference examples folder is deprecated and will be removed in a future version**.
+**Officially supported inference examples can be found in the [Pipelines folder](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines)**.
+
+- For `Image-to-Image text-guided generation with Stable Diffusion`, please have a look at the official [Pipeline examples](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines#examples)
+- For `In-painting using Stable Diffusion`, please have a look at the official [Pipeline examples](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines#examples)
+- For `Tweak prompts reusing seeds and latents`, please have a look at the official [Pipeline examples](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines#examples)
