Commit bae14c8

[docs] Update training docs (huggingface#5512)

* first draft
* try hfoption syntax
* fix hfoption id
* add text2image
* fix tag
* feedback
* feedbacks
* add textual inversion
* DreamBooth
* lora
* controlnet
* instructpix2pix
* custom diffusion
* t2i
* separate training methods and models
* sdxl
* kandinsky
* wuerstchen
* light edits

1 parent ded93f7 commit bae14c8

File tree

14 files changed: +2394 −1970 lines

docs/source/en/_toctree.yml

Lines changed: 30 additions & 20 deletions
@@ -100,26 +100,36 @@
   title: Create a dataset for training
 - local: training/adapt_a_model
   title: Adapt a model to a new task
-- local: training/unconditional_training
-  title: Unconditional image generation
-- local: training/text_inversion
-  title: Textual Inversion
-- local: training/dreambooth
-  title: DreamBooth
-- local: training/text2image
-  title: Text-to-image
-- local: training/lora
-  title: Low-Rank Adaptation of Large Language Models (LoRA)
-- local: training/controlnet
-  title: ControlNet
-- local: training/instructpix2pix
-  title: InstructPix2Pix Training
-- local: training/custom_diffusion
-  title: Custom Diffusion
-- local: training/t2i_adapters
-  title: T2I-Adapters
-- local: training/ddpo
-  title: Reinforcement learning training with DDPO
+- sections:
+  - local: training/unconditional_training
+    title: Unconditional image generation
+  - local: training/text2image
+    title: Text-to-image
+  - local: training/sdxl
+    title: Stable Diffusion XL
+  - local: training/kandinsky
+    title: Kandinsky 2.2
+  - local: training/wuerstchen
+    title: Wuerstchen
+  - local: training/controlnet
+    title: ControlNet
+  - local: training/t2i_adapters
+    title: T2I-Adapters
+  - local: training/instructpix2pix
+    title: InstructPix2Pix
+  title: Models
+- sections:
+  - local: training/text_inversion
+    title: Textual Inversion
+  - local: training/dreambooth
+    title: DreamBooth
+  - local: training/lora
+    title: LoRA
+  - local: training/custom_diffusion
+    title: Custom Diffusion
+  - local: training/ddpo
+    title: Reinforcement learning training with DDPO
+  title: Methods
   title: Training
 - sections:
   - local: using-diffusers/other-modalities
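
The hunk above nests the training pages under two new `sections` groups, Models and Methods. As a quick sketch of the resulting structure (assuming PyYAML is available; the snippet abbreviates each group to two entries), the new nesting parses as:

```python
import yaml  # PyYAML, assumed available for this sketch

# Abbreviated form of the restructured training section of
# docs/source/en/_toctree.yml after this commit.
toctree = yaml.safe_load("""
- sections:
  - local: training/unconditional_training
    title: Unconditional image generation
  - local: training/controlnet
    title: ControlNet
  title: Models
- sections:
  - local: training/text_inversion
    title: Textual Inversion
  - local: training/ddpo
    title: Reinforcement learning training with DDPO
  title: Methods
""")

# Each top-level entry is now a group carrying its own `sections` list.
print([group["title"] for group in toctree])
print([page["local"] for page in toctree[0]["sections"]])
```

Each page keeps its `local`/`title` pair; only the grouping level is new.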

docs/source/en/training/controlnet.md

Lines changed: 223 additions & 190 deletions
Large diffs are not rendered by default.

docs/source/en/training/custom_diffusion.md

Lines changed: 225 additions & 167 deletions
Large diffs are not rendered by default.

docs/source/en/training/dreambooth.md

Lines changed: 255 additions & 518 deletions
Large diffs are not rendered by default.

docs/source/en/training/instructpix2pix.md

Lines changed: 166 additions & 131 deletions
Large diffs are not rendered by default.

docs/source/en/training/kandinsky.md

Lines changed: 327 additions & 0 deletions
Large diffs are not rendered by default.

docs/source/en/training/lora.md

Lines changed: 119 additions & 482 deletions
Large diffs are not rendered by default.

docs/source/en/training/overview.md

Lines changed: 32 additions & 53 deletions
@@ -10,75 +10,54 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->
 
-# 🧨 Diffusers Training Examples
+# Overview
 
-Diffusers training examples are a collection of scripts to demonstrate how to effectively use the `diffusers` library
-for a variety of use cases.
+🤗 Diffusers provides a collection of training scripts for you to train your own diffusion models. You can find all of our training scripts in [diffusers/examples](https://github.com/huggingface/diffusers/tree/main/examples).
 
-**Note**: If you are looking for **official** examples on how to use `diffusers` for inference,
-please have a look at [src/diffusers/pipelines](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines)
+Each training script is:
 
-Our examples aspire to be **self-contained**, **easy-to-tweak**, **beginner-friendly** and for **one-purpose-only**.
-More specifically, this means:
+- **Self-contained**: the training script does not depend on any local files, and all packages required to run the script are installed from the `requirements.txt` file.
+- **Easy-to-tweak**: the training scripts are an example of how to train a diffusion model for a specific task and won't work out-of-the-box for every training scenario. You'll likely need to adapt the training script for your specific use case. To help you with that, we've fully exposed the data preprocessing code and the training loop so you can modify them for your own use.
+- **Beginner-friendly**: the training scripts are designed to be beginner-friendly and easy to understand, rather than including the latest state-of-the-art methods to get the best and most competitive results. Any training methods we consider too complex are purposefully left out.
+- **Single-purpose**: each training script is expressly designed for only one task to keep it readable and understandable.
 
-- **Self-contained**: An example script shall only depend on "pip-install-able" Python packages that can be found in a `requirements.txt` file. Example scripts shall **not** depend on any local files. This means that one can simply download an example script, *e.g.* [train_unconditional.py](https://github.com/huggingface/diffusers/blob/main/examples/unconditional_image_generation/train_unconditional.py), install the required dependencies, *e.g.* [requirements.txt](https://github.com/huggingface/diffusers/blob/main/examples/unconditional_image_generation/requirements.txt) and execute the example script.
-- **Easy-to-tweak**: While we strive to present as many use cases as possible, the example scripts are just that - examples. It is expected that they won't work out-of-the box on your specific problem and that you will be required to change a few lines of code to adapt them to your needs. To help you with that, most of the examples fully expose the preprocessing of the data and the training loop to allow you to tweak and edit them as required.
-- **Beginner-friendly**: We do not aim for providing state-of-the-art training scripts for the newest models, but rather examples that can be used as a way to better understand diffusion models and how to use them with the `diffusers` library. We often purposefully leave out certain state-of-the-art methods if we consider them too complex for beginners.
-- **One-purpose-only**: Examples should show one task and one task only. Even if a task is from a modeling
-point of view very similar, *e.g.* image super-resolution and image modification tend to use the same model and training method, we want examples to showcase only one task to keep them as readable and easy-to-understand as possible.
+Our current collection of training scripts includes:
 
-We provide **official** examples that cover the most popular tasks of diffusion models.
-*Official* examples are **actively** maintained by the `diffusers` maintainers and we try to rigorously follow our example philosophy as defined above.
-If you feel like another important example should exist, we are more than happy to welcome a [Feature Request](https://github.com/huggingface/diffusers/issues/new?assignees=&labels=&template=feature_request.md&title=) or directly a [Pull Request](https://github.com/huggingface/diffusers/compare) from you!
+| Training | SDXL-support | LoRA-support | Flax-support |
+|---|---|---|---|
+| [unconditional image generation](https://github.com/huggingface/diffusers/tree/main/examples/unconditional_image_generation) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb) | | | |
+| [text-to-image](https://github.com/huggingface/diffusers/tree/main/examples/text_to_image) | 👍 | 👍 | 👍 |
+| [textual inversion](https://github.com/huggingface/diffusers/tree/main/examples/textual_inversion) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_textual_inversion_training.ipynb) | | | 👍 |
+| [DreamBooth](https://github.com/huggingface/diffusers/tree/main/examples/dreambooth) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_dreambooth_training.ipynb) | 👍 | 👍 | 👍 |
+| [ControlNet](https://github.com/huggingface/diffusers/tree/main/examples/controlnet) | 👍 | | 👍 |
+| [InstructPix2Pix](https://github.com/huggingface/diffusers/tree/main/examples/instruct_pix2pix) | 👍 | | |
+| [Custom Diffusion](https://github.com/huggingface/diffusers/tree/main/examples/custom_diffusion) | | | |
+| [T2I-Adapters](https://github.com/huggingface/diffusers/tree/main/examples/t2i_adapter) | 👍 | | |
+| [Kandinsky 2.2](https://github.com/huggingface/diffusers/tree/main/examples/kandinsky2_2/text_to_image) | | 👍 | |
+| [Wuerstchen](https://github.com/huggingface/diffusers/tree/main/examples/wuerstchen/text_to_image) | | 👍 | |
 
-Training examples show how to pretrain or fine-tune diffusion models for a variety of tasks. Currently we support:
+These examples are **actively** maintained, so please feel free to open an issue if they aren't working as expected. If you feel like another training example should be included, you're more than welcome to start a [Feature Request](https://github.com/huggingface/diffusers/issues/new?assignees=&labels=&template=feature_request.md&title=) to discuss your feature idea with us and whether it meets our criteria of being self-contained, easy-to-tweak, beginner-friendly, and single-purpose.
 
-- [Unconditional Training](./unconditional_training)
-- [Text-to-Image Training](./text2image)<sup>*</sup>
-- [Text Inversion](./text_inversion)
-- [Dreambooth](./dreambooth)<sup>*</sup>
-- [LoRA Support](./lora)<sup>*</sup>
-- [ControlNet](./controlnet)<sup>*</sup>
-- [InstructPix2Pix](./instructpix2pix)<sup>*</sup>
-- [Custom Diffusion](./custom_diffusion)
-- [T2I-Adapters](./t2i_adapters)<sup>*</sup>
+## Install
 
-<sup>*</sup>: Supports [Stable Diffusion XL](../api/pipelines/stable_diffusion/stable_diffusion_xl).
-
-If possible, please [install xFormers](../optimization/xformers) for memory efficient attention. This could help make your training faster and less memory intensive.
-
-| Task | 🤗 Accelerate | 🤗 Datasets | Colab
-|---|---|:---:|:---:|
-| [**Unconditional Image Generation**](./unconditional_training) | ✅ | ✅ | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb)
-| [**Text-to-Image fine-tuning**](./text2image) |||
-| [**Textual Inversion**](./text_inversion) | ✅ | - | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_textual_inversion_training.ipynb)
-| [**Dreambooth**](./dreambooth) | ✅ | - | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_dreambooth_training.ipynb)
-| [**Training with LoRA**](./lora) || - | - |
-| [**ControlNet**](./controlnet) ||| - |
-| [**InstructPix2Pix**](./instructpix2pix) ||| - |
-| [**Custom Diffusion**](./custom_diffusion) ||| - |
-| [**T2I Adapters**](./t2i_adapters) ||| - |
-
-## Community
-
-In addition, we provide **community** examples, which are examples added and maintained by our community.
-Community examples can consist of both *training* examples or *inference* pipelines.
-For such examples, we are more lenient regarding the philosophy defined above and also cannot guarantee to provide maintenance for every issue.
-Examples that are useful for the community, but are either not yet deemed popular or not yet following our above philosophy should go into the [community examples](https://github.com/huggingface/diffusers/tree/main/examples/community) folder. The community folder therefore includes training examples and inference pipelines.
-**Note**: Community examples can be a [great first contribution](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) to show to the community how you like to use `diffusers` 🪄.
-
-## Important note
-
-To make sure you can successfully run the latest versions of the example scripts, you have to **install the library from source** and install some example-specific requirements. To do this, execute the following steps in a new virtual environment:
+Make sure you can successfully run the latest versions of the example scripts by installing the library from source in a new virtual environment:
 
 ```bash
 git clone https://github.com/huggingface/diffusers
 cd diffusers
 pip install .
 ```
 
-Then cd in the example folder of your choice and run
+Then navigate to the folder of the training script (for example, [DreamBooth](https://github.com/huggingface/diffusers/tree/main/examples/dreambooth)) and install the `requirements.txt` file. Some training scripts have a specific requirements file for SDXL, LoRA, or Flax. If you're using one of these scripts, make sure you install its corresponding requirements file.
 
 ```bash
+cd examples/dreambooth
 pip install -r requirements.txt
+# to train SDXL with DreamBooth
+pip install -r requirements_sdxl.txt
 ```
+
+To speed up training and reduce memory usage, we recommend:
+
+- using PyTorch 2.0 or higher to automatically use [scaled dot product attention](../optimization/torch2.0#scaled-dot-product-attention) during training (you don't need to make any changes to the training code)
+- installing [xFormers](../optimization/xformers) to enable memory-efficient attention
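
As a quick check of the PyTorch 2.0 recommendation above (a sketch, not part of the training scripts; the tensor shapes are arbitrary placeholders), `torch.nn.functional.scaled_dot_product_attention` is available directly:

```python
import torch
import torch.nn.functional as F

# PyTorch 2.0+ ships F.scaled_dot_product_attention; the training scripts
# pick it up automatically, so no code changes are needed on your side.
q = torch.randn(1, 8, 16, 64)  # (batch, heads, seq_len, head_dim)
k = torch.randn(1, 8, 16, 64)
v = torch.randn(1, 8, 16, 64)

# Output has the same (batch, heads, seq_len, head_dim) layout as the query.
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)
```

If this import fails, you are on a pre-2.0 PyTorch and xFormers is the way to get memory-efficient attention instead.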
