Commit e6faf60

add: entry for DDPO support. (huggingface#5250)
* add: entry for DDPO support.
* move to training
* address steven's comments.
1 parent d8d8b2a commit e6faf60

2 files changed: +19 -0 lines changed
docs/source/en/_toctree.yml

Lines changed: 2 additions & 0 deletions
@@ -106,6 +106,8 @@
       title: Custom Diffusion
     - local: training/t2i_adapters
       title: T2I-Adapters
+    - local: training/ddpo
+      title: Reinforcement learning training with DDPO
     title: Training
   - sections:
     - local: using-diffusers/other-modalities

docs/source/en/training/ddpo.md

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# Reinforcement learning training with DDPO
+
+You can fine-tune Stable Diffusion on a reward function via reinforcement learning with the 🤗 TRL library and 🤗 Diffusers. This is done with the Denoising Diffusion Policy Optimization (DDPO) algorithm introduced by Black et al. in [Training Diffusion Models with Reinforcement Learning](https://arxiv.org/abs/2305.13301), which is implemented in 🤗 TRL with the [`~trl.DDPOTrainer`].
+
+For more information, check out the [`~trl.DDPOTrainer`] API reference and the [Finetune Stable Diffusion Models with DDPO via TRL](https://huggingface.co/blog/trl-ddpo) blog post.
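
The new page names DDPO without showing what "fine-tuning on a reward function" optimizes. As a rough illustration only, the sketch below computes a PPO-style clipped surrogate loss of the kind DDPO applies to each denoising step, using per-step log-probabilities and batch-normalized rewards as advantages. It is plain Python, not the TRL API: the function names `normalize_rewards` and `ddpo_clipped_loss`, and the default `clip_range` value, are hypothetical choices made for this example.

```python
import math


def normalize_rewards(rewards, eps=1e-8):
    """Turn raw rewards into advantages by standardizing them over the batch.

    Hypothetical helper for illustration; uses the population standard deviation.
    """
    mean = sum(rewards) / len(rewards)
    variance = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(variance)
    return [(r - mean) / (std + eps) for r in rewards]


def ddpo_clipped_loss(logp_new, logp_old, advantages, clip_range=0.2):
    """Mean clipped surrogate loss over a batch of denoising steps.

    logp_new / logp_old: log-probability of each sampled denoising step under
    the current model and under the model that generated the samples.
    clip_range: illustrative default, not a recommended setting.
    """
    losses = []
    for new, old, adv in zip(logp_new, logp_old, advantages):
        # Importance ratio between the current and the sampling policy.
        ratio = math.exp(new - old)
        clipped_ratio = min(max(ratio, 1.0 - clip_range), 1.0 + clip_range)
        # Taking the larger (more pessimistic) loss limits the update size.
        losses.append(max(-adv * ratio, -adv * clipped_ratio))
    return sum(losses) / len(losses)
```

In this sketch, a sample whose reward is above the batch mean gets a positive advantage, so lowering the loss raises the probability of the denoising steps that produced it, while the clipping bounds how far one update can move the model away from the one that generated the samples.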
