Commit 95ea538

yiyixuxu authored
Add ddpm kandinsky (huggingface#3783)
* update doc

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
1 parent ef3844d commit 95ea538

2 files changed: +17 -6 lines changed


docs/source/en/api/pipelines/kandinsky.mdx

Lines changed: 14 additions & 0 deletions
@@ -55,6 +55,20 @@ t2i_pipe = DiffusionPipeline.from_pretrained("kandinsky-community/kandinsky-2-1"
 t2i_pipe.to("cuda")
 ```
 
+<Tip warning={true}>
+
+By default, the text-to-image pipeline use [`DDIMScheduler`], you can change the scheduler to [`DDPMScheduler`]
+
+```py
+scheduler = DDPMScheduler.from_pretrained("kandinsky-community/kandinsky-2-1", subfolder="ddpm_scheduler")
+t2i_pipe = DiffusionPipeline.from_pretrained(
+    "kandinsky-community/kandinsky-2-1", scheduler=scheduler, torch_dtype=torch.float16
+)
+t2i_pipe.to("cuda")
+```
+
+</Tip>
+
 Now we pass the prompt through the prior to generate image embeddings. The prior
 returns both the image embeddings corresponding to the prompt and negative/unconditional image
 embeddings corresponding to an empty string.

src/diffusers/pipelines/kandinsky/pipeline_kandinsky.py

Lines changed: 3 additions & 6 deletions
@@ -22,7 +22,7 @@
 from ...models import UNet2DConditionModel, VQModel
 from ...pipelines import DiffusionPipeline
 from ...pipelines.pipeline_utils import ImagePipelineOutput
-from ...schedulers import DDIMScheduler
+from ...schedulers import DDIMScheduler, DDPMScheduler
 from ...utils import (
     is_accelerate_available,
     is_accelerate_version,
@@ -88,7 +88,7 @@ class KandinskyPipeline(DiffusionPipeline):
             Frozen text-encoder.
         tokenizer ([`XLMRobertaTokenizer`]):
             Tokenizer of class
-        scheduler ([`DDIMScheduler`]):
+        scheduler (Union[`DDIMScheduler`,`DDPMScheduler`]):
             A scheduler to be used in combination with `unet` to generate image latents.
         unet ([`UNet2DConditionModel`]):
             Conditional U-Net architecture to denoise the image embedding.
@@ -101,7 +101,7 @@ def __init__(
         text_encoder: MultilingualCLIP,
         tokenizer: XLMRobertaTokenizer,
         unet: UNet2DConditionModel,
-        scheduler: DDIMScheduler,
+        scheduler: Union[DDIMScheduler, DDPMScheduler],
         movq: VQModel,
     ):
         super().__init__()
@@ -439,9 +439,6 @@ def __call__(
                 noise_pred,
                 t,
                 latents,
-                # YiYi notes: only reason this pipeline can't work with unclip scheduler is that can't pass down this argument
-                # need to use DDPM scheduler instead
-                # prev_timestep=prev_timestep,
                 generator=generator,
             ).prev_sample
         # post-processing
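
Note on the last hunk: with the commented-out `prev_timestep` argument gone, the denoising loop calls `scheduler.step` with only `noise_pred`, `t`, `latents` and `generator`, and reads `.prev_sample` from the result. The sketch below illustrates that calling convention only; it is not the diffusers implementation. `StepOutput`, `ToyDDIMScheduler`, `ToyDDPMScheduler`, `denoise` and the update rules inside them are hypothetical stand-ins, chosen to show that any scheduler accepting this signature can be swapped in without touching the loop.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class StepOutput:
    """Hypothetical stand-in for a scheduler's step result; only `prev_sample` is modelled."""
    prev_sample: List[float]


class ToyDDIMScheduler:
    """Hypothetical deterministic scheduler (NOT diffusers' DDIMScheduler)."""
    timesteps = [3, 2, 1, 0]

    def step(self, model_output, timestep, sample, generator=None):
        # Toy update rule: remove a fixed fraction of the predicted noise.
        return StepOutput([x - 0.1 * e for x, e in zip(sample, model_output)])


class ToyDDPMScheduler:
    """Hypothetical stochastic scheduler (NOT diffusers' DDPMScheduler)."""
    timesteps = [3, 2, 1, 0]

    def step(self, model_output, timestep, sample, generator=None):
        rng = generator if generator is not None else random.Random()
        # Toy update rule: same deterministic part, plus fresh noise on
        # every step except the final one (timestep 0).
        scale = 0.01 if timestep > 0 else 0.0
        return StepOutput(
            [x - 0.1 * e + scale * rng.gauss(0.0, 1.0) for x, e in zip(sample, model_output)]
        )


def denoise(
    scheduler: Union[ToyDDIMScheduler, ToyDDPMScheduler],
    latents: List[float],
    generator: Optional[random.Random] = None,
) -> List[float]:
    """Run the loop with the same step(...) call shape as in the hunk above."""
    for t in scheduler.timesteps:
        noise_pred = [0.5 * x for x in latents]  # toy stand-in for the U-Net prediction
        latents = scheduler.step(
            noise_pred,
            t,
            latents,
            generator=generator,
        ).prev_sample
    return latents


if __name__ == "__main__":
    print(denoise(ToyDDIMScheduler(), [1.0, -2.0]))
    print(denoise(ToyDDPMScheduler(), [1.0, -2.0], generator=random.Random(0)))
```

In this toy setup the deterministic scheduler scales the latents by 0.95 on each step, while the DDPM-like one adds seeded noise, so two runs given generators with the same seed produce the same latents.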
