
Add ddpm kandinsky #3783


Merged: 6 commits merged into main on Jun 21, 2023

Conversation

yiyixuxu
Collaborator

This PR refactors the Kandinsky pipelines to work with both the DDPM and DDIM schedulers.

@sayakpaul @patrickvonplaten
currently the scheduler config in the Kandinsky repo is for DDIM. Is there any way we can also save a DDPM scheduler config somewhere, instead of having to manually create the DDPM config when a user wants to swap the scheduler?

from diffusers import KandinskyPipeline, KandinskyPriorPipeline, DDPMScheduler

import torch

device = "cuda"

# inputs
prompt = "red cat, 4k photo"

# create prior
pipe_prior = KandinskyPriorPipeline.from_pretrained("kandinsky-community/kandinsky-2-1-prior")
pipe_prior.to(device)

# use prior to generate image_emb based on our prompt
generator = torch.Generator(device=device).manual_seed(0)
out = pipe_prior(prompt, num_inference_steps=5, generator=generator)
image_emb = out.image_embeds
zero_image_emb = out.negative_image_embeds

# create diffusion pipeline
model_dtype = torch.float16
pipe = KandinskyPipeline.from_pretrained("kandinsky-community/kandinsky-2-1", torch_dtype=model_dtype)
pipe.to(device)

# swap in a manually configured DDPM scheduler in place of the default DDIM scheduler
ddpm_scheduler = DDPMScheduler(
    clip_sample=True,
    clip_sample_range=2.0,
    sample_max_value=None,
    num_train_timesteps=1000,
    prediction_type="epsilon",
    variance_type="learned_range",
    thresholding=True,
    beta_schedule="linear",
    beta_start=0.00085,
    beta_end=0.012,
)
pipe.scheduler = ddpm_scheduler

generator = torch.Generator(device=device).manual_seed(0)
out = pipe(
    prompt,
    image_embeds=image_emb,
    negative_image_embeds=zero_image_emb,
    height=768,
    width=768,
    num_inference_steps=100,
    generator=generator,
)

image = out.images[0]

[output image: yiyi_test_pipe_ddpm3_out]

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jun 14, 2023

The documentation is not available anymore as the PR was closed or merged.

@sayakpaul
Member

Why can't we do something like this?

pipeline.scheduler = DDPMScheduler.from_config(
    pipeline.scheduler.config, **scheduler_args
)

**scheduler_args could take care of any new args that might be needed here. A good example of using **scheduler_args is here:
https://github.com/huggingface/diffusers/blob/ce5504934ac484fca39a1a5434ecfae09eabdf41/examples/dreambooth/train_dreambooth_lora.py#LL1261C17-L1263C18
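For the Kandinsky case, a hedged sketch of what that could look like (the override set below is illustrative, pulled from the DDPM config discussed further down this thread, and pipe is assumed to be the KandinskyPipeline from the snippet above):

from diffusers import DDPMScheduler

# illustrative overrides on top of the existing (DDIM) config; not an official recipe
scheduler_args = {
    "variance_type": "learned_range",
    "clip_sample": True,
    "clip_sample_range": 2.0,
    "thresholding": True,
}
pipe.scheduler = DDPMScheduler.from_config(pipe.scheduler.config, **scheduler_args)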

@yiyixuxu
Collaborator Author

I'm still not able to reproduce the doc examples, since I think the initial noise construction is a little bit different now with the most recent UI change for negative prompts, but I like these new images too.

[two output images]

@yiyixuxu
Collaborator Author

@sayakpaul yeah, but then the user has to know the new args for the scheduler, which is pretty much everything: the DDIM and DDPM configs are really different for Kandinsky.

ddim

{
    "num_train_timesteps": 1000,
    "beta_schedule": "linear",
    "beta_start": 0.00085,
    "beta_end": 0.012,
    "clip_sample": false,
    "set_alpha_to_one": false,
    "steps_offset": 1,
    "prediction_type": "epsilon",
    "thresholding": false
}

ddpm

DDPMScheduler(
    clip_sample=True,
    clip_sample_range=2.0,
    sample_max_value=None,
    num_train_timesteps=1000,
    prediction_type="epsilon",
    variance_type="learned_range",
    thresholding=True,
    beta_schedule="linear",
    beta_start=0.00085,
    beta_end=0.012,
)

@sayakpaul
Member

Hmm, thanks for explaining. But I don't think we can have two different scheduler configurations here, as it deviates from how we maintain scheduler storage on the Hub. So, either

  • we can add an entry to the doc showing how to initialize the DDPM scheduler properly

or

  • we can have a pipeline saved on the Hub but with the DDPM scheduler so that the user doesn't have to configure anything.

I would prefer to go with the first approach, as that way the user retains flexibility.

@@ -367,9 +367,14 @@ def step(
)

# 3. Clip or threshold "predicted x_0"
if self.config.sample_max_value is None and self.config.clip_sample:
Contributor

I don't really understand this change - what are we doing here exactly? Is the problem that thresholding is done after clamping in Kandinsky?

Collaborator Author

@patrickvonplaten

I just need to clamp it and then apply thresholding.

Currently, it will only do thresholding OR clamping, never both together, so if you set clip_sample to True, it will NOT clip the sample if thresholding is also True. I checked the config for IF: both clip_sample and thresholding are set to True there, but only thresholding will be applied https://huggingface.co/DeepFloyd/IF-I-XL-v1.0/blob/main/scheduler/scheduler_config.json

I'm not sure why we need this if ... else ... logic; I think it's confusing. If we only want thresholding, we should just set clip_sample to False. But removing the logic now would break IF, I think.

Contributor

The reason thresholding and clipping are done in an if/else statement is that they're more or less the same operation.

If you do thresholding, your output ends up clipped to [-1, 1], as can be seen here:

sample = torch.clamp(sample, -s, s) / s # "we threshold xt0 to the range [-s, s] and then divide by s"

So in the case of IF, even if self.config.clip_sample were run afterward, it would make no difference, as the output is already scaled to be between -1 and 1. We could/should probably add a warning that running thresholding and clipping together doesn't make sense and should be avoided. For IF it would probably be best if we just set clip_sample=False, or is this used for anything else? cc @williamberman
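For reference, a minimal sketch of the dynamic-thresholding step, roughly mirroring what the DDPM scheduler does (the helper name and defaults here are illustrative, not the library's exact code):

import torch

def threshold_sample(sample: torch.Tensor, dynamic_thresholding_ratio: float = 0.995,
                     sample_max_value: float = 1.0) -> torch.Tensor:
    # sample: predicted x_0 with shape (batch, channels, height, width)
    batch_size = sample.shape[0]
    abs_sample = sample.reshape(batch_size, -1).abs()
    # per-sample quantile of |x_0|, kept between 1 and sample_max_value
    s = torch.quantile(abs_sample, dynamic_thresholding_ratio, dim=1)
    s = torch.clamp(s, min=1.0, max=sample_max_value)
    s = s[:, None, None, None]
    # threshold to [-s, s] and divide by s: the output already lies in [-1, 1],
    # which is why a subsequent clamp to [-1, 1] (clip_sample) is a no-op
    return torch.clamp(sample, -s, s) / s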

@yiyixuxu for Kandinsky the current "working" scheduler config is:

DDPMScheduler(
    clip_sample=True,
    clip_sample_range=2.0,
    sample_max_value=None,
    num_train_timesteps=1000,
    prediction_type="epsilon",
    variance_type="learned_range",
    thresholding=True,
    beta_schedule="linear",
    beta_start=0.00085,
    beta_end=0.012,
)

with this PR. I'm fairly certain that the following config, without any changes to scheduling_ddpm.py, gives the same result:

DDPMScheduler(
    clip_sample=False,
    sample_max_value=2.0,
    num_train_timesteps=1000,
    prediction_type="epsilon",
    variance_type="learned_range",
    thresholding=True,
    beta_schedule="linear",
    beta_start=0.00085,
    beta_end=0.012,
)

Could you try this out?

Contributor

I think the current logic is: since clip_sample is the default, we just prioritize thresholding if it's also set. So we're ok with both being set, but if so we just do the threshold clipping.

Contributor

And yeah, it's an if/else because thresholding is a form of clipping, so it doesn't make sense to do both.

Contributor

Guess overall it just doesn't make too much sense to set both, no? Think a warning like "Your clip_sample is useless and doesn't do anything" could make sense.
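Something along these lines, as a rough sketch (the helper name is hypothetical and not part of diffusers):

import warnings

def warn_if_clip_sample_is_redundant(config) -> None:
    # hypothetical helper: clip_sample has no effect when thresholding runs first,
    # because thresholding already scales the predicted x_0 into [-1, 1]
    if getattr(config, "thresholding", False) and getattr(config, "clip_sample", False):
        warnings.warn(
            "`clip_sample` has no effect when `thresholding` is enabled; "
            "thresholding already rescales the sample into [-1, 1]."
        )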

Collaborator Author

> Could you try this out?

will try and report back :)

Contributor

yeah no issues with either

Collaborator Author

@patrickvonplaten the cat image looks the same to me! Will revert.

Is the math the same, or is the difference just too small to matter? I understood that clip_sample and dynamic thresholding are trying to do the same thing, but when I looked at the formula, I thought clamping first would cause the threshold to be different and lead to a different result. Not the case, it seems?

[output image: yiyi_test_pipe_ddpm_patrick_out]

Contributor

Think math is the same here actually

@@ -390,8 +408,13 @@ def __call__(
image_embeds = torch.cat([negative_image_embeds, image_embeds], dim=0).to(
dtype=prompt_embeds.dtype, device=device
)
if hasattr(self.scheduler, "custom_timesteps"):
Contributor

@patrickvonplaten commented Jun 15, 2023

Hmm, I don't think this logic makes much sense tbh. How are custom timesteps related to the different timestep spacing chosen here?

If we need a different spacing for Kandinsky, let's add this to the DDPMScheduler similarly to:

if self.config.timestep_spacing == "leading":

Collaborator Author

@patrickvonplaten
Looked into this. I think Kandinsky's method of calculating the step ratio is just slightly different from what's already in DDPM; maybe we don't need to make any changes as long as it doesn't affect generation quality? (A quick numeric comparison is sketched below the list.)

cc @williamberman because the Kandinsky timesteps are exactly the same as UnCLIP

  • kandinsky / unclip scheduler

step_ratio = (self.scheduler.config.num_train_timesteps - 1) / (num_inference_steps - 1)
timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(np.int64)

  • ddpm

step_ratio = self.config.num_train_timesteps // self.num_inference_steps
timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(np.int64)
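A quick numeric comparison of the two spacings as a standalone sketch (assuming num_train_timesteps=1000 and num_inference_steps=100; not library code):

import numpy as np

num_train_timesteps = 1000
num_inference_steps = 100

# kandinsky / unclip style: fractional ratio, so the schedule starts exactly at 999
step_ratio_unclip = (num_train_timesteps - 1) / (num_inference_steps - 1)
timesteps_unclip = (np.arange(0, num_inference_steps) * step_ratio_unclip).round()[::-1].astype(np.int64)

# ddpm style: integer ratio, so the schedule starts at 990
step_ratio_ddpm = num_train_timesteps // num_inference_steps
timesteps_ddpm = (np.arange(0, num_inference_steps) * step_ratio_ddpm).round()[::-1].astype(np.int64)

print(timesteps_unclip[:3], timesteps_unclip[-3:])  # [999 989 979] ... [20 10  0]
print(timesteps_ddpm[:3], timesteps_ddpm[-3:])      # [990 980 970] ... [20 10  0]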

Compared the cheeseburger monster: the one above is generated with the Kandinsky timesteps; the one below is generated with the DDPM scheduler without any change. I can see the eyes and mouth are a little bit different between the two monsters, but it really doesn't affect quality.

[output image: yiyi_test_pipe_ddpm_burger_out_original]

[output image: yiyi_test_pipe_ddpm_burger_out]

Only a super minor difference here too:

[output image: yiyi_test_pipe_ddpm_hair_out_original]

[output image: yiyi_test_pipe_ddpm_hair_out]

@patrickvonplaten
Contributor

Agree that the DDPM scheduler works better, but also agree that we shouldn't do anything backwards-breaking.

Given that the scheduler configs seem to be completely different from each other, I think we should just upload the DDPM scheduler to a new folder that the user can then download easily.

Uploaded it here: https://huggingface.co/kandinsky-community/kandinsky-2-1/tree/main/ddpm_scheduler

@yiyixuxu we could just add the following lines to the docs:

from diffusers import DDPMScheduler, DiffusionPipeline

scheduler = DDPMScheduler.from_pretrained("kandinsky-community/kandinsky-2-1", subfolder="ddpm_scheduler")
pipe = DiffusionPipeline.from_pretrained(..., scheduler=scheduler)
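A slightly fuller, hedged version of that snippet (assuming the ddpm_scheduler subfolder uploaded above), using the Kandinsky pipeline class directly:

import torch
from diffusers import DDPMScheduler, KandinskyPipeline

# load the DDPM config from the dedicated subfolder, then pass it as the pipeline's scheduler
scheduler = DDPMScheduler.from_pretrained("kandinsky-community/kandinsky-2-1", subfolder="ddpm_scheduler")
pipe = KandinskyPipeline.from_pretrained(
    "kandinsky-community/kandinsky-2-1", scheduler=scheduler, torch_dtype=torch.float16
)
pipe.to("cuda")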

@patrickvonplaten
Contributor

Also, the design of this PR needs to be adapted though. I think this PR introduces complex/weird design choices.

@patrickvonplaten
Contributor

Let's try to get this one in soon @yiyixuxu

@yiyixuxu
Collaborator Author

@patrickvonplaten ready for another review

Contributor

@patrickvonplaten left a comment

Cool that works!

yiyixuxu merged commit 95ea538 into main on Jun 21, 2023
patrickvonplaten deleted the add-ddpm-kandinsky branch on Jun 28, 2023