Fix initial noise standard deviation #263


@kjslag kjslag commented Dec 30, 2023

prepare_sampling_loop in diffusionmodules/sampling.py currently draws the initial noise with standard deviation sqrt(1+sigma^2). However, according to the EDM paper [1], "Elucidating the Design Space of Diffusion-Based Generative Models", the initial noise standard deviation should be just sigma (see line 2 of Algorithm 2). This pull request changes the initial scale from sqrt(1+sigma^2) to sigma.
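To make the difference between the two scalings concrete, here is a minimal standalone sketch. It is not the code in sampling.py: the helper name initial_noise, its edm_scaling flag, and the sigma value of 14.6 are all assumptions chosen for illustration, and it uses scalar samples from the Python standard library rather than torch tensors.

```python
import math
import random
import statistics


def initial_noise(num_samples, sigma, edm_scaling=True, seed=0):
    """Hypothetical helper: draw scalar Gaussian noise at the chosen initial scale.

    edm_scaling=True  -> standard deviation sigma (the scaling this PR proposes)
    edm_scaling=False -> standard deviation sqrt(1 + sigma^2) (the scaling it replaces)
    """
    rng = random.Random(seed)
    scale = sigma if edm_scaling else math.sqrt(1.0 + sigma ** 2)
    return [scale * rng.gauss(0.0, 1.0) for _ in range(num_samples)]


# Illustrative value only; it is not read from the repository's config.
sigma_max = 14.6

before = initial_noise(100_000, sigma_max, edm_scaling=False)
after = initial_noise(100_000, sigma_max, edm_scaling=True)

std_before = statistics.pstdev(before)
std_after = statistics.pstdev(after)

# Both runs share a seed, so the ratio isolates the scale factor itself.
print(f"std before fix: {std_before:.4f}")
print(f"std after fix:  {std_after:.4f}")
print(f"ratio:          {std_before / std_after:.6f}")
```

Because both calls share a seed, the printed ratio equals sqrt(1+sigma^2)/sigma, so the sketch shows how much the initial noise is rescaled for a given sigma.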

I tested the change using the code below:

import torch
import torchvision

import sgm.inference.api as api

# Build the SDXL base pipeline from the repository's inference API.
pipeline = api.SamplingPipeline(api.ModelArchitecture.SDXL_V1_BASE)

# Fix the seed so the before/after images start from the same random draw.
torch.manual_seed(1)
output = pipeline.text_to_image(
    params=api.SamplingParams(steps=10),
    prompt="A professional photograph of an astronaut riding a pig")
torchvision.utils.save_image(output[0], 'image.png')

Below are the results. Note that after the fix, there are several improvements:

  • an extra leg is removed behind the pig's front left leg
  • pig's right eye no longer unrealistically pops out of the pig's face
  • better pig nose
  • better pig tail
  • back of astronaut helmet doesn't look as strange
  • the patch on the right of the astronaut's shoulder is no longer blurry

This is the only test I ran. More tests and checks may be warranted.

Before the fix:
[image: bad]

After the fix:
[image: good]

Per the EDM paper, the initial noise standard deviation should be sigma, not sqrt(1+sigma^2).
@kjslag kjslag closed this Dec 31, 2023