Commit eb64e20

[README] Add readme for SD (huggingface#274)
* [README] Add readme for SD
* fix
* fix
* up
* uP
1 parent a4d5b59 commit eb64e20

File tree

1 file changed: +90, -0 lines changed
  • src/diffusers/pipelines/stable_diffusion

@@ -0,0 +1,90 @@
# Stable Diffusion
## Overview
Stable Diffusion was proposed in the [Stable Diffusion Announcement](https://stability.ai/blog/stable-diffusion-announcement) by Patrick Esser and Robin Rombach together with the Stability AI team.
The summary of the model is as follows:
*Stable Diffusion is a text-to-image model that will empower billions of people to create stunning art within seconds. It is a breakthrough in speed and quality meaning that it can run on consumer GPUs. You can see some of the amazing output that has been created by this model without pre or post-processing on this page. The model itself builds upon the work of the team at CompVis and Runway in their widely used latent diffusion model combined with insights from the conditional diffusion models by our lead generative AI developer Katherine Crowson, Dall-E 2 by Open AI, Imagen by Google Brain and many others. We are delighted that AI media generation is a cooperative field and hope it can continue this way to bring the gift of creativity to all.*
## Tips:
- Stable Diffusion has the same architecture as [Latent Diffusion](https://arxiv.org/abs/2112.10752) but uses a frozen CLIP Text Encoder instead of training the text encoder jointly with the diffusion model (see the sketch after this list for how these components appear on the pipeline).
- An in-detail explanation of the Stable Diffusion model can be found under [Stable Diffusion with 🧨 Diffusers](https://huggingface.co/blog/stable_diffusion).
- Stable Diffusion can work with a variety of different samplers, as shown in the examples below.
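
As a quick sketch, these components can be inspected directly on a loaded pipeline; the attribute names below are those exposed by `StableDiffusionPipeline`:

```python
# make sure you're logged in with `huggingface-cli login`
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)

# the pipeline bundles the pretrained components described in the tips above
print(type(pipe.text_encoder).__name__)  # frozen CLIP text encoder
print(type(pipe.tokenizer).__name__)     # matching CLIP tokenizer
print(type(pipe.unet).__name__)          # conditional U-Net that performs the denoising
print(type(pipe.vae).__name__)           # autoencoder mapping images to and from latents
print(type(pipe.scheduler).__name__)     # the sampler, swappable as shown in the examples
```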
## Available Pipelines:
| Pipeline | Tasks | Colab |
|---|---|:---:|
| [pipeline_stable_diffusion.py](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py) | *Text-to-Image Generation* | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb) |
| [pipeline_stable_diffusion_img2img.py](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py) | *Image-to-Image Text-Guided Generation* | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/Notebooks/blob/master/image_2_image_using_diffusers.ipynb) |
| [pipeline_stable_diffusion_inpaint.py](https://github.com/huggingface/diffusers/blob/e3238c0e4bd8f8ae23e8ac225b46af148ae11e40/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py) | *Text-Guided Image Inpainting* | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/Notebooks/blob/master/in_painting_with_stable_diffusion_using_diffusers.ipynb) |
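
The text-to-image pipeline is covered in the examples below. For the image-to-image pipeline, here is a minimal sketch; the argument names `init_image` and `strength` are assumptions based on the pipeline file linked in the table:

```python
# make sure you're logged in with `huggingface-cli login`
from torch import autocast
from PIL import Image

from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True).to("cuda")

# `sketch.png` is a placeholder for any starting image you provide
init_image = Image.open("sketch.png").convert("RGB").resize((768, 512))

prompt = "A fantasy landscape, trending on artstation"
with autocast("cuda"):
    # `strength` controls how far the result may drift from the initial image
    image = pipe(prompt=prompt, init_image=init_image, strength=0.75, guidance_scale=7.5)["sample"][0]

image.save("fantasy_landscape.png")
```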
## Examples:
### Text-to-Image with default PLMS scheduler
```python
# make sure you're logged in with `huggingface-cli login`
from torch import autocast
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
pipe = pipe.to("cuda")  # move the pipeline to the GPU

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("cuda"):
    image = pipe(prompt)["sample"][0]

image.save("astronaut_rides_horse.png")
```
### Text-to-Image with DDIM scheduler
```python
# make sure you're logged in with `huggingface-cli login`
from torch import autocast
from diffusers import StableDiffusionPipeline, DDIMScheduler

# DDIM sampler configured with the beta schedule Stable Diffusion was trained with
scheduler = DDIMScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", clip_sample=False, set_alpha_to_one=False)

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    scheduler=scheduler,
    use_auth_token=True
).to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("cuda"):
    image = pipe(prompt)["sample"][0]

image.save("astronaut_rides_horse.png")
```
### Text-to-Image with K-LMS scheduler
```python
# make sure you're logged in with `huggingface-cli login`
from torch import autocast
from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler

# K-LMS sampler configured with the beta schedule Stable Diffusion was trained with
lms = LMSDiscreteScheduler(
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear"
)

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    scheduler=lms,
    use_auth_token=True
).to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("cuda"):
    image = pipe(prompt)["sample"][0]

image.save("astronaut_rides_horse.png")
```
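
All of the examples above also accept a `generator` argument; here is a minimal sketch for reproducible results, assuming the call signature of the pipeline linked in the table:

```python
# make sure you're logged in with `huggingface-cli login`
import torch
from torch import autocast
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True).to("cuda")

# fixing the seed makes reruns produce the same image
generator = torch.Generator("cuda").manual_seed(1024)

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("cuda"):
    image = pipe(prompt, guidance_scale=7.5, generator=generator)["sample"][0]

image.save("astronaut_rides_horse_seeded.png")
```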
