Commit 2e7a286

yiyixuxu, evinpinar, patil-suraj, and patrickvonplaten authored
Attend and excite 2 (huggingface#2369)
* attend and excite pipeline

* update
  update docstring example
  remove visualization
  remove the base class attention control
  remove dependency on stable diffusion pipeline
  always apply gaussian filter with default setting
  remove run_standard_sd argument
  hardcode attention_res and scale_range (related to step size)
  Update docs/source/en/api/pipelines/stable_diffusion/attend_and_excite.mdx
  Co-authored-by: Patrick von Platen <[email protected]>
  Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
  Co-authored-by: Patrick von Platen <[email protected]>
  Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
  Co-authored-by: Patrick von Platen <[email protected]>
  Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
  Co-authored-by: Patrick von Platen <[email protected]>
  Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
  Co-authored-by: Patrick von Platen <[email protected]>
  Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
  Co-authored-by: Patrick von Platen <[email protected]>
  Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
  Co-authored-by: Patrick von Platen <[email protected]>
  Update tests/pipelines/stable_diffusion_2/test_stable_diffusion_attend_and_excite.py
  Co-authored-by: Will Berman <[email protected]>
  Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
  Co-authored-by: Will Berman <[email protected]>
  Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py
  Co-authored-by: Will Berman <[email protected]>
  revert test_float16_inference
  revert change to the batch related tests
  fix test_float16_inference
  handle batch
  remove the deprecation message
  remove None check, step_size
  remove debugging logging
  add slow test
  indices_to_alter -> indices
  add check_input

* skip mps

* style

* Apply suggestions from code review
  Co-authored-by: Suraj Patil <[email protected]>
  Co-authored-by: Patrick von Platen <[email protected]>

* indices -> token_indices

---------

Co-authored-by: evin <[email protected]>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Suraj Patil <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
1 parent f243282 commit 2e7a286

11 files changed: +1290 -2 lines changed

docs/source/en/_toctree.yml

Lines changed: 2 additions & 0 deletions
@@ -151,6 +151,8 @@
   title: Stable-Diffusion-Latent-Upscaler
 - local: api/pipelines/stable_diffusion/pix2pix
   title: InstructPix2Pix
+- local: api/pipelines/stable_diffusion/attend_and_excite
+  title: Attend and Excite
 - local: api/pipelines/stable_diffusion/pix2pix_zero
   title: Pix2Pix Zero
 - local: api/pipelines/stable_diffusion/self_attention_guidance

docs/source/en/api/pipelines/stable_diffusion/attend_and_excite.mdx

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
+<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# Attend and Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models
+
+## Overview
+
+Attend and Excite for Stable Diffusion was proposed in [Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models](https://attendandexcite.github.io/Attend-and-Excite/) and provides textual attention control over the image generation.
+
+The abstract of the paper is the following:
+
+*Text-to-image diffusion models have recently received a lot of interest for their astonishing ability to produce high-fidelity images from text only. However, achieving one-shot generation that aligns with the user's intent is nearly impossible, yet small changes to the input prompt often result in very different images. This leaves the user with little semantic control. To put the user in control, we show how to interact with the diffusion process to flexibly steer it along semantic directions. This semantic guidance (SEGA) allows for subtle and extensive edits, changes in composition and style, as well as optimizing the overall artistic conception. We demonstrate SEGA's effectiveness on a variety of tasks and provide evidence for its versatility and flexibility.*
+
+Resources
+
+* [Project Page](https://attendandexcite.github.io/Attend-and-Excite/)
+* [Paper](https://arxiv.org/abs/2301.13826)
+* [Original Code](https://github.com/AttendAndExcite/Attend-and-Excite)
+* [Demo](https://huggingface.co/spaces/AttendAndExcite/Attend-and-Excite)
+
+
+## Available Pipelines:
+
+| Pipeline | Tasks | Colab | Demo
+|---|---|:---:|:---:|
+| [pipeline_semantic_stable_diffusion_attend_and_excite.py](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_semantic_stable_diffusion_attend_and_excite) | *Text-to-Image Generation* | - | -
+
+
+### Usage example
+
+
+```python
+import torch
+from diffusers import StableDiffusionAttendAndExcitePipeline
+
+model_id = "CompVis/stable-diffusion-v1-4"
+pipe = StableDiffusionAttendAndExcitePipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
+pipe = pipe.to("cuda")
+
+prompt = "a cat and a frog"
+
+# use get_indices function to find out indices of the tokens you want to alter
+pipe.get_indices(prompt)
+
+token_indices = [2, 5]
+seed = 6141
+generator = torch.Generator("cuda").manual_seed(seed)
+
+images = pipe(
+    prompt=prompt,
+    token_indices=token_indices,
+    guidance_scale=7.5,
+    generator=generator,
+    num_inference_steps=50,
+    max_iter_to_alter=25,
+).images
+
+image = images[0]
+image.save(f"../images/{prompt}_{seed}.png")
+```
+
+
+## StableDiffusionAttendAndExcitePipeline
+[[autodoc]] StableDiffusionAttendAndExcitePipeline
+    - all
+    - __call__
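
Reviewer note on the usage example added above: the values `token_indices = [2, 5]` for the prompt "a cat and a frog" look like they select "cat" and "frog". The sketch below illustrates why, under two assumptions that the diff itself does not confirm: that index 0 is taken by a start-of-text token, and that each word in this prompt maps to exactly one token. It approximates tokenization with a whitespace split, so it is not the pipeline's tokenizer, and the helper names `approximate_index_map` and `indices_for_words` are hypothetical. For real prompts, the committed example's `pipe.get_indices(prompt)` call is the intended way to inspect the indices.

```python
from typing import Dict, Iterable, List

# Assumed special tokens; labelled as an assumption, not read from the pipeline.
START_TOKEN = "<|startoftext|>"
END_TOKEN = "<|endoftext|>"


def approximate_index_map(prompt: str) -> Dict[int, str]:
    """Map position -> token, approximating tokenization by whitespace split.

    Hypothetical helper: a real subword tokenizer may split one word into
    several tokens, which would shift every later index.
    """
    tokens = [START_TOKEN, *prompt.lower().split(), END_TOKEN]
    return dict(enumerate(tokens))


def indices_for_words(prompt: str, words: Iterable[str]) -> List[int]:
    """Return the positions whose (approximate) token is one of `words`."""
    wanted = {word.lower() for word in words}
    return [i for i, tok in approximate_index_map(prompt).items() if tok in wanted]


# Position 0 is the assumed start token, so "cat" lands at 2 and "frog" at 5.
print(indices_for_words("a cat and a frog", ["cat", "frog"]))  # [2, 5]
```

If a subject word were split into several subword tokens, this approximation would give the wrong positions, which is a reason to check the real index map before choosing `token_indices`.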

docs/source/en/api/pipelines/stable_diffusion/overview.mdx

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@ For more details about how Stable Diffusion works and how it differs from the ba
 | [StableDiffusionUpscalePipeline](./upscale) | **Experimental** *Text-Guided Image Super-Resolution * | | Coming soon
 | [StableDiffusionLatentUpscalePipeline](./latent_upscale) | **Experimental** *Text-Guided Image Super-Resolution * | | Coming soon
 | [StableDiffusionInstructPix2PixPipeline](./pix2pix) | **Experimental** *Text-Based Image Editing * | | [InstructPix2Pix: Learning to Follow Image Editing Instructions](https://huggingface.co/spaces/timbrooks/instruct-pix2pix)
+| [StableDiffusionAttendAndExcitePipeline](./attend_and_excite) | **Experimental** *Text-to-Image Generation * | | [Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models](https://huggingface.co/spaces/AttendAndExcite/Attend-and-Excite)
 | [StableDiffusionPix2PixZeroPipeline](./pix2pix_zero) | **Experimental** *Text-Based Image Editing * | | [Zero-shot Image-to-Image Translation](https://arxiv.org/abs/2302.03027)

docs/source/en/index.mdx

Lines changed: 1 addition & 1 deletion
@@ -50,7 +50,7 @@ available a colab notebook to directly try them out.
 | [stable_diffusion](./api/pipelines/stable_diffusion/text2img) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Text-to-Image Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb)
 | [stable_diffusion](./api/pipelines/stable_diffusion/img2img) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Image-to-Image Text-Guided Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/image_2_image_using_diffusers.ipynb)
 | [stable_diffusion](./api/pipelines/stable_diffusion/inpaint) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Text-Guided Image Inpainting | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/in_painting_with_stable_diffusion_using_diffusers.ipynb)
-| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-to-Image Generation |
+| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-to-Image Generation |
 | [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Image Inpainting |
 | [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Super Resolution Image-to-Image |
 | [stable_diffusion_safe](./api/pipelines/stable_diffusion_safe) | [**Safe Stable Diffusion**](https://arxiv.org/abs/2211.05105) | Text-Guided Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ml-research/safe-latent-diffusion/blob/main/examples/Safe%20Latent%20Diffusion.ipynb)

src/diffusers/__init__.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -110,6 +110,7 @@
110110
CycleDiffusionPipeline,
111111
LDMTextToImagePipeline,
112112
PaintByExamplePipeline,
113+
StableDiffusionAttendAndExcitePipeline,
113114
StableDiffusionDepth2ImgPipeline,
114115
StableDiffusionImageVariationPipeline,
115116
StableDiffusionImg2ImgPipeline,

src/diffusers/pipelines/__init__.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -46,6 +46,7 @@
4646
from .paint_by_example import PaintByExamplePipeline
4747
from .stable_diffusion import (
4848
CycleDiffusionPipeline,
49+
StableDiffusionAttendAndExcitePipeline,
4950
StableDiffusionDepth2ImgPipeline,
5051
StableDiffusionImageVariationPipeline,
5152
StableDiffusionImg2ImgPipeline,

src/diffusers/pipelines/stable_diffusion/__init__.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -44,6 +44,7 @@ class StableDiffusionPipelineOutput(BaseOutput):
4444
else:
4545
from .pipeline_cycle_diffusion import CycleDiffusionPipeline
4646
from .pipeline_stable_diffusion import StableDiffusionPipeline
47+
from .pipeline_stable_diffusion_attend_and_excite import StableDiffusionAttendAndExcitePipeline
4748
from .pipeline_stable_diffusion_img2img import StableDiffusionImg2ImgPipeline
4849
from .pipeline_stable_diffusion_inpaint import StableDiffusionInpaintPipeline
4950
from .pipeline_stable_diffusion_inpaint_legacy import StableDiffusionInpaintPipelineLegacy
