Add SDXL long weighted prompt pipeline (replace pr:4629) #4661
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Thanks for this. I think you forgot to add the example in the README. Once that's done I think we should be ready to merge and ship 🚀
…r/diffusers into sdxl_long_weighted_prompt
Sample added, thanks @sayakpaul
I saw a PR on our documentation-images repo from you. I just merged it. Do you want to include a link to that sample somewhere in the README?
Great, let me add the image to the README doc. Give me a second :)
…ument, add result image
Image added to the README doc.
Thanks for your valuable contribution!
This is great, planning on integrating, but I'm slightly confused on a couple of things. In the example, what's going on with `prompt = "text"*20` and then combining it with `prompt2 = "text continuation"*20` to pass to the pipeline? Why are we multiplying, and how would I treat a normal positive prompt, whether it's short or long?
For img2img and inpainting, I am thinking of providing an embedding function so that long weighted prompts can be used with any SDXL-based model. I will look into the internals of the refiner to see how it works, and may build one for the refiner.
Ah, that makes more sense; it was just a little confusing. Maybe the example should be replaced with an actual long prompt using multiple syntaxes, including ((double positive)), [negatives] and such.
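For context, the A1111-style syntaxes mentioned above map nested parentheses and brackets to multiplicative weights. The sketch below is a simplified, hypothetical parser, not the pipeline's actual implementation (the helper name `parse_weighted` is made up), showing how `((double positive))`, `[negative]`, and explicit `(word:1.5)` weights could be resolved:

```python
def parse_weighted(prompt, step=1.1):
    """Return a list of (text, weight) pairs for a simplified
    A1111-style prompt: '(' multiplies the weight by `step`,
    '[' divides it, and '(word:1.5)' sets an explicit weight."""
    tokens, stack, buf = [], [1.0], ""

    def flush(weight):
        nonlocal buf
        text = buf.strip()
        if text:
            tokens.append((text, round(weight, 4)))
        buf = ""

    for c in prompt:
        if c == "(":
            flush(stack[-1])
            stack.append(stack[-1] * step)
        elif c == "[":
            flush(stack[-1])
            stack.append(stack[-1] / step)
        elif c in ")]":
            if ":" in buf:  # explicit weight, e.g. "(sky:1.5)"
                text, _, num = buf.rpartition(":")
                try:
                    weight = stack[-2] * float(num)
                    buf = text
                    flush(weight)
                except ValueError:
                    flush(stack[-1])
            else:
                flush(stack[-1])
            stack.pop()
        else:
            buf += c
    flush(stack[-1])
    return tokens

print(parse_weighted("a ((cat)) [dog] (sky:1.5)"))
# → [('a', 1.0), ('cat', 1.21), ('dog', 0.9091), ('sky', 1.5)]
```

In a real pipeline, each `(text, weight)` pair would then scale the corresponding token embeddings before they are passed to the UNet.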
Hi @xhinker, thanks for the great contribution, but I got an error when using your example: `Downloading (…)ain/model_index.json: 100%`
@sayakpaul do you have any idea what causes this error?
Is there a reproducible Colab notebook? Could you reproduce it in one?
It worked if I add:
but it causes another error when I run
*Update: you can check my Colab here: colab notebook
You can work around this issue by providing an empty negative prompt, like this:

```python
from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
    custom_pipeline="lpw_stable_diffusion_xl",
    custom_revision="main",
).to("cuda")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
neg_prompt = ""
image = pipe(prompt=prompt, negative_prompt=neg_prompt)
image[0][0]
```
Thank you so much.
A new PR fixes this empty-negative-prompt error; with the newest code you no longer need to provide an empty neg prompt.
It doesn't work when there is more than one sample.
@adhikjoshi, will add it when I find time, thanks
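For reference, the usual way diffusers pipelines support more than one sample is to duplicate the per-prompt embeddings along the batch axis according to `num_images_per_prompt`. The snippet below is a sketch of that pattern only; the helper `expand_for_batch` is hypothetical and not code from this PR:

```python
import torch

def expand_for_batch(prompt_embeds, pooled_prompt_embeds, num_images_per_prompt):
    """Duplicate embeddings so each prompt yields several samples,
    mirroring the pattern used by standard diffusers pipelines."""
    bs, seq_len, dim = prompt_embeds.shape
    # repeat along the sequence axis, then fold back into the batch axis
    prompt_embeds = prompt_embeds.repeat(1, num_images_per_prompt, 1)
    prompt_embeds = prompt_embeds.view(bs * num_images_per_prompt, seq_len, dim)
    pooled_prompt_embeds = pooled_prompt_embeds.repeat(1, num_images_per_prompt)
    pooled_prompt_embeds = pooled_prompt_embeds.view(bs * num_images_per_prompt, -1)
    return prompt_embeds, pooled_prompt_embeds

embeds = torch.randn(1, 154, 2048)  # stacked long-prompt embeddings (toy shapes)
pooled = torch.randn(1, 1280)
e, p = expand_for_batch(embeds, pooled, num_images_per_prompt=2)
print(e.shape, p.shape)  # torch.Size([2, 154, 2048]) torch.Size([2, 1280])
```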
@xhinker thanks for the amazing contribution! 🙏 Is it possible to directly use the class rather than indirectly loading it via the `custom_pipeline` argument? I am using some other pipeline mixins that I can only use if I directly reference the class. But basically, should the following work? Sorry if this is more of a diffusers question; it wasn't obvious to me.
The reason I am asking is that I still see the following warning message, which seems unexpected. Though in the final image, I do see the effects of the part of the prompt that is supposedly being removed.
Yes, of course you can use the class directly. Just ignore the 77-token warning; it comes from the SDXL tokenizer.
@xhinker
While we can stack the embeddings to work around the 77-token limitation, it seems we can't apply the same strategy to the pooled embedding. You are right: for now, only the last segment's pooled embedding is output. I was thinking about this problem before but don't have a good way to address it yet; I would be happy to hear any suggestions. Thanks for reading through the code so carefully.
I haven't found a suitable solution yet, and changing the shape of `pooled_prompt_embeds` would lead to errors in subsequent steps. However, perhaps using the `pooled_prompt_embeds` from the first segment instead of the last would be better, because in a prompt exceeding 77 tokens, the content of the first segment is often more crucial.
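To make the limitation discussed above concrete, here is a sketch of the stacking trick under stated assumptions (the names `chunked_encode` and `toy_encoder` are made up for illustration; the real pipeline uses the CLIP text encoders): per-token embeddings from each 77-token chunk can be concatenated along the sequence axis, but the pooled vector has no sequence axis, so only one chunk's pooled embedding can be kept.

```python
import torch

def chunked_encode(token_ids, encode_fn, chunk_size=77):
    """Encode 77-token chunks separately and concatenate the
    per-token embeddings; keep the pooled vector of the first
    chunk only, since pooled embeddings cannot be stacked."""
    embeds, pooled = [], None
    for i, chunk in enumerate(token_ids.split(chunk_size, dim=1)):
        e, p = encode_fn(chunk)
        embeds.append(e)
        if i == 0:
            pooled = p  # only one segment's pooled embedding survives
    return torch.cat(embeds, dim=1), pooled

def toy_encoder(ids):
    # stand-in for a CLIP text encoder: (batch, seq, dim) plus a pooled vector
    e = torch.randn(ids.shape[0], ids.shape[1], 8)
    return e, e.mean(dim=1)

ids = torch.randint(0, 100, (1, 154))  # two 77-token segments
embeds, pooled = chunked_encode(ids, toy_encoder)
print(embeds.shape, pooled.shape)  # torch.Size([1, 154, 8]) torch.Size([1, 8])
```

Whether the first or last segment's pooled vector is kept is a one-line change in this scheme, which matches the suggestion above.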
When using playground-v2.5 and this long weighted pipeline simultaneously, I get an image with extremely bad quality.
…#4661) * Add SDXL long weighted prompt pipeline * Add SDXL long weighted prompt pipeline usage sample in the readme document * Add SDXL long weighted prompt pipeline usage sample in the readme document, add result image
What does this PR do?
replace PR:#4629
This PR adds a pipeline that accepts unlimited-length prompt and negative prompt strings, compatible with the A1111 prompt weighting format.
Fixes: #4559
Before submitting
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
@sayakpaul I recreated a completely new fork and added the new code in this PR; hopefully no additional commits come along. I will update the document and provide sample code once this PR is done. Thanks