`docs/source/en/using-diffusers/ip_adapter.md` (+98 −87)
@@ -25,6 +25,9 @@ Let's take a look at how to use IP-Adapter's image prompting capabilities with t
 In all the following examples, you'll see the [`~loaders.IPAdapterMixin.set_ip_adapter_scale`] method. This method controls the amount of text or image conditioning to apply to the model. A value of `1.0` means the model is only conditioned on the image prompt. Lowering this value encourages the model to produce more diverse images, but they may not be as aligned with the image prompt. Typically, a value of `0.5` achieves a good balance between the two prompt types and produces good results.
+
+> [!TIP]
+> In the examples below, try adding `low_cpu_mem_usage=True` to the [`~loaders.IPAdapterMixin.load_ip_adapter`] method to speed up the loading time.
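To see both knobs together, here is a minimal sketch of loading an IP-Adapter with `low_cpu_mem_usage=True` and setting the scale. The SDXL base model and the `h94/IP-Adapter` weight name are illustrative assumptions, not part of this diff.

```py
# A minimal sketch: load an IP-Adapter quickly and balance text vs. image
# conditioning. Model and weight names are assumed, not taken from this diff.
import torch
from diffusers import AutoPipelineForText2Image

pipeline = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="sdxl_models",
    weight_name="ip-adapter_sdxl.bin",
    low_cpu_mem_usage=True,  # speeds up adapter loading
)
pipeline.set_ip_adapter_scale(0.5)  # 1.0 = image-only conditioning; 0.5 balances both
```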
@@ … @@ There are a couple of IP-Adapter parameters that are useful to know about and can help you with your image generation tasks. These parameters can make your workflow more efficient or give you more control over image generation.
+
+### Image embeddings
+
+IP-Adapter enabled pipelines provide the `ip_adapter_image_embeds` parameter to accept precomputed image embeddings. This is particularly useful in scenarios where you need to run the IP-Adapter pipeline multiple times because you have more than one image. For example, [multi IP-Adapter](#multi-ip-adapter) is a specific use case where you provide multiple styling images to generate a specific image in a specific style. Loading and encoding multiple images each time you use the pipeline would be inefficient. Instead, you can precompute and save the image embeddings to disk (which can save a lot of space if you're using high-quality images) and load them when you need them.
+
 > [!TIP]
-> While calling `load_ip_adapter()`, pass `low_cpu_mem_usage=True` to speed up the loading time.
+> This parameter also gives you the flexibility to load embeddings from other sources. For example, ComfyUI image embeddings for IP-Adapters are compatible with Diffusers and should work out-of-the-box!
+
+Call the [`~StableDiffusionPipeline.prepare_ip_adapter_image_embeds`] method to encode and generate the image embeddings. Then you can save them to disk with `torch.save`.
 
-All the pipelines supporting IP-Adapter accept a `ip_adapter_image_embeds` argument. If you need to run the IP-Adapter multiple times with the same image, you can encode the image once and save the embedding to the disk.
+> [!TIP]
+> If you're using IP-Adapter with `ip_adapter_image_embeds` instead of `ip_adapter_image`, you can set `load_ip_adapter(image_encoder_folder=None, ...)` because you don't need to load an encoder to generate the image embeddings.
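As a sketch of the workflow the new text describes, the following precomputes embeddings with `prepare_ip_adapter_image_embeds` and saves them with `torch.save`. It assumes the `pipeline` from the sketch above (with its IP-Adapter and image encoder loaded) and a placeholder input image.

```py
# A minimal sketch, assuming `pipeline` from the sketch above; the image
# file name is a placeholder.
import torch
from diffusers.utils import load_image

image = load_image("ip_adapter_input.png")

image_embeds = pipeline.prepare_ip_adapter_image_embeds(
    ip_adapter_image=image,
    ip_adapter_image_embeds=None,
    device="cuda",
    num_images_per_prompt=1,
    do_classifier_free_guidance=True,  # must match the guidance setting you'll generate with
)
torch.save(image_embeds, "image_embeds.ipadpt")
```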
-Load the image embedding and pass it to the pipeline as `ip_adapter_image_embeds`
-
-> [!TIP]
-> ComfyUI image embeddings for IP-Adapters are fully compatible in Diffusers and should work out-of-box.
+Now load the image embeddings by passing them to the `ip_adapter_image_embeds` parameter.
 
 ```py
 image_embeds = torch.load("image_embeds.ipadpt")
@@ -264,8 +275,86 @@ images = pipeline(
 ).images
 ```
 
-> [!TIP]
-> If you use IP-Adapter with `ip_adapter_image_embedding` instead of `ip_adapter_image`, you can choose not to load an image encoder by passing `image_encoder_folder=None` to `load_ip_adapter()`.
+### IP-Adapter masking
+
+Binary masks specify which portion of the output image should be assigned to an IP-Adapter. This is useful for composing more than one IP-Adapter image. For each input IP-Adapter image, you must provide a binary mask and an IP-Adapter.
+
+To start, preprocess the input IP-Adapter images with the [`~image_processor.IPAdapterMaskProcessor.preprocess()`] method to generate their masks. For optimal results, provide the output height and width to [`~image_processor.IPAdapterMaskProcessor.preprocess()`]. This ensures masks with different aspect ratios are appropriately stretched. If the input masks already match the aspect ratio of the generated image, you don't have to set the `height` and `width`.
+
+```py
+from diffusers.image_processor import IPAdapterMaskProcessor
+# …
+```
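Since the example above is truncated in the diff view, here is a minimal sketch of the preprocessing step under stated assumptions: two placeholder mask files and a 1024x1024 output resolution.

```py
# A minimal sketch; mask file names are placeholders, and the output
# resolution is assumed to be 1024x1024.
from diffusers.image_processor import IPAdapterMaskProcessor
from diffusers.utils import load_image

mask1 = load_image("mask1.png")
mask2 = load_image("mask2.png")

processor = IPAdapterMaskProcessor()
# Passing height/width stretches masks with different aspect ratios
# to match the generated image.
masks = processor.preprocess([mask1, mask2], height=1024, width=1024)
```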
When there is more than one input IP-Adapter image, load them as a list to ensure each image is assigned to a different IP-Adapter. Each input IP-Adapter image here corresponds to one of the masks generated above, as the sketch below illustrates.
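A sketch of the full call, continuing from `pipeline` and `masks` above; the adapter weight names, face images, and prompt are illustrative assumptions rather than the exact example from the docs.

```py
# A minimal sketch, continuing from `pipeline` and `masks` above; the image
# files and the prompt are placeholders.
from diffusers.utils import load_image

face_image1 = load_image("face1.png")
face_image2 = load_image("face2.png")

# Load one IP-Adapter per input image so each image/mask pair gets its own adapter.
pipeline.load_ip_adapter(
    "h94/IP-Adapter", subfolder="sdxl_models", weight_name=["ip-adapter_sdxl.bin"] * 2
)
pipeline.set_ip_adapter_scale([0.6] * 2)

# Nested lists: the i-th inner list of images is paired with the i-th mask.
ip_images = [[face_image1], [face_image2]]
masks = [masks[0:1], masks[1:2]]

image = pipeline(
    prompt="2 girls",
    ip_adapter_image=ip_images,
    cross_attention_kwargs={"ip_adapter_masks": masks},
    num_inference_steps=20,
).images[0]
```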
@@ -279,6 +368,7 @@ Generating accurate faces is challenging because they are complex and nuanced. D
 * [ip-adapter-plus-face_sd15.safetensors](https://huggingface.co/h94/IP-Adapter/blob/main/models/ip-adapter-plus-face_sd15.safetensors) uses patch embeddings and is conditioned with images of cropped faces
 
 > [!TIP]
+>
 > [IP-Adapter-FaceID](https://huggingface.co/h94/IP-Adapter-FaceID) is a face-specific IP-Adapter trained with face ID embeddings instead of CLIP image embeddings, allowing you to generate more consistent faces in different contexts and styles. Try out this popular [community pipeline](https://github.com/huggingface/diffusers/tree/main/examples/community#ip-adapter-face-id) and see how it compares to the other face IP-Adapters.
 
 For face models, use the [h94/IP-Adapter](https://huggingface.co/h94/IP-Adapter) checkpoint. It is also recommended to use [`DDIMScheduler`] or [`EulerDiscreteScheduler`] for face models.
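A minimal sketch of that recommendation, assuming a Stable Diffusion v1.5 base model and the full-face checkpoint from the h94/IP-Adapter repository.

```py
# A minimal sketch, assuming a Stable Diffusion v1.5 base model; the
# face-model weight name comes from the h94/IP-Adapter repository.
import torch
from diffusers import AutoPipelineForText2Image, DDIMScheduler

pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
# DDIMScheduler (or EulerDiscreteScheduler) is recommended for face models.
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)
pipeline.load_ip_adapter(
    "h94/IP-Adapter", subfolder="models", weight_name="ip-adapter-full-face_sd15.safetensors"
)
pipeline.set_ip_adapter_scale(0.5)
```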
@@ … @@
-Binary masks can be used to specify which portion of the output image should be assigned to an IP-Adapter.
-For each input IP-Adapter image, a binary mask and an IP-Adapter must be provided.
-
-Before passing the masks to the pipeline, it's essential to preprocess them using [`IPAdapterMaskProcessor.preprocess()`].
-
-> [!TIP]
-> For optimal results, provide the output height and width to [`IPAdapterMaskProcessor.preprocess()`]. This ensures that masks with differing aspect ratios are appropriately stretched. If the input masks already match the aspect ratio of the generated image, specifying height and width can be omitted.
-
-Here an example with two masks:
-
-```py
-from diffusers.image_processor import IPAdapterMaskProcessor
`src/diffusers/loaders/ip_adapter.py` (+1 −1)

@@ -215,7 +215,7 @@ def load_ip_adapter(
         else:
             logger.warning(
                 "image_encoder is not loaded since `image_encoder_folder=None` passed. You will not be able to use `ip_adapter_image` when calling the pipeline with IP-Adapter."
-                "Use `ip_adapter_image_embedding` to pass pre-geneated image embedding instead."
+                "Use `ip_adapter_image_embeds` to pass pre-generated image embedding instead."
             )
 
         # create feature extractor if it has not been registered to the pipeline yet