Describe the bug
I am trying to run train_dreambooth_lora_flux.py on my own dataset, but the error below occurs whenever --with_prior_preservation is enabled.
Who can help me? Thanks!
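For context, my rough understanding (an assumption on my part, not verified line-by-line against the script) is that prior preservation concatenates the instance and class examples along the batch dimension before the transformer forward pass, roughly as in the sketch below, which is where I suspect a sequence-length mismatch could creep in:

import torch

# Rough sketch of the usual DreamBooth prior-preservation batching.
# This is my assumption; the names are illustrative and do not mirror the script exactly.
def collate_with_prior_preservation(instance_example, class_example):
    pixel_values = torch.cat([instance_example["pixel_values"], class_example["pixel_values"]], dim=0)
    prompt_embeds = torch.cat([instance_example["prompt_embeds"], class_example["prompt_embeds"]], dim=0)
    return {"pixel_values": pixel_values, "prompt_embeds": prompt_embeds}

# The model prediction is later split back into an instance half and a class half,
# and the class half contributes the weighted prior-preservation loss.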
Reproduction
python ./examples/dreambooth/train_dreambooth_lora_flux.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation \
  --class_data_dir="my_file" \
  --class_prompt="A photo" \
  --instance_prompt="A sks photo" \
  --resolution=1024 \
  --rank=32 \
  --max_train_steps=5000 \
  --checkpointing_steps=100 \
  --seed="0" \
  --mixed_precision="bf16" \
  --train_batch_size=1 \
  --guidance_scale=1 \
  --gradient_accumulation_steps=4 \
  --optimizer="prodigy" \
  --learning_rate=1. \
  --report_to="tensorboard" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0
Logs
Traceback (most recent call last):
File "/data4/work/yinguowei/code/diffusers/./examples/dreambooth/train_dreambooth_lora_flux.py", line 1926, in <module>
main(args)
File "/data4/work/yinguowei/code/diffusers/./examples/dreambooth/train_dreambooth_lora_flux.py", line 1720, in main
model_pred = transformer(
File "/data/miniconda3/envs/diffusers_ygw/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data/miniconda3/envs/diffusers_ygw/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/data/miniconda3/envs/diffusers_ygw/lib/python3.10/site-packages/accelerate/utils/operations.py", line 819, in forward
return model_forward(*args, **kwargs)
File "/data/miniconda3/envs/diffusers_ygw/lib/python3.10/site-packages/accelerate/utils/operations.py", line 807, in __call__
return convert_to_fp32(self.model_forward(*args, **kwargs))
File "/data/miniconda3/envs/diffusers_ygw/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 44, in decorate_autocast
return func(*args, **kwargs)
File "/data4/work/yinguowei/code/diffusers/src/diffusers/models/transformers/transformer_flux.py", line 529, in forward
encoder_hidden_states, hidden_states = block(
File "/data/miniconda3/envs/diffusers_ygw/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data/miniconda3/envs/diffusers_ygw/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/data4/work/yinguowei/code/diffusers/src/diffusers/models/transformers/transformer_flux.py", line 188, in forward
attention_outputs = self.attn(
File "/data/miniconda3/envs/diffusers_ygw/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data/miniconda3/envs/diffusers_ygw/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/data4/work/yinguowei/code/diffusers/src/diffusers/models/attention_processor.py", line 595, in forward
return self.processor(
File "/data4/work/yinguowei/code/diffusers/src/diffusers/models/attention_processor.py", line 2325, in __call__
query = apply_rotary_emb(query, image_rotary_emb)
File "/data4/work/yinguowei/code/diffusers/src/diffusers/models/embeddings.py", line 1204, in apply_rotary_emb
out = (x.float() * cos + x_rotated.float() * sin).to(x.dtype)
RuntimeError: The size of tensor a (4608) must match the size of tensor b (5120) at non-singleton dimension 2
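For what it's worth, here is a rough back-of-the-envelope for where the two sizes could come from at resolution 1024 with the script's default max_sequence_length of 512 (this is my guess at the mismatch, not a confirmed diagnosis):

# Sketch of the expected sequence lengths (my assumption about the numbers in the error).
resolution = 1024
vae_scale_factor = 8          # 1024x1024 pixels -> 128x128 latent
patch_size = 2                # 2x2 packing -> 64x64 latent tokens
image_tokens = (resolution // vae_scale_factor // patch_size) ** 2   # 4096
text_tokens = 512             # default --max_sequence_length for the T5 prompt

print(image_tokens + text_tokens)      # 4608 -> matches "tensor a" (the query length)
print(image_tokens + 2 * text_tokens)  # 5120 -> matches "tensor b" (the rotary embedding length),
                                       # as if the text positions were counted twice under prior preservation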
System Info
- 🤗 Diffusers version: 0.33.0.dev0
- Platform: Linux-5.4.119-19.0009.44-x86_64-with-glibc2.28
- Running on Google Colab?: No
- Python version: 3.10.16
- PyTorch version (GPU?): 2.5.1+cu124 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.27.1
- Transformers version: 4.48.1
- Accelerate version: 1.3.0
- PEFT version: 0.14.0
- Bitsandbytes version: not installed
- Safetensors version: 0.5.2
- xFormers version: not installed
- Accelerator: 8x NVIDIA A800-SXM4-80GB, 81920 MiB each
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
No response