Support dynamically loading/unloading loras with group offloading #11804

Merged: 11 commits merged into main on Jun 27, 2025

Conversation

@a-r-r-o-w (Member) commented on Jun 25, 2025

Fixes #11791.

Reproducer:
import torch
from diffusers import HunyuanVideoPipeline, HunyuanVideoTransformer3DModel
from diffusers.utils import export_to_video
from diffusers.hooks import apply_group_offloading
from diffusers.utils.logging import set_verbosity_debug

set_verbosity_debug()


model_id = "hunyuanvideo-community/HunyuanVideo"
transformer = HunyuanVideoTransformer3DModel.from_pretrained(
    model_id, subfolder="transformer", torch_dtype=torch.bfloat16
)
pipe = HunyuanVideoPipeline.from_pretrained(model_id, transformer=transformer, torch_dtype=torch.float16)
pipe.vae.enable_tiling()
onload_device = torch.device("cuda")
offload_device = torch.device("cpu")
pipe.load_lora_weights("a-r-r-o-w/HunyuanVideo-tuxemons")
num_inference_steps = 20

# Apply leaf-level group offloading (with a CUDA stream for prefetching) to the transformer only.
apply_group_offloading(pipe.transformer, onload_device, offload_device, offload_type="leaf_level", use_stream=True)
# Keep the remaining components fully on the GPU.
for module in (pipe.text_encoder, pipe.text_encoder_2, pipe.vae):
    module.to(onload_device)

output = pipe(
    prompt="Style of snomexut, a cat-like Tuxemon creature walks in alien-world grass, and observes its surroundings.",
    height=768,
    width=768,
    num_frames=33,
    num_inference_steps=num_inference_steps,
    generator=torch.Generator().manual_seed(73),
).frames[0]
export_to_video(output, "output.mp4", fps=15)

# Unload the LoRA and run the same prompt again without it.
pipe.unload_lora_weights()

output_1 = pipe(
    prompt="Style of snomexut, a cat-like Tuxemon creature walks in alien-world grass, and observes its surroundings.",
    height=768,
    width=768,
    num_frames=33,
    num_inference_steps=num_inference_steps,
    generator=torch.Generator().manual_seed(73),
).frames[0]
export_to_video(output_1, "output2.mp4", fps=15)

# Reload the LoRA with group offloading already applied; this is the case that previously failed (#11791).
pipe.load_lora_weights("a-r-r-o-w/HunyuanVideo-tuxemons")

output_2 = pipe(
    prompt="Style of snomexut, a cat-like Tuxemon creature walks in alien-world grass, and observes its surroundings.",
    height=768,
    width=768,
    num_frames=33,
    num_inference_steps=num_inference_steps,
    generator=torch.Generator().manual_seed(73),
).frames[0]
export_to_video(output_2, "output3.mp4", fps=15)

@a-r-r-o-w requested a review from DN6 on June 25, 2025 at 08:01
@a-r-r-o-w (Member Author)

cc @zhangvia I think this should fix the issues you were facing. Could you test? Thanks 🤗

@sayakpaul (Member) left a comment

Thanks! Just some minor comments.

And maybe we could include disabling and enabling group offloading in the existing _func_optionally_*() function. But no strong opinions.

@a-r-r-o-w requested a review from sayakpaul on June 25, 2025 at 08:48
@a-r-r-o-w (Member Author)

And maybe we could include disabling and enabling group offloading in the existing _func_optionally_*() function. But no strong opinions.

Yep, done!
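
For reference, a minimal, self-contained sketch of the idea being discussed: temporarily lift group offloading from a module while its weights are modified (for example during a LoRA load/unload), then re-apply it afterwards. The helper names and the attribute used to track state below are illustrative placeholders, not the actual diffusers implementation.

from typing import Callable

def _group_offloading_enabled(module) -> bool:
    # Placeholder check: is a group-offloading hook attached to this module?
    return getattr(module, "_group_offloading_enabled", False)

def _remove_group_offloading(module) -> None:
    # Placeholder: detach group-offloading hooks so the weights can be modified safely.
    module._group_offloading_enabled = False

def run_with_offloading_optionally_disabled(module, fn: Callable[[], None], reapply: Callable[[], None]) -> None:
    """Run `fn` (e.g. a LoRA load/unload) with group offloading disabled, then restore it."""
    was_enabled = _group_offloading_enabled(module)
    if was_enabled:
        _remove_group_offloading(module)
    try:
        fn()
    finally:
        if was_enabled:
            reapply()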

@sayakpaul (Member) left a comment

Fantastic!

@@ -268,9 +288,12 @@ class GroupOffloadingHook(ModelHook):

     _is_stateful = False

-    def __init__(self, group: ModuleGroup, next_group: Optional[ModuleGroup] = None) -> None:
+    def __init__(
+        self, group: ModuleGroup, next_group: Optional[ModuleGroup] = None, *, config: GroupOffloadingConfig
Member

❤️
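
As an aside, the new keyword-only config argument presumably records how offloading was applied so that it can be re-applied later (for example after a LoRA load). A rough sketch of what such a config object could hold; the field names are assumptions, not the actual diffusers GroupOffloadingConfig.

from dataclasses import dataclass
import torch

@dataclass
class GroupOffloadingConfigSketch:
    # Illustrative only; not the diffusers definition.
    onload_device: torch.device
    offload_device: torch.device
    offload_type: str = "block_level"   # or "leaf_level"
    use_stream: bool = False
    non_blocking: bool = False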

     else:
-        raise ValueError(f"Unsupported offload_type: {offload_type}")
+        assert False
Member

Should this be turned into a sensible value error like previous? 👁️

@a-r-r-o-w (Member Author), Jun 25, 2025

The statement cannot be reached because we instantiate GroupOffloadingType with the value passed by the user; the dataclass and enum will raise an error if the value is invalid.

assert False is just a placeholder.

@zhangvia

cc @zhangvia I think this should fix the issues you were facing. Could you test? Thanks 🤗

Thanks for the quick fix! I've confirmed it.

@DN6 (Collaborator) left a comment

Nice 👍🏽

@a-r-r-o-w added the roadmap (Add to current release roadmap) label on Jun 27, 2025
@a-r-r-o-w (Member Author)

A bit of a problem with the tests that wasn't caught until now...

Group offloading with streams is limited in what it can do. If the same layer is invoked twice in the same parent layer's forward, the prefetching logic becomes completely incorrect. This is the case with:

  • CogView4 - invokes the same MLP on both hidden_states and encoder_hidden_states in succession
  • CogVideoX - invokes the same layernorm on both hidden_states and encoder_hidden_states in succession

One option to make it work is to concatenate inputs across sequence dim and then split again. This will, however, incur some extra perf cost because concatenation/split is not free.

Another option is creating a separate layer for each data stream and sharing the weights. I don't think this should incur memory overhead since the data reference will be same for both layers, but need to test to be sure.

Any other ideas are welcome. If these don't sound good, I propose we skip the tests for now and wait for the group offloading logic to become more mature/stable. Other than that, the PR looks good to merge to me.
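
For illustration, a rough standalone sketch of the first option (concatenate along the sequence dimension, call the shared layer once, then split back). This is not the actual diffusers change; it works for token-wise layers such as a LayerNorm or MLP because they treat each position independently.

import torch
import torch.nn as nn

norm = nn.LayerNorm(64)
hidden_states = torch.randn(2, 17, 64)          # e.g. video tokens
encoder_hidden_states = torch.randn(2, 9, 64)   # e.g. text tokens

# Problematic pattern for stream-based prefetching: the same layer invoked twice in one forward.
#   hidden_states = norm(hidden_states)
#   encoder_hidden_states = norm(encoder_hidden_states)

# Workaround: concatenate along the sequence dim, invoke the layer once, then split back.
seq_len = hidden_states.shape[1]
combined = norm(torch.cat([hidden_states, encoder_hidden_states], dim=1))
hidden_states, encoder_hidden_states = combined[:, :seq_len], combined[:, seq_len:]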

@sayakpaul (Member)

Thanks for investigating these and also for proposing the potential solutions.

On the surface, I would say we evaluate both approaches and then decide. However, the two models you mentioned probably have limited usage at least with group offloading for now. So, xfailing them is the more reasonable approach. WDYT?

@a-r-r-o-w (Member Author)

I tested the first approach as it's a super simple change. The performance penalty is not noticeable end-to-end; it only shows up at the microsecond scale. I don't think it really matters because, like you mentioned, these models probably have very limited usage in the context of group offloading.

For now, I'll specialize the tests for CogView4/CogVideoX by parameterizing with only non-stream tests instead of xfailing them, and add a note. Sounds good?

@sayakpaul (Member)

For now, I'll specialize the tests for CogView4/CogVideoX by parameterizing with only non-stream tests instead of xfailing them, and add a note. Sounds good?

I am good!

@a-r-r-o-w (Member Author)

Updated the tests. There are some gymnastics involved in skipping tests marked with parameterized because it seems they can't be overridden or specialized in child classes.

Comment on lines +132 to +137
@parameterized.expand([("block_level", True), ("leaf_level", False)])
@require_torch_accelerator
def test_group_offloading_inference_denoiser(self, offload_type, use_stream):
# TODO: We don't run the (leaf_level, True) test here that is enabled for other models.
# The reason for this can be found here: https://github.com/huggingface/diffusers/pull/11804#issuecomment-3013325338
super()._test_group_offloading_inference_denoiser(offload_type, use_stream)
Member

We can do this, or we could detect if the test class is either CogView4 or CogVideoX and use pytest.skip(). Up to you.

Member Author

I prefer the base test class not having any information about the model test classes that derive from it. The current implementation will work for any model that overrides the test, so it's also much cleaner.

@a-r-r-o-w merged commit 76ec3d1 into main on Jun 27, 2025
32 checks passed
The github-project-automation bot moved this from In Progress to Done in Diffusers Roadmap 0.35 on Jun 27, 2025
@a-r-r-o-w deleted the support-dynamic-lora-with-go branch on June 27, 2025 at 17:50
tolgacangoz pushed a commit to tolgacangoz/diffusers that referenced this pull request Jul 5, 2025
…ggingface#11804)

* update

* add test

* address review comments

* update

* fixes

* change decorator order to fix tests

* try fix

* fight tests
Labels
roadmap Add to current release roadmap
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

can't load lora weights after apply group offloading
5 participants