Merge changes #211

Merged: 65 commits, May 12, 2025

Changes from 1 commit
bd96a08
[train_dreambooth_lora.py] Set LANCZOS as default interpolation mode …
merterbak Apr 26, 2025
aa5f5d4
[tests] add tests to check for graph breaks, recompilation, cuda sync…
sayakpaul Apr 28, 2025
9ce89e2
enable group_offload cases and quanto cases on XPU (#11405)
yao-matrix Apr 28, 2025
a7e9f85
enable test_layerwise_casting_memory cases on XPU (#11406)
yao-matrix Apr 28, 2025
0e3f271
[tests] fix import. (#11434)
sayakpaul Apr 28, 2025
b3b04fe
[train_text_to_image] Better image interpolation in training scripts …
tongyu0924 Apr 28, 2025
3da98e7
[train_text_to_image_lora] Better image interpolation in training scr…
tongyu0924 Apr 28, 2025
7567adf
enable 28 GGUF test cases on XPU (#11404)
yao-matrix Apr 28, 2025
0ac1d5b
[Hi-Dream LoRA] fix bug in validation (#11439)
linoytsaban Apr 28, 2025
4a9ab65
Fixing missing provider options argument (#11397)
urpetkov-amd Apr 28, 2025
58431f1
Set LANCZOS as the default interpolation for image resizing in Contro…
YoulunPeng Apr 29, 2025
8fe5a14
Raise warning instead of error for block offloading with streams (#11…
a-r-r-o-w Apr 30, 2025
60892c5
enable marigold_intrinsics cases on XPU (#11445)
yao-matrix Apr 30, 2025
c865115
`torch.compile` fullgraph compatibility for Hunyuan Video (#11457)
a-r-r-o-w Apr 30, 2025
fbe2fe5
enable consistency test cases on XPU, all passed (#11446)
yao-matrix Apr 30, 2025
35fada4
enable unidiffuser test cases on xpu (#11444)
yao-matrix Apr 30, 2025
fbce7ae
Add generic support for Intel Gaudi accelerator (hpu device) (#11328)
dsocek Apr 30, 2025
8cd7426
Add StableDiffusion3InstructPix2PixPipeline (#11378)
xduzhangjiayu Apr 30, 2025
23c9802
make safe diffusion test cases pass on XPU and A100 (#11458)
yao-matrix Apr 30, 2025
38ced7e
[test_models_transformer_hunyuan_video] help us test torch.compile() …
tongyu0924 Apr 30, 2025
daf0a23
Add LANCZOS as default interpolation mode. (#11463)
Va16hav07 Apr 30, 2025
06beeca
make autoencoders, controlnet_flux and wan_transformer3d_single_file …
yao-matrix Apr 30, 2025
d70f8ee
[WAN] fix recompilation issues (#11475)
sayakpaul May 1, 2025
86294d3
Fix typos in docs and comments (#11416)
co63oc May 1, 2025
5dcdf4a
[tests] xfail recent pipeline tests for specific methods. (#11469)
sayakpaul May 1, 2025
d0c0239
cache packages_distributions (#11453)
vladmandic May 1, 2025
b848d47
[docs] Memory optims (#11385)
stevhliu May 1, 2025
e23705e
[docs] Adapters (#11331)
stevhliu May 2, 2025
ed6cf52
[train_dreambooth_lora_sdxl_advanced] Add LANCZOS as the default inte…
yuanjua May 2, 2025
ec3d582
[train_dreambooth_lora_flux_advanced] Add LANCZOS as the default inte…
ysurs May 2, 2025
a674914
enable semantic diffusion and stable diffusion panorama cases on XPU …
yao-matrix May 5, 2025
8520d49
[Feature] Implement tiled VAE encoding/decoding for Wan model. (#11414)
c8ef May 5, 2025
fc5e906
[train_text_to_image_sdxl]Add LANCZOS as default interpolation mode f…
ParagEkbote May 5, 2025
ec93239
[train_dreambooth_lora_sdxl] Add --image_interpolation_mode option fo…
MinJu-Ha May 5, 2025
ee1516e
[train_dreambooth_lora_lumina2] Add LANCZOS as the default interpolat…
cjfghk5697 May 5, 2025
071807c
[training] feat: enable quantization for hidream lora training. (#11494)
sayakpaul May 5, 2025
9c29e93
Set LANCZOS as the default interpolation method for image resizing. (…
yijun-lee May 5, 2025
ed4efbd
Update training script for txt to img sdxl with lora supp with new in…
RogerSinghChugh May 5, 2025
1fa5639
Fix torchao docs typo for fp8 granular quantization (#11473)
a-r-r-o-w May 6, 2025
53f1043
Update setup.py to pin min version of `peft` (#11502)
sayakpaul May 6, 2025
d88ae1f
update dep table. (#11504)
sayakpaul May 6, 2025
10bee52
[LoRA] use `removeprefix` to preserve sanity. (#11493)
sayakpaul May 6, 2025
d7ffe60
Hunyuan Video Framepack (#11428)
a-r-r-o-w May 6, 2025
8c661ea
enable lora cases on XPU (#11506)
yao-matrix May 6, 2025
7937166
[lora_conversion] Enhance key handling for OneTrainer components in L…
iamwavecut May 6, 2025
fb29132
[docs] minor updates to bitsandbytes docs. (#11509)
sayakpaul May 6, 2025
7b90494
Cosmos (#10660)
a-r-r-o-w May 7, 2025
53bd367
clean up the __init__ for stable_diffusion (#11500)
yiyixuxu May 7, 2025
87e508f
fix audioldm
sayakpaul May 8, 2025
c5c34a4
Revert "fix audioldm"
sayakpaul May 8, 2025
66e50d4
[LoRA] make lora alpha and dropout configurable (#11467)
linoytsaban May 8, 2025
784db0e
Add cross attention type for Sana-Sprint training in diffusers. (#11514)
scxue May 8, 2025
6674a51
Conditionally import torchvision in Cosmos transformer (#11524)
a-r-r-o-w May 8, 2025
393aefc
[tests] fix audioldm2 for transformers main. (#11522)
sayakpaul May 8, 2025
599c887
feat: pipeline-level quantization config (#11130)
sayakpaul May 9, 2025
7acf834
[Tests] Enable more general testing for `torch.compile()` with LoRA h…
sayakpaul May 9, 2025
0c47c95
[LoRA] support non-diffusers hidream loras (#11532)
sayakpaul May 9, 2025
2d38089
enable 7 cases on XPU (#11503)
yao-matrix May 9, 2025
3c0a012
[LTXPipeline] Update latents dtype to match VAE dtype (#11533)
james-p-xu May 9, 2025
d6bf268
enable dit integration cases on xpu (#11523)
yao-matrix May 9, 2025
0ba1f76
enable print_env on xpu (#11507)
yao-matrix May 9, 2025
92fe689
Change Framepack transformer layer initialization order (#11535)
a-r-r-o-w May 9, 2025
01abfc8
[tests] add tests for framepack transformer model. (#11520)
sayakpaul May 11, 2025
e48f6ae
Hunyuan Video Framepack F1 (#11534)
a-r-r-o-w May 12, 2025
c372615
enable several pipeline integration tests on XPU (#11526)
yao-matrix May 12, 2025
enable semantic diffusion and stable diffusion panorama cases on XPU (huggingface#11459)

Signed-off-by: Yao Matrix <[email protected]>
yao-matrix authored May 5, 2025
commit a674914fd5f45ef7bcec71061aa2fb315ceb3495
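The change in this commit is uniform: CUDA-only calls (`torch.cuda.empty_cache`, the `require_torch_gpu` decorator, the `torch.cuda` memory counters) are swapped for device-dispatching helpers from `diffusers.utils.testing_utils`, so the same tests run on both CUDA and Intel XPU. As a rough illustration of the dispatch idea (a minimal sketch under assumed names, not diffusers' actual implementation):

```python
# Minimal sketch of the device-dispatch idea behind helpers such as
# backend_empty_cache(torch_device). Names ending in _sketch are
# hypothetical; the real diffusers helpers may be structured differently.
import torch


def backend_empty_cache_sketch(device: str) -> None:
    """Release cached allocator memory on whichever accelerator `device` names."""
    if device.startswith("cuda"):
        torch.cuda.empty_cache()
    elif device.startswith("xpu") and hasattr(torch, "xpu"):
        # Intel XPU; requires a PyTorch build with XPU support.
        torch.xpu.empty_cache()
    # On CPU there is no allocator cache to clear.
```

With this shape, test code only ever mentions `torch_device`, and supporting a new accelerator means extending the helper rather than touching every test.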
File 1: the SemanticStableDiffusionPipeline tests

@@ -25,11 +25,11 @@
 from diffusers import AutoencoderKL, DDIMScheduler, LMSDiscreteScheduler, PNDMScheduler, UNet2DConditionModel
 from diffusers.pipelines.semantic_stable_diffusion import SemanticStableDiffusionPipeline as StableDiffusionPipeline
 from diffusers.utils.testing_utils import (
+    backend_empty_cache,
     enable_full_determinism,
     floats_tensor,
     nightly,
-    require_accelerator,
-    require_torch_gpu,
+    require_torch_accelerator,
     torch_device,
 )
@@ -42,13 +42,13 @@ def setUp(self):
         # clean up the VRAM before each test
         super().setUp()
         gc.collect()
-        torch.cuda.empty_cache()
+        backend_empty_cache(torch_device)
 
     def tearDown(self):
         # clean up the VRAM after each test
         super().tearDown()
         gc.collect()
-        torch.cuda.empty_cache()
+        backend_empty_cache(torch_device)
 
     @property
     def dummy_image(self):
@@ -238,7 +238,7 @@ def test_semantic_diffusion_no_safety_checker(self):
         image = pipe("example prompt", num_inference_steps=2).images[0]
         assert image is not None
 
-    @require_accelerator
+    @require_torch_accelerator
     def test_semantic_diffusion_fp16(self):
         """Test that stable diffusion works with fp16"""
         unet = self.dummy_cond_unet
@@ -272,22 +272,21 @@ def test_semantic_diffusion_fp16(self):
 
 
 @nightly
-@require_torch_gpu
+@require_torch_accelerator
 class SemanticDiffusionPipelineIntegrationTests(unittest.TestCase):
     def setUp(self):
         # clean up the VRAM before each test
         super().setUp()
         gc.collect()
-        torch.cuda.empty_cache()
+        backend_empty_cache(torch_device)
 
     def tearDown(self):
         # clean up the VRAM after each test
         super().tearDown()
         gc.collect()
-        torch.cuda.empty_cache()
+        backend_empty_cache(torch_device)
 
     def test_positive_guidance(self):
-        torch_device = "cuda"
         pipe = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")
         pipe = pipe.to(torch_device)
         pipe.set_progress_bar_config(disable=None)
@@ -370,7 +369,6 @@ def test_positive_guidance(self):
         assert np.abs(image_slice.flatten() - expected_slice).max() < 1e-2
 
     def test_negative_guidance(self):
-        torch_device = "cuda"
         pipe = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")
         pipe = pipe.to(torch_device)
         pipe.set_progress_bar_config(disable=None)
@@ -453,7 +451,6 @@ def test_negative_guidance(self):
         assert np.abs(image_slice.flatten() - expected_slice).max() < 1e-2
 
     def test_multi_cond_guidance(self):
-        torch_device = "cuda"
         pipe = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")
         pipe = pipe.to(torch_device)
         pipe.set_progress_bar_config(disable=None)
@@ -536,7 +533,6 @@ def test_multi_cond_guidance(self):
         assert np.abs(image_slice.flatten() - expected_slice).max() < 1e-2
 
     def test_guidance_fp16(self):
-        torch_device = "cuda"
         pipe = StableDiffusionPipeline.from_pretrained(
             "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
         )
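Two patterns recur in this file. First, the hard-coded `torch_device = "cuda"` lines are deleted because `torch_device` imported from `diffusers.utils.testing_utils` already resolves to whichever backend is available; shadowing it with "cuda" would force CUDA even on XPU machines. Second, `require_torch_gpu` becomes `require_torch_accelerator`. A hypothetical version of such a skip decorator (illustrative only, not the library's exact code) could look like:

```python
# Hypothetical sketch of an accelerator-gating test decorator in the spirit
# of require_torch_accelerator; diffusers' real helper covers more backends.
import unittest

import torch


def _accelerator_available() -> bool:
    # True for CUDA, or for Intel XPU when the build exposes torch.xpu.
    if torch.cuda.is_available():
        return True
    return hasattr(torch, "xpu") and torch.xpu.is_available()


def require_torch_accelerator_sketch(test_case):
    """Skip the decorated test unless some PyTorch accelerator is present."""
    return unittest.skipUnless(
        _accelerator_available(), "test requires a PyTorch accelerator backend"
    )(test_case)
```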
File 2: the StableDiffusionPanoramaPipeline tests

@@ -29,7 +29,17 @@
     StableDiffusionPanoramaPipeline,
     UNet2DConditionModel,
 )
-from diffusers.utils.testing_utils import enable_full_determinism, nightly, require_torch_gpu, skip_mps, torch_device
+from diffusers.utils.testing_utils import (
+    backend_empty_cache,
+    backend_max_memory_allocated,
+    backend_reset_max_memory_allocated,
+    backend_reset_peak_memory_stats,
+    enable_full_determinism,
+    nightly,
+    require_torch_accelerator,
+    skip_mps,
+    torch_device,
+)
 
 from ..pipeline_params import TEXT_TO_IMAGE_BATCH_PARAMS, TEXT_TO_IMAGE_IMAGE_PARAMS, TEXT_TO_IMAGE_PARAMS
 from ..test_pipelines_common import (
@@ -267,17 +277,17 @@ def test_encode_prompt_works_in_isolation(self):
 
 
 @nightly
-@require_torch_gpu
+@require_torch_accelerator
 class StableDiffusionPanoramaNightlyTests(unittest.TestCase):
     def setUp(self):
         super().setUp()
         gc.collect()
-        torch.cuda.empty_cache()
+        backend_empty_cache(torch_device)
 
     def tearDown(self):
         super().tearDown()
         gc.collect()
-        torch.cuda.empty_cache()
+        backend_empty_cache(torch_device)
 
     def get_inputs(self, seed=0):
         generator = torch.manual_seed(seed)
@@ -415,9 +425,9 @@ def callback_fn(step: int, timestep: int, latents: torch.Tensor) -> None:
         assert number_of_steps == 3
 
     def test_stable_diffusion_panorama_pipeline_with_sequential_cpu_offloading(self):
-        torch.cuda.empty_cache()
-        torch.cuda.reset_max_memory_allocated()
-        torch.cuda.reset_peak_memory_stats()
+        backend_empty_cache(torch_device)
+        backend_reset_max_memory_allocated(torch_device)
+        backend_reset_peak_memory_stats(torch_device)
 
         model_ckpt = "stabilityai/stable-diffusion-2-base"
         scheduler = DDIMScheduler.from_pretrained(model_ckpt, subfolder="scheduler")
@@ -429,6 +439,6 @@ def test_stable_diffusion_panorama_pipeline_with_sequential_cpu_offloading(self):
         inputs = self.get_inputs()
         _ = pipe(**inputs)
 
-        mem_bytes = torch.cuda.max_memory_allocated()
+        mem_bytes = backend_max_memory_allocated(torch_device)
         # make sure that less than 5.5 GB is allocated
         assert mem_bytes < 5.5 * 10**9
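The memory assertion in the offloading test follows the same dispatch pattern: the `backend_*` memory helpers presumably route to the matching `torch.cuda` or `torch.xpu` statistics APIs. A hedged sketch of that routing (assumed structure, not diffusers' code):

```python
# Sketch of how a backend_max_memory_allocated-style helper might dispatch.
# Illustrative only; torch.xpu memory statistics require a recent PyTorch
# build with XPU support.
import torch


def backend_max_memory_allocated_sketch(device: str) -> int:
    """Return the peak number of bytes allocated on the given accelerator."""
    if device.startswith("cuda"):
        return torch.cuda.max_memory_allocated()
    if device.startswith("xpu") and hasattr(torch, "xpu"):
        return torch.xpu.max_memory_allocated()
    raise ValueError(f"no memory statistics available for device {device!r}")
```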