update expected results of slow tests #268

Merged: 51 commits into huggingface:main from fix-tests, Sep 12, 2022

Conversation

@kashif (Contributor) commented Aug 29, 2022

Updated the two failing slow tests

@anton-l (Member) left a comment

Thanks for the fixes @kashif!

@patrickvonplaten IIRC you wanted to double-check something before updating the VE tests?

@kashif (Contributor, Author) commented Aug 29, 2022

I can run the tests on a GPU now to double-check as well... will report soon

@HuggingFaceDocBuilderDev commented Aug 29, 2022

The documentation is not available anymore as the PR was closed or merged.

@anton-l (Member) commented Aug 29, 2022

Hmm, looks like the .sum() test is still flaky depending on the GPU; maybe that test should compare against tensor values instead.
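
For illustration, a minimal sketch of that suggestion (the helper name and reference values are made up, not from this PR): comparing a fixed slice of the output tensor against reference values pinpoints mismatched entries, instead of relying on a single scalar sum that is sensitive to GPU-dependent accumulation order.

```python
import torch

def check_sample(sample: torch.Tensor, expected_slice: torch.Tensor, atol: float = 1e-3) -> None:
    # Compare a fixed slice of the output against reference values instead
    # of a single scalar sum: a failure then points at concrete entries.
    sample_slice = sample.flatten()[: expected_slice.numel()]
    assert torch.allclose(sample_slice, expected_slice, atol=atol), (
        f"got {sample_slice.tolist()}, expected {expected_slice.tolist()}"
    )

# Placeholder usage with made-up values:
check_sample(torch.full((1, 3, 8, 8), 0.5), torch.full((9,), 0.5))
```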

@patrickvonplaten (Contributor) commented:

Ah yeah, lemme run the pipeline real quick to make sure this all still works.

```diff
@@ -714,7 +714,7 @@ def test_full_loop_no_noise(self):
         result_sum = torch.sum(torch.abs(sample))
         result_mean = torch.mean(torch.abs(sample))

-        assert abs(result_sum.item() - 14379591680.0) < 1e-2
+        assert abs(result_sum.item() - 14379589632.0) < 1e-2
```
(Contributor) commented on the diff:
CircleCI seemed to be happy with 14379591680.0: https://github.com/huggingface/diffusers/runs/8068605613?check_suite_focus=true#step:6:71, so I'm not sure about this change.

=> maybe we should increase the tolerance here?

(Contributor, Author) replied:

Yes, that's what I was thinking too; perhaps abs < 2.5e3 or so.

```diff
@@ -714,8 +714,8 @@ def test_full_loop_no_noise(self):
         result_sum = torch.sum(torch.abs(sample))
         result_mean = torch.mean(torch.abs(sample))

-        assert abs(result_sum.item() - 14379591680.0) < 1e-2
-        assert abs(result_mean.item() - 18723426.0) < 1e-3
+        assert abs(result_sum.item() - 14379591680.0) < 2.5e3
```
(Contributor) commented:

Can we add a comment here saying that the results are flaky depending on the GPU, and that this is why the tolerance is so high?
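
For illustration (the exact wording is hypothetical, though the "added comment on large tolerances" commit in the merge below suggests something similar landed), the assertion from the diff above with such a comment attached:

```python
# The exact sum differs slightly between GPUs because floating-point
# reduction order is not deterministic, hence the large tolerance.
assert abs(result_sum.item() - 14379591680.0) < 2.5e3
```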

(Contributor) commented:
Actually, upon taking a second look, I think the reason is that we don't make the tests deterministic with generators - both the DDPM and Score-VE schedulers make use of torch.randn(...)
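
For reference, a minimal sketch of the determinism fix being described here (it assumes nothing about the schedulers' actual signatures): seed a torch.Generator once and pass it to every torch.randn call instead of relying on global RNG state.

```python
import torch

# torch.manual_seed returns the (now seeded) default CPU generator,
# which can be passed explicitly to torch.randn.
generator = torch.manual_seed(0)
noise = torch.randn(1, 3, 32, 32, generator=generator)

# Re-seeding reproduces exactly the same noise, making the test deterministic.
generator = torch.manual_seed(0)
assert torch.equal(noise, torch.randn(1, 3, 32, 32, generator=generator))
```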

@patrickvonplaten (Contributor) commented:

@kashif, just noticed that most of our tests don't run on GPU - here's a fix: #269

Could you maybe rebase your PR to main and check the values again? 😅

@kashif (Contributor, Author) commented Aug 30, 2022

@natolambert @patrickvonplaten I've now tested on a bunch of servers and my laptop, and all the slow tests are passing for me...

@natolambert (Contributor) commented:

Great! It's been a while since I tested them all, so hopefully they're more stable now!

@patrickvonplaten (Contributor) left a comment

It'd be nice to make the random scheduler tests deterministic by passing the generator correctly. Regarding the pipeline test, it'd be nice to reduce the number of steps to speed it up a bit :-)

I think merging #289 can unblock this PR.
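
For illustration, a sketch of what that combination could look like in a pipeline test; the pipeline class, checkpoint name, and call signature are assumptions based on the diffusers API of that era, not code from this PR:

```python
import torch
from diffusers import DDPMPipeline

pipe = DDPMPipeline.from_pretrained("google/ddpm-cifar10-32")

# Fix the seed so the scheduler's torch.randn calls are reproducible,
# and lower num_inference_steps to keep the slow test quick.
generator = torch.manual_seed(0)
image = pipe(generator=generator, num_inference_steps=50, output_type="numpy").images[0]
```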

@anton-l (Member) commented Sep 5, 2022

@kashif looks like a couple of VE tests are failing now; could you check, please? https://github.com/huggingface/diffusers/runs/8158641999?check_suite_focus=true

When those are back, I think we can carefully merge this :)

@kashif (Contributor, Author) commented Sep 5, 2022

Sure! Let me get it over the line.

@patrickvonplaten (Contributor) commented:

Should we still try to get this one merged?

@kashif (Contributor, Author) commented Sep 9, 2022

Yes please! I believe some of the images on the Hub might need updating? Let me fix the conflict.

@patrickvonplaten (Contributor) commented:

Can we fix the quality checks here and then run all slow tests to be sure?

@patrickvonplaten (Contributor) commented:

Awesome - thanks for making all the changes!

Feel free to merge whenever :-)

@pcuenca pcuenca merged commit f4781a0 into huggingface:main Sep 12, 2022
@kashif kashif deleted the fix-tests branch September 12, 2022 13:49
PhaneeshB pushed a commit to nod-ai/diffusers that referenced this pull request Mar 1, 2023
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
* update expected results of slow tests

* relax sum and mean tests

* Print shapes when reporting exception

* formatting

* fix sentence

* relax test_stable_diffusion_fast_ddim for gpu fp16

* relax flaky tests on GPU

* added comment on large tolerances

* black

* format

* set scheduler seed

* added generator

* use np.isclose (see the sketch after this list)

* set num_inference_steps to 50

* fix dep. warning

* update expected_slice

* preprocess if image

* updated expected results

* updated expected from CI

* pass generator to VAE

* undo change back to orig

* use original

* revert back the expected on cpu

* revert back values for CPU

* more undo

* update result after using gen

* update mean

* set generator for mps

* update expected on CI server

* undo

* use new seed every time

* cpu manual seed

* reduce num_inference_steps

* style

* use generator for randn

Co-authored-by: Patrick von Platen <[email protected]>
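
Apropos the "use np.isclose" item above, a minimal example of why a relative tolerance fits these tests better than the fixed absolute tolerances in the earlier diffs; the reference values are taken from those diffs:

```python
import numpy as np

# The expected sum is ~1.4e10 and the expected mean ~1.9e7, so no single
# absolute tolerance fits both; np.isclose's relative tolerance does.
result_sum, result_mean = 14379589632.0, 18723426.0
assert np.isclose(result_sum, 14379591680.0, rtol=1e-3)
assert np.isclose(result_mean, 18723426.0, rtol=1e-3)
```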