Add code to Bundleio to generate error stats #12051
Conversation
Signed-off-by: Zingo Andersen <[email protected]> Change-Id: Ib51b22c80954c87812b81b6fa9798ace705a555a
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12051. Note: links to docs will display an error until the docs builds have completed. ❗ There is 1 currently active SEV; if your PR is affected, please view it. ✅ No failures as of commit 49542cb with merge base 142b1c6. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi @digantdesai and @mergennachin, this PR adds a new method/API to BundleIO, so maybe you want to involve the proper people about this :) Also, is the metric sane? And the way it is propagated back to the runner? My basic intention is to be able to log/track models over time in a better way than PASS/FAIL on a fixed atol/rtol, since that easily misses cases where we could have set rtol/atol lower after improving things. Something like this is also useful for getting a good guess of what atol/rtol could be made to work, instead of a lot of trial and error with different values.
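To make the "good guess" idea concrete, here is a minimal sketch of how the reported stats could seed atol/rtol. The `ErrorStats` struct, `suggest_tolerances` function, and the 1.5x headroom factor are illustrative assumptions, not part of this PR:

```cpp
#include <cstdio>

// Hypothetical container for the max errors reported by the stats API.
struct ErrorStats {
  double max_abs_error;
  double max_rel_error;
};

// Derive a starting atol/rtol from one measured run, with headroom so
// small run-to-run noise still passes. The 1.5x margin is an assumption.
void suggest_tolerances(const ErrorStats& stats, double& atol, double& rtol) {
  const double margin = 1.5;
  atol = stats.max_abs_error * margin;
  rtol = stats.max_rel_error * margin;
}

int main() {
  ErrorStats stats{1.2e-5, 3.0e-4};  // example values from a test run
  double atol = 0.0, rtol = 0.0;
  suggest_tolerances(stats, atol, rtol);
  std::printf("suggested atol=%g rtol=%g\n", atol, rtol);
  return 0;
}
```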
```cpp
double abs_err = std::abs(a_data[k] - e_data[k]);
// Use the larger of the two magnitudes as the denominator, clamped to
// eps so a zero reference value does not cause a division by zero.
double relative_divider =
    std::max(std::abs(a_data[k]), std::abs(e_data[k]));
relative_divider = std::max(relative_divider, eps);
double relative_err = abs_err / relative_divider;

sum_abs += abs_err;
max_abs = std::max(max_abs, abs_err);
sum_rel += relative_err;
max_rel = std::max(max_rel, relative_err);
```
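For reference, here is a self-contained sketch of the same statistics end to end, including the mean values derived from the running sums. The function and struct names are illustrative, not the PR's actual API:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical result type; the real PR may expose different fields.
struct ErrorStats {
  double mean_abs_error = 0.0;
  double max_abs_error = 0.0;
  double mean_rel_error = 0.0;
  double max_rel_error = 0.0;
};

ErrorStats compute_error_stats(
    const float* actual, const float* expected, size_t n) {
  const double eps = 1e-12;  // guards the relative-error denominator
  double sum_abs = 0.0, max_abs = 0.0, sum_rel = 0.0, max_rel = 0.0;
  for (size_t k = 0; k < n; ++k) {
    double abs_err =
        std::abs(static_cast<double>(actual[k]) - expected[k]);
    double divider = std::max(
        {std::abs(static_cast<double>(actual[k])),
         std::abs(static_cast<double>(expected[k])), eps});
    double rel_err = abs_err / divider;
    sum_abs += abs_err;
    max_abs = std::max(max_abs, abs_err);
    sum_rel += rel_err;
    max_rel = std::max(max_rel, rel_err);
  }
  ErrorStats stats;
  if (n > 0) {
    stats.mean_abs_error = sum_abs / n;
    stats.mean_rel_error = sum_rel / n;
  }
  stats.max_abs_error = max_abs;
  stats.max_rel_error = max_rel;
  return stats;
}
```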
Is this good? I'm no ML-math/stats person, so if this can be improved we should do it in this PR or after :)
I think it is good for regular cases, e.g. when both input and target are in float32, and good enough for bundled program. We shouldn't expect bundled program to cover all cases, like quantization.
I like this general direction of getting more than true/false. And since it's opt-in, we care a bit less about the binary size overhead. I will let @Gasoonjia weigh in; he is looking at this for AoT with multiple "distance" measures. Thanks @zingo.
Thanks @zingo for the update; overall I love it. It makes the error message more meaningful.
Also, spoiler alert: we will have a new API in the 0.7 release (https://github.com/pytorch/executorch/blob/main/devtools/inspector/_inspector.py#L1365) for comparing intermediate outputs at the operator level, beyond what we have right now in bundled program, which only compares the final result! Stay tuned, and we will have a doc for a better demonstration!
Add a way to get error stats/metrics between actual and reference output.
cc @digantdesai @freddan80 @per @oscarandersson8218
Signed-off-by: Zingo Andersen <[email protected]>
Thank you both. @zingo or @Gasoonjia, we should use this