Use PSNR to compare frames #662
base: main
Conversation
This commit brings a dependency on torcheval into the tests. Signed-off-by: Dmitry Rogozhkin <[email protected]>
Thank you for the PR @dvrogozh, some high-level comments:
Thank you for your comments, @NicolasHug. This PR touches an important topic: the approach used in torchcodec tests to validate the correctness of video frames. I understand your motivation to use straightforward and simple bit-exactness metrics. That is indeed the easiest thing to do, and it would work for most decoders (at least the most important ones like AVC, HEVC, AV1, etc.), i.e. those that don't use floating point operations and are required to be conformant. The problems come with video processing algorithms such as color conversion or scaling, and they multiply further once you start considering video encoding. It's basically a given that you can't use a simple bit-to-bit exactness metric across different implementations with a single ground-truth baseline (taken from CPU, CUDA, or any other processor). To properly compare video frames you must use dedicated metrics such as PSNR, SSIM, or MS-SSIM. My concern with the current torchcodec test design is that the project will run into issues in the long run as the code base and features grow, potentially spanning multiple implementations on different platforms. Even as of today you handle video frame comparison with ~5 different paths: 2 for CUDA, 2 for CPU and non-CUDA, and 1 for even more complex cases (Line 113 in ceeeb11).
I'm afraid that going forward this might become a real issue for project sustainability. I intentionally wrote this PR in a blunt way, reflecting how I would have approached the task on day 0. The project is past day 0 by now, and I totally understand the desire to preserve the current baseline checks. I think, however, that we can still discuss this and come to a better solution that helps project maintenance in the future. I would like to propose the following:
In this way, the 1st item gives us a baseline which all implementations of torchcodec should pass, while the 2nd item allows strengthening the requirements for particular cases.
Those are implementation details :). There is indeed a question whether
That would be reinventing the wheel, i.e. attempting to define a metric while metrics that have been used for media for decades already exist: PSNR/SSIM/MS-SSIM.
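To make the metric concrete, here is a minimal sketch of how PSNR between a decoded frame and a reference frame could be computed with plain PyTorch; the helper name and the 30 dB threshold are illustrative assumptions, not part of this PR.

```python
import torch

def psnr(actual: torch.Tensor, reference: torch.Tensor, max_val: float = 255.0) -> float:
    # PSNR = 10 * log10(MAX^2 / MSE); higher is better, identical frames give +inf.
    actual = actual.to(torch.float64)
    reference = reference.to(torch.float64)
    mse = torch.mean((actual - reference) ** 2)
    if mse == 0:
        return float("inf")
    return (10 * torch.log10(max_val**2 / mse)).item()

# Hypothetical usage in a test: accept frames whose PSNR against the reference
# exceeds an agreed threshold (e.g. 30 dB) instead of requiring bit-exactness.
# assert psnr(decoded_frame, reference_frame) > 30.0
```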
Thanks for your input @dvrogozh. I think we need to answer two distinct problems:
Regarding problem 1: I would be happy to include PSNR (and potentially the other metrics you mentioned) as an additional check, but I wouldn't make it a replacement. The reason I want to keep our existing checks is that PSNR is complementary to them, but it's not inherently stricter, more complete, or generally better. PSNR is derived from MSE, while our existing checks, e.g. on macOS, bound the max abs diff. That asserts a different property from PSNR (and it is very intuitive), so there's value in having both. I'd love to review a PR that adds PSNR to the checks we run in

Regarding problem 2: I think this is the actual problem we need to solve in order to enable your XPU out-of-core work. I think it's distinct and largely orthogonal to problem 1. I'd love to brainstorm ideas with you on this.
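A minimal sketch of how the two complementary properties discussed above could be asserted side by side, a max-abs-diff bound plus a PSNR floor; the helper name and both thresholds are hypothetical, not torchcodec's actual test API.

```python
import torch

def assert_frames_close(actual: torch.Tensor, reference: torch.Tensor,
                        abs_tol: int = 20, min_psnr_db: float = 30.0) -> None:
    """Hypothetical combined check: bound the worst-case pixel error
    (max abs diff) and the average error (PSNR, derived from MSE)."""
    diff = (actual.to(torch.int64) - reference.to(torch.int64)).abs()
    assert diff.max().item() <= abs_tol, f"max abs diff {diff.max().item()} > {abs_tol}"

    mse = diff.to(torch.float64).pow(2).mean()
    psnr = float("inf") if mse == 0 else (10 * torch.log10(255.0**2 / mse)).item()
    assert psnr >= min_psnr_db, f"PSNR {psnr:.2f} dB < {min_psnr_db} dB"
```

The max-abs-diff bound constrains the single worst pixel, while PSNR constrains the average error over the whole frame; a frame can pass one check and fail the other, which is why the two are complementary.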
@NicolasHug: I am fine with that direction. I will rework this PR once #636 lands, and I'll rebase #558 with XPU support - this will allow me to verify the changes via the XPU path.
For: #569
With attempts to support XPU acceleration (Intel GPUs) and aarch64 we see the need to generalize frame comparison in a way that works across different platforms and implementations of color space conversion algorithms. In the media domain such comparisons are achieved using metrics such as PSNR or SSIM. This commit introduces such a comparison for all platforms. The commit brings a dependency on torcheval into the tests, which is used to calculate PSNR.

CC: @traversaro, @scotts, @NicolasHug
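For reference, a sketch of how the torcheval dependency might be used in a test to compute PSNR; it assumes torcheval's functional peak_signal_noise_ratio API and uses illustrative tensors rather than the exact code in this PR.

```python
import torch
from torcheval.metrics.functional import peak_signal_noise_ratio

# Illustrative frames; in practice these would be a decoded frame and its reference.
reference = torch.randint(0, 256, (3, 270, 480), dtype=torch.uint8).float()
decoded = (reference + torch.randn_like(reference)).clamp(0, 255)

# data_range=255 tells the metric the pixel value range of the inputs.
psnr = peak_signal_noise_ratio(decoded, reference, data_range=255.0)
assert psnr > 30.0, f"PSNR too low: {psnr:.2f} dB"
```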