-
Notifications
You must be signed in to change notification settings - Fork 35
add opencv benchmark #674
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
add opencv benchmark #674
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the benchmark @jinhohwang-meta , that's very useful! I made a few comments below.
Based on how the code is converting pts to indices (using the average fps), I think a fairer comparison against torchcodec would be to benchmark torchcodec with the approximate mode (instead of exact mode). I think it corresponds to torchcodec_core_nonbatch
, although I'm not 100% sure (CC @scotts )
raise ValueError("Could not open video stream") | ||
|
||
fps = cap.get(cv2.CAP_PROP_FPS) | ||
frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
frames
seems unused and gets redefined below
if not vr.isBackendBuiltIn(backend): | ||
_, abi, api = vr.getStreamBufferedBackendPluginVersion(backend) | ||
if (abi < 1 or (abi == 1 and api < 2)): | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment explaining why this is needed?
if not ok: | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's raise here instead of a break
. This will ensure that we are not reporting results in case cap.grab()
fails.
|
||
fps = cap.get(cv2.CAP_PROP_FPS) | ||
frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) | ||
approx_frame_numbers = [int(pts * fps) for pts in pts_list] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: s/numbers/indices
. We use the term indices
for this.
for i in range(n): | ||
ok = cap.grab() | ||
if not ok: | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, we should raise instead of break
ret, frame = cap.retrieve() | ||
if ret: | ||
frames.append(frame) | ||
cap.release() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's assert that len(frames) == n
to ensure no error went undetected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the benchmark @jinhohwang-meta , that's very useful! I made a few comments below.
Based on how the code is converting pts to indices (using the average fps), I think a fairer comparison against torchcodec would be to benchmark torchcodec with the approximate mode (instead of exact mode). I think it corresponds to torchcodec_core_nonbatch
, although I'm not 100% sure (CC @scotts )
The comparison we'll probably be most interested in is this one:
That is, it compares:
We'll also want to use some more videos - the ones that are part of the README benchmark are good to use. We generate those on the fly here: torchcodec/benchmarks/decoders/generate_readme_data.py Lines 37 to 55 in 87b98e8
That doesn't need to be a part of this PR. In this PR, we should focus on correct usage of the OpenCV API and then we can investigate the comparative performance in follow-ups. Just making a note. :) |
@NicolasHug I actually did compare with approximate as well, but it showed similar results. @scotts asked me to push this for the team to follow on to find out the gap and I am very much interested in to see why as well. |
This PR added the benchmark for opencv. The benchmark shows that there is gap. Hope the torchcodec team can find out where the gap is from and improve the decoding performance.
Local benchmark
Times are in milliseconds (ms).