add opencv benchmark #674

Open · wants to merge 2 commits into base: main

Conversation

jinhohwang-meta
This PR adds a benchmark for OpenCV. The benchmark shows that there is a gap; hopefully the torchcodec team can find out where the gap comes from and improve the decoding performance.

Local benchmark

| (1 thread)        | decode 10 uniform frames | decode 10 random frames | first 1 frames | first 10 frames | first 100 frames |
|-------------------|--------------------------|-------------------------|----------------|-----------------|------------------|
| OpenCV            | 22.4                     | 22.6                    | 6.4            | 9.3             | 18.1             |
| TorchCodecPublic  | 39.4                     | 28.1                    | 9.5            | 9.8             | 26.0             |

Times are in milliseconds (ms).

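As a rough illustration of what the "decode 10 uniform frames" column measures, the sample points can be computed from the clip duration alone and then mapped to frame indices via the average fps. A minimal sketch (the helper names `uniform_pts` and `pts_to_indices` are assumptions for illustration, not the PR's exact code):

```python
def uniform_pts(duration_s, n):
    """Return n presentation timestamps spread uniformly over [0, duration_s)."""
    return [duration_s * i / n for i in range(n)]

def pts_to_indices(pts_list, fps):
    """Map timestamps to approximate frame indices using the average fps."""
    return [int(pts * fps) for pts in pts_list]

# e.g. sampling 10 frames from a 10 s clip at 30 fps
indices = pts_to_indices(uniform_pts(10.0, 10), 30.0)
```

This average-fps mapping is only approximate for variable-frame-rate videos, which is the point raised in the review below about comparing against torchcodec's approximate seek mode.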

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label May 9, 2025
@NicolasHug (Member) left a comment


Thanks a lot for the benchmark @jinhohwang-meta , that's very useful! I made a few comments below.

Based on how the code is converting pts to indices (using the average fps), I think a fairer comparison against torchcodec would be to benchmark torchcodec with the approximate mode (instead of exact mode). I think it corresponds to torchcodec_core_nonbatch, although I'm not 100% sure (CC @scotts )

raise ValueError("Could not open video stream")

fps = cap.get(cv2.CAP_PROP_FPS)
frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

`frames` seems unused and gets redefined below

Comment on lines +157 to +160
if not vr.isBackendBuiltIn(backend):
_, abi, api = vr.getStreamBufferedBackendPluginVersion(backend)
if (abi < 1 or (abi == 1 and api < 2)):
continue

Can you add a comment explaining why this is needed?

Comment on lines 180 to 181
if not ok:
break

Let's raise here instead of a break. This will ensure that we are not reporting results in case cap.grab() fails.
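The suggestion above could look roughly like the following; `grab_or_raise` is a hypothetical helper name, not part of the PR:

```python
def grab_or_raise(cap, index):
    """Grab the next frame; raise instead of silently breaking on failure,
    so that a failed cap.grab() cannot produce misleading benchmark results."""
    if not cap.grab():
        raise RuntimeError(f"cap.grab() failed at frame {index}")

# The benchmark loop would then become:
#     for i in range(n):
#         grab_or_raise(cap, i)
```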


fps = cap.get(cv2.CAP_PROP_FPS)
frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
approx_frame_numbers = [int(pts * fps) for pts in pts_list]

Nit: s/numbers/indices. We use the term indices for this.

for i in range(n):
ok = cap.grab()
if not ok:
break

Same here, we should raise instead of break

ret, frame = cap.retrieve()
if ret:
frames.append(frame)
cap.release()

Let's assert that len(frames) == n to ensure no error went undetected
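Combining the two suggestions (raise on a failed grab, assert the final frame count), the retrieval loop could be restructured roughly like this; the function name `decode_first_n` is an assumption, and `cap` is any OpenCV-style capture object:

```python
def decode_first_n(cap, n):
    """Decode the first n frames, failing loudly if any step goes wrong."""
    frames = []
    for i in range(n):
        if not cap.grab():
            raise RuntimeError(f"cap.grab() failed at frame {i}")
        ok, frame = cap.retrieve()
        if ok:
            frames.append(frame)
    cap.release()
    # Ensure no retrieve() failure went undetected before reporting timings.
    assert len(frames) == n, f"decoded {len(frames)} frames, expected {n}"
    return frames
```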


@scotts (Contributor) commented May 12, 2025

The comparison we'll probably be most interested in is this one:

python benchmarks/decoders/benchmark_decoders.py --decoders opencv,torchcodec_public:seek_mode=exact,torchcodec_public:seek_mode=approximate,torchaudio --min-run-seconds 40

That is, it compares:

  1. OpenCV
  2. TorchCodec public API, exact seek mode
  3. TorchCodec public API, approximate seek mode
  4. TorchAudio

We'll also want to use some more videos - the ones that are part of the README benchmark are good to use. We generate those on the fly here:

resolutions = ["1920x1080"]
encodings = ["libx264"]
patterns = ["mandelbrot"]
fpses = [60]
gop_sizes = [600]
durations = [10, 120]
pix_fmts = ["yuv420p"]
ffmpeg_path = "ffmpeg"
generate_videos(
    resolutions,
    encodings,
    patterns,
    fpses,
    gop_sizes,
    durations,
    pix_fmts,
    ffmpeg_path,
    videos_dir_path,
)

That doesn't need to be a part of this PR. In this PR, we should focus on correct usage of the OpenCV API and then we can investigate the comparative performance in follow-ups. Just making a note. :)

@jinhohwang-meta (Author)

> Thanks a lot for the benchmark @jinhohwang-meta , that's very useful! I made a few comments below.
>
> Based on how the code is converting pts to indices (using the average fps), I think a fairer comparison against torchcodec would be to benchmark torchcodec with the approximate mode (instead of exact mode). I think it corresponds to torchcodec_core_nonbatch, although I'm not 100% sure (CC @scotts )

@NicolasHug I actually did compare with approximate mode as well, but it showed similar results. @scotts asked me to push this so the team can follow up and find out where the gap comes from, and I am very much interested to see why as well.
