add opencv benchmark #674

Open · wants to merge 2 commits into base: main

Conversation

jinhohwang-meta
This PR adds a benchmark for OpenCV. The benchmark shows that there is a gap; hopefully the torchcodec team can find out where the gap comes from and improve the decoding performance.

Local benchmark

| (1 thread)        | decode 10 uniform frames | decode 10 random frames | first 1 frames | first 10 frames | first 100 frames |
|-------------------|--------------------------|-------------------------|----------------|-----------------|------------------|
| OpenCV            | 22.4                     | 22.6                    | 6.4            | 9.3             | 18.1             |
| TorchCodecPublic  | 39.4                     | 28.1                    | 9.5            | 9.8             | 26.0             |

Times are in milliseconds (ms).

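As a rough illustration of what the "decode 10 uniform frames" column measures, the sample points can be computed from the clip duration alone and then mapped to frame indices via the average fps. A minimal sketch (the helper names `uniform_pts` and `pts_to_indices` are assumptions for illustration, not the PR's exact code):

```python
def uniform_pts(duration_s, n):
    """Return n presentation timestamps spread uniformly over [0, duration_s)."""
    return [duration_s * i / n for i in range(n)]

def pts_to_indices(pts_list, fps):
    """Map timestamps to approximate frame indices using the average fps."""
    return [int(pts * fps) for pts in pts_list]

# e.g. sampling 10 frames from a 10 s clip at 30 fps
indices = pts_to_indices(uniform_pts(10.0, 10), 30.0)
```

This average-fps mapping is only approximate for variable-frame-rate videos, which is the point raised in the review below about comparing against torchcodec's approximate seek mode.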

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label May 9, 2025
@NicolasHug (Member) left a comment


Thanks a lot for the benchmark @jinhohwang-meta , that's very useful! I made a few comments below.

Based on how the code is converting pts to indices (using the average fps), I think a fairer comparison against torchcodec would be to benchmark torchcodec with the approximate mode (instead of exact mode). I think it corresponds to torchcodec_core_nonbatch, although I'm not 100% sure (CC @scotts )

raise ValueError("Could not open video stream")

fps = cap.get(cv2.CAP_PROP_FPS)
frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

`frames` seems unused and gets redefined below

Comment on lines +157 to +160
if not vr.isBackendBuiltIn(backend):
_, abi, api = vr.getStreamBufferedBackendPluginVersion(backend)
if (abi < 1 or (abi == 1 and api < 2)):
continue

Can you add a comment explaining why this is needed?

Comment on lines 180 to 181
if not ok:
break

Let's raise here instead of a break. This will ensure that we are not reporting results in case cap.grab() fails.
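The suggestion above could look roughly like the following; `grab_or_raise` is a hypothetical helper name, not part of the PR:

```python
def grab_or_raise(cap, index):
    """Grab the next frame; raise instead of silently breaking on failure,
    so that a failed cap.grab() cannot produce misleading benchmark results."""
    if not cap.grab():
        raise RuntimeError(f"cap.grab() failed at frame {index}")

# The benchmark loop would then become:
#     for i in range(n):
#         grab_or_raise(cap, i)
```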


fps = cap.get(cv2.CAP_PROP_FPS)
frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
approx_frame_numbers = [int(pts * fps) for pts in pts_list]

Nit: s/numbers/indices. We use the term indices for this.

for i in range(n):
ok = cap.grab()
if not ok:
break

Same here, we should raise instead of break

ret, frame = cap.retrieve()
if ret:
frames.append(frame)
cap.release()

Let's assert that len(frames) == n to ensure no error went undetected
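Combining the two suggestions (raise on a failed grab, assert the final frame count), the retrieval loop could be restructured roughly like this; the function name `decode_first_n` is an assumption, and `cap` is any OpenCV-style capture object:

```python
def decode_first_n(cap, n):
    """Decode the first n frames, failing loudly if any step goes wrong."""
    frames = []
    for i in range(n):
        if not cap.grab():
            raise RuntimeError(f"cap.grab() failed at frame {i}")
        ok, frame = cap.retrieve()
        if ok:
            frames.append(frame)
    cap.release()
    # Ensure no retrieve() failure went undetected before reporting timings.
    assert len(frames) == n, f"decoded {len(frames)} frames, expected {n}"
    return frames
```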


@scotts (Contributor) commented May 12, 2025

The comparison we'll probably be most interested in is this one:

python benchmarks/decoders/benchmark_decoders.py --decoders opencv,torchcodec_public:seek_mode=exact,torchcodec_public:seek_mode=approximate,torchaudio --min-run-seconds 40

That is, it compares:

  1. OpenCV
  2. TorchCodec public API, exact seek mode
  3. TorchCodec public API, approximate seek mode
  4. TorchAudio

We'll also want to use some more videos - the ones that are part of the README benchmark are good to use. We generate those on the fly here:

resolutions = ["1920x1080"]
encodings = ["libx264"]
patterns = ["mandelbrot"]
fpses = [60]
gop_sizes = [600]
durations = [10, 120]
pix_fmts = ["yuv420p"]
ffmpeg_path = "ffmpeg"
generate_videos(
    resolutions,
    encodings,
    patterns,
    fpses,
    gop_sizes,
    durations,
    pix_fmts,
    ffmpeg_path,
    videos_dir_path,
)

That doesn't need to be a part of this PR. In this PR, we should focus on correct usage of the OpenCV API and then we can investigate the comparative performance in follow-ups. Just making a note. :)

@jinhohwang-meta (Author)

> Thanks a lot for the benchmark @jinhohwang-meta , that's very useful! I made a few comments below.
>
> Based on how the code is converting pts to indices (using the average fps), I think a fairer comparison against torchcodec would be to benchmark torchcodec with the approximate mode (instead of exact mode). I think it corresponds to torchcodec_core_nonbatch, although I'm not 100% sure (CC @scotts )

@NicolasHug I actually did compare with approximate mode as well, but it showed similar results. @scotts asked me to push this so the team can follow up and find out where the gap comes from, and I am very much interested to see why as well.
