Allow audio decoder to seek backwards #550

Merged
merged 4 commits into from
Mar 12, 2025

Conversation

NicolasHug
Member

@NicolasHug NicolasHug commented Mar 12, 2025

Towards #549.

Changes are quite subtle, but I'm fairly confident they work as expected, since they pass our robust tests. I'll make sure to write some solid documentation around the mechanisms involved eventually.

Benchmark results show no perf hit:

Duration: 13s
torchcodec: med = 8.05ms +- 1.13
torchaudio: med = 12.27ms +- 0.51

Duration: 13s
torchcodec: med = 4.20ms +- 0.91
torchaudio: med = 7.21ms +- 0.58

Duration: 2m11s
torchcodec: med = 28.26ms +- 0.95
torchaudio: med = 45.72ms +- 0.89

Duration: 1h27m
torchcodec: med = 1046.49ms +- 66.00
torchaudio: med = 1746.49ms +- 22.55

Benchmark code is the same as in #538; I'm not benchmarking the "backwards seeking" logic.
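For readers who want to produce numbers in the same `med = X ms +- Y` shape, here is a minimal sketch. It is not the benchmark code from #538, which is not reproduced on this page. The helper names `time_call_ms` and `summarize` are hypothetical, and using the sample standard deviation for the `+-` term is an assumption.

```python
import statistics
import time


def time_call_ms(fn, num_runs=10):
    """Call fn num_runs times and return the wall-clock durations in ms."""
    times = []
    for _ in range(num_runs):
        start = time.perf_counter()
        fn()
        times.append((time.perf_counter() - start) * 1000)
    return times


def summarize(times_ms):
    """Format timings as median +- sample standard deviation (assumed)."""
    med = statistics.median(times_ms)
    std = statistics.stdev(times_ms)
    return f"med = {med:.2f}ms +- {std:.2f}"


# Placeholder workload; a real benchmark would decode an audio file here.
print(summarize(time_call_ms(lambda: sum(range(10_000)))))
```

In a real comparison, the lambda would be replaced by the decoding call under test, run on the same file for each library.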

@NicolasHug NicolasHug requested a review from scotts March 12, 2025 14:49
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Mar 12, 2025
# indices.
# Ultimately, this test compares a "stateful decoder" which calls
# `get_frames_by_pts_in_range_audio()` multiple times with a
# "stateless decoder" (the one here, treated as the reference)
Member Author

I've convinced myself that we should actually keep this helper instead of doing the conversion. Thoughts?

Contributor

I think that's fine - then what we're testing is that the decoder behaves the same when it seeks to a location "fresh" versus having to seek from some given location, including backwards. That seems reasonable.
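The idea discussed above, comparing a decoder that carries state across calls with a fresh one used as the reference, can be illustrated with a toy model. This is only a sketch: `ToyStatefulDecoder` and `stateless_reference` are hypothetical names, the "samples" are plain integers, and nothing here calls torchcodec or FFmpeg.

```python
class ToyStatefulDecoder:
    """Hypothetical stand-in for a decoder that remembers its position."""

    def __init__(self, samples):
        self._samples = samples
        self._cursor = 0  # index of the next sample that would be decoded
        self.num_rewinds = 0

    def get_range(self, start, stop):
        if start < self._cursor:
            # Backwards request: rewind to the beginning, then go forward.
            self._cursor = 0
            self.num_rewinds += 1
        # Pretend to decode forward up to `stop`.
        self._cursor = stop
        return self._samples[start:stop]


def stateless_reference(samples, start, stop):
    """Reference result: a brand-new decoder for every request."""
    return ToyStatefulDecoder(samples).get_range(start, stop)


samples = list(range(100))
stateful = ToyStatefulDecoder(samples)
# A mix of forward and backward requests.
ranges = [(10, 20), (50, 60), (5, 15), (0, 100), (30, 31)]
results = [stateful.get_range(a, b) for a, b in ranges]
expected = [stateless_reference(samples, a, b) for a, b in ranges]
assert results == expected
```

The check passes when the order of earlier requests has no effect on what a later request returns, which is the property the comment describes.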

@NicolasHug NicolasHug mentioned this pull request Mar 12, 2025
7 tasks
// of the stream.
// TODO-AUDIO: document why this is needed in a big comment.
setCursorPtsInSeconds(INT64_MIN);
}
Member Author

Note that this is INT64_MIN and not 0, because some packets actually start before 0. In one of our assets the first packet is at -1024.
I noticed that passing an arbitrary low value like -999999 makes FFmpeg unhappy and raise an error, but INT64_MIN seems to be understood and correct (although I haven't found docs on this).
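A small toy model may help show why a reset value of 0 would not be low enough when the first packet has a negative timestamp. This sketch only compares integers: the `packet_pts` values other than -1024 and the helper `packets_at_or_after` are hypothetical, and it makes no claim about how FFmpeg interprets the cursor internally.

```python
INT64_MIN = -(2**63)

# Hypothetical packet timestamps; -1024 mirrors the asset mentioned above.
packet_pts = [-1024, 0, 1024, 2048]


def packets_at_or_after(cursor, pts_list):
    """Return the timestamps that are not earlier than the cursor."""
    return [pts for pts in pts_list if pts >= cursor]


# With a cursor of 0, the packet at -1024 falls before the cursor.
reset_to_zero = packets_at_or_after(0, packet_pts)

# INT64_MIN is <= every representable int64 timestamp, so nothing is missed.
reset_to_min = packets_at_or_after(INT64_MIN, packet_pts)

print(reset_to_zero)
print(reset_to_min)
```

In this model any value at or below the smallest timestamp would work equally well; the observation that FFmpeg rejects -999999 but accepts INT64_MIN comes from the comment above and is not reproduced by the sketch.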

@NicolasHug NicolasHug merged commit c6de04a into pytorch:main Mar 12, 2025
46 checks passed
3 participants