Extended subtitles support across containers #166

wiedymi · 2025-10-03T17:17:30Z

No description provided.

Implements full subtitle extraction, parsing, muxing and conversion for:
- MP4/MOV: WebVTT (wvtt) support with proper cue extraction
- MKV: SRT, WebVTT, ASS/SSA subtitle track support
- Subtitle codecs: WebVTT, SRT, ASS, SSA with format conversion
- Font attachment handling for ASS/SSA subtitles

Features:
- Subtitle track demuxing from ISOBMFF (MP4/MOV) and Matroska (MKV)
- Parse subtitle formats: WebVTT, SRT, ASS, SSA
- Mux subtitles into MP4/MOV/MKV containers
- Convert between subtitle formats
- Extract and embed fonts for styled subtitles
- Filter empty WebVTT cues (vtte boxes)

Examples:
- Subtitle extraction example with UI
- Subtitle muxing example
- Test file generation script

Tests:
- 63 ISOBMFF subtitle tests
- 127 Matroska subtitle tests
- 128 advanced subtitle tests
- 263 extraction tests
- 205 parsing tests

Implements support for two additional subtitle formats in ISOBMFF containers:
- tx3g (3GPP Timed Text): Apple's default subtitle format
- stpp (TTML/IMSC): broadcast/professional standard

Changes:
- Add 'tx3g' and 'ttml' to SUBTITLE_CODECS list
- Map tx3g and stpp box types to codecs in demuxer
- Add tx3g and ttml parsers in SubtitleParser
- Map codecs to box names for muxing
- Generate test samples using ffmpeg and MP4Box

Test coverage:
- 6 new tests for tx3g (MP4/MOV detection, export, cue reading)
- 4 new tests for TTML (MP4/MOV detection, export, cue reading)
- All 68 subtitle tests passing

Sample files:
- test-mp4-tx3g.mp4, test-mov-tx3g.mov
- test-mp4-ttml.mp4, test-mov-ttml.mov

Note: CEA-608/708 (embedded in video SEI) and bitmap subtitles (VobSub/PGS) require specialized tooling and are deferred for future implementation.
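The box-type mapping described above can be illustrated with a minimal sketch. The names `SUBTITLE_SAMPLE_ENTRIES`, `codecFromBoxType`, and `boxTypeFromCodec` are hypothetical and not taken from the PR; only the `wvtt`, `tx3g`, and `stpp` box types and the `webvtt`, `tx3g`, and `ttml` codec names come from the commit messages.

```typescript
// Hypothetical sketch of a box-type <-> codec mapping; not the PR's actual code.
type IsobmffSubtitleCodec = 'webvtt' | 'tx3g' | 'ttml';

// Each pair is [sample entry box type, codec name].
const SUBTITLE_SAMPLE_ENTRIES: ReadonlyArray<readonly [string, IsobmffSubtitleCodec]> = [
  ['wvtt', 'webvtt'],
  ['tx3g', 'tx3g'],
  ['stpp', 'ttml'],
];

// Demuxing direction: box type found in the file -> codec name (null if unknown).
function codecFromBoxType(boxType: string): IsobmffSubtitleCodec | null {
  const entry = SUBTITLE_SAMPLE_ENTRIES.find(([type]) => type === boxType);
  return entry ? entry[1] : null;
}

// Muxing direction: codec name -> box type to write (null if unmapped).
function boxTypeFromCodec(codec: IsobmffSubtitleCodec): string | null {
  const entry = SUBTITLE_SAMPLE_ENTRIES.find(([, mapped]) => mapped === codec);
  return entry ? entry[0] : null;
}
```

Keeping both directions on one table means the demuxer and muxer cannot drift apart, which is the kind of mismatch the follow-up commit on missing muxer mappings addresses.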

Add missing tx3g and ttml mappings in muxers and fix SubtitleParser type:
- isobmff-muxer: Add tx3g and ttml to codec map
- matroska-muxer: Add tx3g/ttml fallback mappings
- subtitles.ts: Change SubtitleParserOptions.codec to SubtitleCodec type
- Import SubtitleCodec type in subtitles.ts

Resolves build errors from subtitle codec additions.

Allows subtitle tracks from input files to be processed during conversion, with support for codec conversion, track filtering, and trim adjustment.

Features:
- New ConversionSubtitleOptions type with discard and codec properties
- Subtitle option accepts object or function for per-track configuration
- Subtitle cues are extracted, adjusted for trim, and converted to target format
- Fast path uses track.exportToText() when no trim/conversion needed
- Follows same pattern as video/audio track options

Supported conversions:
- WebVTT, SRT, ASS/SSA, TX3G, TTML across compatible containers
- MP4/MOV support: webvtt, tx3g, ttml
- MKV/WebM support: webvtt, srt, ass, ssa

Documentation:
- Added subtitle options section to converting-media-files.md
- Updated compatibility table with all subtitle codec support
- Added track-specific subtitle filtering examples
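The trim adjustment mentioned above could work roughly as follows. This is a sketch under assumptions: the `Cue` shape, the `trimCues` name, and the decision to clamp cues that straddle a trim boundary are all illustrative and not confirmed by the PR.

```typescript
// Hypothetical cue shape; times are in seconds.
interface Cue {
  start: number;
  end: number;
  text: string;
}

// Keep only cues that overlap [trimStart, trimEnd), clamp them to that window,
// and shift timestamps so the trimmed output starts at zero.
function trimCues(cues: Cue[], trimStart: number, trimEnd: number): Cue[] {
  const result: Cue[] = [];
  for (const cue of cues) {
    // Skip cues that lie entirely outside the trim window.
    if (cue.end <= trimStart || cue.start >= trimEnd) continue;
    result.push({
      text: cue.text,
      start: Math.max(cue.start, trimStart) - trimStart,
      end: Math.min(cue.end, trimEnd) - trimStart,
    });
  }
  return result;
}
```

When neither a trim nor a codec change is requested, a pass like this is unnecessary, which is presumably why the commit describes a fast path that exports the track text directly.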

Adds 24 tests for subtitle conversion in Conversion API covering all conversion scenarios and edge cases.

Test coverage:
- Basic conversions (SRT ↔ WebVTT, ASS ↔ SRT, etc.): 6 tests
- Track-specific options with function callbacks: 3 tests
- Trimming with timestamp adjustment: 3 tests
- Codec compatibility across formats: 6 tests
- External subtitle track addition: 3 tests
- Edge cases (empty tracks, validation, etc.): 3 tests

Fixes for ASS/SSA subtitle conversion:
- Generate default ASS header when converting from non-ASS formats
  Includes [Script Info], [V4+ Styles], and [Events] sections
- Preserve ASS header when converting ASS → ASS with trimming
- Strip ASS metadata when converting ASS → SRT/WebVTT
  Extract only text field from Dialogue lines, removing Layer/Style/Margin fields
- Handle plain text cues by creating proper ASS Dialogue format
  Format: Dialogue: 0,Start,End,Default,,0,0,0,,<text>

New helper function:
- extractTextFromAssCue(): Extracts plain text from ASS Dialogue/Comment lines
  Handles full ASS format, MKV with ReadOrder, and MKV without ReadOrder

All tests pass (92 total): SRT ↔ WebVTT ↔ ASS conversions with proper text extraction and metadata handling.
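The Dialogue format quoted above can be illustrated with a small round-trip sketch. The helpers `formatAssTime`, `toDialogueLine`, and `extractDialogueText` are hypothetical names, not the PR's `extractTextFromAssCue()`, and the sketch assumes the standard ten-field Dialogue layout only (it does not handle the MKV block variants).

```typescript
// Hypothetical helpers illustrating the standard ASS Dialogue layout.

// Format seconds as an ASS timestamp (H:MM:SS.cc, centisecond precision).
function formatAssTime(seconds: number): string {
  const cs = Math.round(seconds * 100);
  const h = Math.floor(cs / 360000);
  const m = Math.floor((cs % 360000) / 6000);
  const s = Math.floor((cs % 6000) / 100);
  const c = cs % 100;
  const pad = (n: number): string => String(n).padStart(2, '0');
  return `${h}:${pad(m)}:${pad(s)}.${pad(c)}`;
}

// Wrap plain text in a Dialogue line using the default style and zero margins.
function toDialogueLine(start: number, end: number, text: string): string {
  return `Dialogue: 0,${formatAssTime(start)},${formatAssTime(end)},Default,,0,0,0,,${text}`;
}

// Return the Text field: everything after the ninth comma.
// Commas inside the text are preserved because only the first nine are consumed.
function extractDialogueText(line: string): string {
  const body = line.replace(/^(Dialogue|Comment):\s*/, '');
  let index = -1;
  for (let i = 0; i < 9; i++) {
    index = body.indexOf(',', index + 1);
    if (index === -1) return body; // Too few fields: treat the input as plain text.
  }
  return body.slice(index + 1);
}
```

Counting a fixed number of commas instead of splitting on every comma is what keeps text such as `Hello, world` intact.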

Fix issues with ASS subtitle export where text would start with comma or contain duplicate field data due to incorrect format detection.

Changes:
- Add convertDialogueLineToMkvFormat helper to strip Dialogue prefix and timestamps before writing to MKV blocks
- Fix formatCuesToAss to use dynamic format parsing with field detection
- Adjust textIndex calculation for MKV blocks (subtract Start/End fields)
- Handle ReadOrder, standard, and plain text ASS format variants

Add edge case tests:
- Text starting with comma
- Text containing multiple commas
- MKV format with ReadOrder field
- Round-trip ASS conversion
- Empty fields preservation
- Format detection with different field structures

Fixes round-trip conversion failures and compatibility with mkvmerge/mkvextract.
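As an illustration of the prefix and timestamp stripping described above, here is a sketch of converting a standard Dialogue line into the field order Matroska uses for ASS blocks (ReadOrder, Layer, Style, Name, MarginL, MarginR, MarginV, Effect, Text). The function names `splitLeadingFields` and `dialogueToMkvBlock` are hypothetical and this is not the PR's `convertDialogueLineToMkvFormat`.

```typescript
// Split off the first `count` comma-separated fields; the remainder stays intact
// as the final element so commas inside the text are never touched.
function splitLeadingFields(body: string, count: number): string[] | null {
  const fields: string[] = [];
  let pos = 0;
  for (let i = 0; i < count; i++) {
    const comma = body.indexOf(',', pos);
    if (comma === -1) return null; // Not enough fields.
    fields.push(body.slice(pos, comma));
    pos = comma + 1;
  }
  fields.push(body.slice(pos));
  return fields;
}

// Hypothetical sketch: drop the "Dialogue:" prefix and the Start/End fields
// (block timing carries them in Matroska) and prepend a ReadOrder value.
function dialogueToMkvBlock(line: string, readOrder: number): string | null {
  const body = line.replace(/^Dialogue:\s*/, '');
  const fields = splitLeadingFields(body, 9);
  if (fields === null) return null;
  const [layer, , , style, name, marginL, marginR, marginV, effect, text] = fields;
  return [readOrder, layer, style, name, marginL, marginR, marginV, effect, text].join(',');
}
```

Because the text is taken as the untouched remainder, a cue whose text begins with a comma survives the conversion, which is one of the edge cases listed above.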

Add default values for layer field assignment to handle potential undefined array access.

…eadOrder field Detect ReadOrder format from actual MKV block data structure instead of relying on Format line field list. This prevents extra fields in Dialogue lines when MKV uses ReadOrder but Format line doesn't mention it. Fixes Aegisub "bad lexical cast" error.
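One way to detect the layout from the block data itself is sketched below. This is a heuristic of my own, not necessarily what the PR does: it assumes that a block with ReadOrder starts with two integer fields (ReadOrder, Layer), while a block without it starts with an integer Layer followed by a style name. A purely numeric style name would defeat it. The name `blockHasReadOrder` is hypothetical.

```typescript
// Hypothetical heuristic: inspect the leading fields of an MKV ASS block.
// With ReadOrder:    ReadOrder,Layer,Style,Name,MarginL,MarginR,MarginV,Effect,Text (9+ parts)
// Without ReadOrder: Layer,Style,Name,MarginL,MarginR,MarginV,Effect,Text
function blockHasReadOrder(block: string): boolean {
  const parts = block.split(',');
  // Fewer than nine comma-separated parts cannot hold all nine fields.
  if (parts.length < 9) return false;
  const [first = '', second = ''] = parts;
  const isInteger = (value: string): boolean => /^\d+$/.test(value.trim());
  // Assumption: ReadOrder and Layer are both integers; a style name usually is not.
  return isInteger(first) && isInteger(second);
}
```

Checking the data instead of the Format line avoids emitting an extra field when the two disagree, which is the mismatch the commit message describes.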

Vanilagy · 2025-11-10T14:37:09Z

Thanks! This is a large PR and thus will take some energy to review, so given that this feature is not on the immediate roadmap (although it is on the roadmap!), I'll deal with this PR at a later date. But don't worry, I won't forget about it!

wiedymi · 2025-11-10T18:56:12Z

Take your time


wiedymi and others added 7 commits October 3, 2025 16:08

Merge branch 'main' into subtitle-support

f4f57b8

wiedymi marked this pull request as ready for review October 6, 2025 08:27

wiedymi changed the title ~~[WIP] Extended subtitle support across containers~~ Extended subtitles support across containers Oct 6, 2025

wiedymi added 2 commits October 6, 2025 16:30

Fix TypeScript errors in ASS format parsing

81c3300

Add default values for layer field assignment to handle potential undefined array access.
