@wiedymi wiedymi commented Oct 3, 2025

No description provided.

wiedymi and others added 7 commits October 3, 2025 16:08
Implements full subtitle extraction, parsing, muxing and conversion for:
- MP4/MOV: WebVTT (wvtt) support with proper cue extraction
- MKV: SRT, WebVTT, ASS/SSA subtitle track support
- Subtitle codecs: WebVTT, SRT, ASS, SSA with format conversion
- Font attachment handling for ASS/SSA subtitles

Features:
- Subtitle track demuxing from ISOBMFF (MP4/MOV) and Matroska (MKV)
- Parse subtitle formats: WebVTT, SRT, ASS, SSA
- Mux subtitles into MP4/MOV/MKV containers
- Convert between subtitle formats
- Extract and embed fonts for styled subtitles
- Filter empty WebVTT cues (vtte boxes)
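As an illustration of the "convert between subtitle formats" item, here is a minimal sketch of an SRT to WebVTT text conversion. `srtToWebVtt` is a hypothetical helper written for this note, not a function from the PR; it assumes well-formed SRT input and only rewrites the timestamp separator, drops cue numbers, and skips empty cues.

```typescript
// Hypothetical helper (not the PR's code): converts SRT text to WebVTT text.
function srtToWebVtt(srt: string): string {
  // Normalize line endings, then split into cue blocks on blank lines.
  const blocks = srt.replace(/\r\n?/g, "\n").trim().split(/\n{2,}/);
  const cues: string[] = [];

  for (const block of blocks) {
    const lines = block.split("\n");
    // The timing line is the one containing the "-->" arrow;
    // anything before it (the SRT cue number) is dropped.
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue;

    // SRT uses a comma before the milliseconds, WebVTT uses a period.
    const timing = (lines[timingIndex] ?? "").replace(
      /(\d{2}:\d{2}:\d{2}),(\d{3})/g,
      "$1.$2",
    );
    const text = lines.slice(timingIndex + 1).join("\n");
    if (text.trim() === "") continue; // skip empty cues

    cues.push(`${timing}\n${text}`);
  }

  return `WEBVTT\n\n${cues.join("\n\n")}\n`;
}
```

Styling tags and positioning settings are not translated in this sketch; a real converter would need to decide how to handle them.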

Examples:
- Subtitle extraction example with UI
- Subtitle muxing example
- Test file generation script

Tests:
- 63 ISOBMFF subtitle tests
- 127 Matroska subtitle tests
- 128 advanced subtitle tests
- 263 extraction tests
- 205 parsing tests

Implements support for two additional subtitle formats in ISOBMFF containers:
- tx3g (3GPP Timed Text) - Apple's default subtitle format
- stpp (TTML/IMSC) - broadcast/professional standard

Changes:
- Add 'tx3g' and 'ttml' to SUBTITLE_CODECS list
- Map tx3g and stpp box types to codecs in demuxer
- Add tx3g and ttml parsers in SubtitleParser
- Map codecs to box names for muxing
- Generate test samples using ffmpeg and MP4Box
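The box-type mapping described in the changes above could look roughly like the following. This is a sketch under assumptions: the names `BOX_TYPE_TO_CODEC`, `CODEC_TO_BOX_TYPE`, `codecFromBoxType` and `boxTypeForCodec` are made up for illustration and are not taken from the PR; only the wvtt/tx3g/stpp box types and the webvtt/tx3g/ttml codec names come from the commit messages.

```typescript
// Codec names as used in the commit messages for MP4/MOV subtitle tracks.
type IsobmffSubtitleCodec = "webvtt" | "tx3g" | "ttml";

// Demuxing direction: sample entry box type -> codec (hypothetical table name).
const BOX_TYPE_TO_CODEC = new Map<string, IsobmffSubtitleCodec>([
  ["wvtt", "webvtt"],
  ["tx3g", "tx3g"],
  ["stpp", "ttml"],
]);

// Muxing direction: codec -> sample entry box type (hypothetical table name).
const CODEC_TO_BOX_TYPE: Record<IsobmffSubtitleCodec, string> = {
  webvtt: "wvtt",
  tx3g: "tx3g",
  ttml: "stpp",
};

// Returns the subtitle codec for a box type, or null for unrelated boxes.
function codecFromBoxType(boxType: string): IsobmffSubtitleCodec | null {
  return BOX_TYPE_TO_CODEC.get(boxType) ?? null;
}

// Returns the box type to write for a given codec.
function boxTypeForCodec(codec: IsobmffSubtitleCodec): string {
  return CODEC_TO_BOX_TYPE[codec];
}
```

Keeping both directions in explicit tables makes the stpp/ttml naming mismatch visible in one place.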

Test coverage:
- 6 new tests for tx3g (MP4/MOV detection, export, cue reading)
- 4 new tests for TTML (MP4/MOV detection, export, cue reading)
- All 68 subtitle tests passing

Sample files:
- test-mp4-tx3g.mp4, test-mov-tx3g.mov
- test-mp4-ttml.mp4, test-mov-ttml.mov

Note: CEA-608/708 (embedded in video SEI) and bitmap subtitles (VobSub/PGS)
require specialized tooling and are deferred to a future implementation.

Add missing tx3g and ttml mappings in muxers and fix SubtitleParser type:
- isobmff-muxer: Add tx3g and ttml to codec map
- matroska-muxer: Add tx3g/ttml fallback mappings
- subtitles.ts: Change SubtitleParserOptions.codec to SubtitleCodec type
- Import SubtitleCodec type in subtitles.ts

Resolves build errors from subtitle codec additions.

Allows subtitle tracks from input files to be processed during conversion,
with support for codec conversion, track filtering, and trim adjustment.

Features:
- New ConversionSubtitleOptions type with discard and codec properties
- Subtitle option accepts object or function for per-track configuration
- Subtitle cues are extracted, adjusted for trim, and converted to target format
- Fast path uses track.exportToText() when no trim/conversion needed
- Follows same pattern as video/audio track options
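To make the "object or function" option shape concrete, here is a hedged sketch of how such an option might be resolved per track. Only the `ConversionSubtitleOptions` name and its `discard` and `codec` properties come from the commit message; `SubtitleTrackInfo`, the callback signature `(track, n)`, and `resolveSubtitleOptions` are assumptions for illustration and may differ from the actual API.

```typescript
type SubtitleCodec = "webvtt" | "srt" | "ass" | "ssa" | "tx3g" | "ttml";

// Named in the PR; the exact field types here are assumed.
interface ConversionSubtitleOptions {
  discard?: boolean;
  codec?: SubtitleCodec;
}

// Hypothetical stand-in for an input subtitle track.
interface SubtitleTrackInfo {
  id: number;
  codec: SubtitleCodec;
  languageCode: string;
}

// Either one options object for all tracks, or a per-track callback.
type SubtitleOption =
  | ConversionSubtitleOptions
  | ((track: SubtitleTrackInfo, n: number) => ConversionSubtitleOptions);

// Resolves the option for one track and fills in defaults:
// keep the track, and keep its source codec unless one is requested.
function resolveSubtitleOptions(
  option: SubtitleOption | undefined,
  track: SubtitleTrackInfo,
  n: number,
): Required<ConversionSubtitleOptions> {
  const resolved =
    typeof option === "function" ? option(track, n) : (option ?? {});
  return {
    discard: resolved.discard ?? false,
    codec: resolved.codec ?? track.codec,
  };
}
```

With this shape, a callback such as `(track) => ({ discard: track.languageCode !== "eng" })` would keep only English tracks.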

Supported conversions:
- WebVTT, SRT, ASS/SSA, TX3G, TTML across compatible containers
- MP4/MOV support: webvtt, tx3g, ttml
- MKV/WebM support: webvtt, srt, ass, ssa
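The compatibility lists above can be expressed as a small lookup. This is a sketch only: `SUPPORTED_SUBTITLE_CODECS`, `canContain` and `chooseOutputCodec` are hypothetical names, and falling back to WebVTT when the source codec is unsupported is an assumption (chosen because WebVTT appears in both lists), not documented behavior of the PR.

```typescript
type SubtitleCodec = "webvtt" | "srt" | "ass" | "ssa" | "tx3g" | "ttml";
type Container = "mp4" | "mov" | "mkv" | "webm";

// Per-container codec support, transcribed from the lists above.
const SUPPORTED_SUBTITLE_CODECS: Record<Container, readonly SubtitleCodec[]> = {
  mp4: ["webvtt", "tx3g", "ttml"],
  mov: ["webvtt", "tx3g", "ttml"],
  mkv: ["webvtt", "srt", "ass", "ssa"],
  webm: ["webvtt", "srt", "ass", "ssa"],
};

// True if the container can carry the given subtitle codec.
function canContain(container: Container, codec: SubtitleCodec): boolean {
  return SUPPORTED_SUBTITLE_CODECS[container].includes(codec);
}

// Keeps the source codec when possible; otherwise falls back to WebVTT
// (assumed fallback, since WebVTT is listed for every container).
function chooseOutputCodec(
  container: Container,
  source: SubtitleCodec,
): SubtitleCodec {
  return canContain(container, source) ? source : "webvtt";
}
```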

Documentation:
- Added subtitle options section to converting-media-files.md
- Updated compatibility table with all subtitle codec support
- Added track-specific subtitle filtering examples

Adds 24 tests for subtitle conversion in the Conversion API covering all
conversion scenarios and edge cases.

Test coverage:
- Basic conversions (SRT ↔ WebVTT, ASS ↔ SRT, etc.) - 6 tests
- Track-specific options with function callbacks - 3 tests
- Trimming with timestamp adjustment - 3 tests
- Codec compatibility across formats - 6 tests
- External subtitle track addition - 3 tests
- Edge cases (empty tracks, validation, etc.) - 3 tests
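The trimming tests above exercise timestamp adjustment. One plausible way to implement that adjustment is sketched below. The `Cue` shape and `adjustCuesForTrim` are hypothetical, times are assumed to be in seconds, and clipping cues that straddle a trim boundary (rather than dropping them) is an assumption; the PR may handle that case differently.

```typescript
// Hypothetical cue shape; all times are in seconds.
interface Cue {
  timestamp: number;
  duration: number;
  text: string;
}

// Keeps only the parts of cues inside [trimStart, trimEnd) and shifts
// them so that trimStart becomes time 0 in the output.
function adjustCuesForTrim(
  cues: Cue[],
  trimStart: number,
  trimEnd: number,
): Cue[] {
  const result: Cue[] = [];
  for (const cue of cues) {
    // Clip the cue to the trim window.
    const start = Math.max(cue.timestamp, trimStart);
    const end = Math.min(cue.timestamp + cue.duration, trimEnd);
    if (end <= start) continue; // entirely outside the window

    result.push({
      timestamp: start - trimStart,
      duration: end - start,
      text: cue.text,
    });
  }
  return result;
}
```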

Fixes for ASS/SSA subtitle conversion:
- Generate default ASS header when converting from non-ASS formats
  Includes [Script Info], [V4+ Styles], and [Events] sections
- Preserve ASS header when converting ASS → ASS with trimming
- Strip ASS metadata when converting ASS → SRT/WebVTT
  Extract only text field from Dialogue lines, removing Layer/Style/Margin fields
- Handle plain text cues by creating proper ASS Dialogue format
  Format: Dialogue: 0,Start,End,Default,,0,0,0,,<text>
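The plain-text case above can be sketched as follows. `formatAssTime` and `plainTextToDialogue` are hypothetical helpers, not the PR's code; the sketch assumes times in seconds, uses the ASS `H:MM:SS.cc` timestamp form (centisecond precision), and encodes line breaks as `\N`.

```typescript
// Pads a non-negative integer to two digits.
function pad2(n: number): string {
  return n < 10 ? `0${n}` : String(n);
}

// Formats seconds as an ASS timestamp: H:MM:SS.cc (centiseconds).
function formatAssTime(seconds: number): string {
  const totalCs = Math.round(seconds * 100);
  const cs = totalCs % 100;
  const totalSeconds = Math.floor(totalCs / 100);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  return `${h}:${pad2(m)}:${pad2(s)}.${pad2(cs)}`;
}

// Wraps plain cue text in a Dialogue line using the Default style
// and zeroed margins, following the format quoted above.
function plainTextToDialogue(text: string, start: number, end: number): string {
  // ASS represents hard line breaks inside the Text field as \N.
  const assText = text.replace(/\r?\n/g, "\\N");
  return `Dialogue: 0,${formatAssTime(start)},${formatAssTime(end)},Default,,0,0,0,,${assText}`;
}
```

Because Text is the last field, commas inside the text do not need escaping.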

New helper function:
- extractTextFromAssCue(): Extracts plain text from ASS Dialogue/Comment lines
  Handles full ASS format, MKV with ReadOrder, and MKV without ReadOrder
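For readers unfamiliar with the three variants, the sketch below shows one way such text extraction could work. It is not the PR's `extractTextFromAssCue`; `extractAssText` and `splitLeadingFields` are hypothetical, and the check that distinguishes the two MKV layouts (first two fields being integers) is a heuristic assumption that can misfire if a style name is purely numeric.

```typescript
// Splits off the first `count` comma-separated fields; the remainder
// (which may itself contain commas) is returned as `text`.
function splitLeadingFields(
  input: string,
  count: number,
): { head: string[]; text: string } | null {
  const head: string[] = [];
  let rest = input;
  for (let i = 0; i < count; i++) {
    const comma = rest.indexOf(",");
    if (comma === -1) return null; // not enough fields
    head.push(rest.slice(0, comma));
    rest = rest.slice(comma + 1);
  }
  return { head, text: rest };
}

function isInteger(value: string | undefined): boolean {
  return value !== undefined && /^\d+$/.test(value.trim());
}

// Hypothetical sketch: returns only the Text field of an ASS cue.
function extractAssText(cue: string): string {
  // Full ASS line: 9 fields (Layer..Effect incl. Start/End) precede Text.
  const prefix = /^(Dialogue|Comment):\s*/.exec(cue);
  if (prefix) {
    const full = splitLeadingFields(cue.slice(prefix[0].length), 9);
    return full ? full.text : cue;
  }
  // MKV block with ReadOrder: ReadOrder,Layer,Style,... (8 fields before Text).
  const withReadOrder = splitLeadingFields(cue, 8);
  if (
    withReadOrder &&
    isInteger(withReadOrder.head[0]) &&
    isInteger(withReadOrder.head[1])
  ) {
    return withReadOrder.text;
  }
  // MKV block without ReadOrder: Layer,Style,... (7 fields before Text).
  const withoutReadOrder = splitLeadingFields(cue, 7);
  if (withoutReadOrder && isInteger(withoutReadOrder.head[0])) {
    return withoutReadOrder.text;
  }
  return cue; // treat anything else as plain text
}
```

Override tags such as `{\i1}` are left in the returned text; stripping them would be a separate step.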

All tests pass (92 total): SRT ↔ WebVTT ↔ ASS conversions with proper
text extraction and metadata handling.

Fix issues with ASS subtitle export where text would start with a comma
or contain duplicate field data due to incorrect format detection.

Changes:
- Add convertDialogueLineToMkvFormat helper to strip Dialogue prefix
  and timestamps before writing to MKV blocks
- Fix formatCuesToAss to use dynamic format parsing with field detection
- Adjust textIndex calculation for MKV blocks (subtract Start/End fields)
- Handle ReadOrder, standard, and plain text ASS format variants
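The prefix-and-timestamp stripping described above can be sketched like this. `dialogueLineToMkvBlock` is a hypothetical stand-in for the PR's `convertDialogueLineToMkvFormat`, whose real signature is not shown here; the sketch assumes the standard field order `Layer,Start,End,Style,...` and the Matroska block layout `ReadOrder,Layer,Style,Name,MarginL,MarginR,MarginV,Effect,Text`.

```typescript
// Hypothetical sketch: turns a full ASS Dialogue line into the payload
// stored in a Matroska block. Timing is dropped because the container
// carries it; ReadOrder is supplied by the caller.
function dialogueLineToMkvBlock(line: string, readOrder: number): string | null {
  const prefix = /^Dialogue:\s*/.exec(line);
  if (!prefix) return null; // not a Dialogue line

  const body = line.slice(prefix[0].length);

  // Split only on the first three commas (Layer, Start, End), so commas
  // later in the line, including a leading comma in Text, stay untouched.
  const firstComma = body.indexOf(",");
  const secondComma = firstComma === -1 ? -1 : body.indexOf(",", firstComma + 1);
  const thirdComma = secondComma === -1 ? -1 : body.indexOf(",", secondComma + 1);
  if (thirdComma === -1) return null; // malformed line

  const layer = body.slice(0, firstComma);
  const afterTimestamps = body.slice(thirdComma + 1); // Style,...,Text

  return `${readOrder},${layer},${afterTimestamps}`;
}
```

Splitting by position rather than with a full `split(",")` is what keeps text that starts with or contains commas intact.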

Add edge case tests:
- Text starting with comma
- Text containing multiple commas
- MKV format with ReadOrder field
- Round-trip ASS conversion
- Empty fields preservation
- Format detection with different field structures

Fixes round-trip conversion failures and compatibility with mkvmerge/mkvextract.
@wiedymi wiedymi marked this pull request as ready for review October 6, 2025 08:27
@wiedymi wiedymi changed the title from "[WIP] Extended subtitle support across containers" to "Extended subtitles support across containers" Oct 6, 2025
Add default values for layer field assignment to handle potential
undefined array access.

…eadOrder field

Detect ReadOrder format from actual MKV block data structure instead of
relying on Format line field list. This prevents extra fields in Dialogue
lines when MKV uses ReadOrder but Format line doesn't mention it.

Fixes Aegisub "bad lexical cast" error.
@Vanilagy
Owner

Thanks! This is a large PR and thus will take some energy to review, so given that this feature is not on the immediate roadmap (although it is on the roadmap!), I'll deal with this PR at a later date. But don't worry, I won't forget about it!

@wiedymi
Author

wiedymi commented Nov 10, 2025

Take your time