Adopt aligner from "Huang et al., Less Peaky and More Accurate CTC Forced Alignment by Label Priors" #3826

dmitry-mli · 2024-08-21T18:12:25Z

🚀 The feature

Consider adopting the aligner from Huang et al., "Less Peaky and More Accurate CTC Forced Alignment by Label Priors" (@huangruizhe), into the existing set of aligners. In our tests it reduced word boundary error relative to the existing Wav2Vec2 CTC aligner by up to 60% at P50 on English.

Motivation, pitch

Today, torchaudio offers forced alignment through a simple, extensible interface. The recently published aligner from Huang et al., "Less Peaky and More Accurate CTC Forced Alignment by Label Priors" (GitHub), lowers the word boundary error (WBE; lower is better) compared to Wav2Vec2. We (@dmitry-mli @jamesr66a @websterbei) evaluated the model and saw WBE on our English samples decrease by up to 60% at P50, 45% at P70, and 15% at P95 compared to Wav2Vec2 CTC alignment.
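For readers who want to reproduce this kind of comparison, here is a minimal sketch of how percentile WBE figures could be computed. It assumes WBE is the absolute time difference between predicted and reference word boundaries, and that both aligners return the same words in the same order. The thread does not state the exact definition used, and the helper names `word_boundary_errors` and `percentile` are hypothetical, not part of torchaudio.

```python
from typing import List, Sequence, Tuple

Boundary = Tuple[float, float]  # (start_sec, end_sec) for one word


def word_boundary_errors(ref: Sequence[Boundary], hyp: Sequence[Boundary]) -> List[float]:
    """Absolute start and end time differences, in seconds, for matched words.

    Assumption: ref[i] and hyp[i] describe the same word.
    """
    if len(ref) != len(hyp):
        raise ValueError("ref and hyp must contain the same words in the same order")
    errors: List[float] = []
    for (r_start, r_end), (h_start, h_end) in zip(ref, hyp):
        errors.append(abs(r_start - h_start))
        errors.append(abs(r_end - h_end))
    return errors


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation between ranks, q in [0, 100]."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)
```

Under these assumptions, the relative improvement at a given percentile would be `1 - percentile(new_errors, q) / percentile(baseline_errors, q)`, computed separately for P50, P70, and P95.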

Alternatives

This request relates to a specific piece of research.

Additional context

Thanks for your consideration. @huangruizhe @jamesr66a @websterbei

huangruizhe · 2024-08-21T20:39:02Z

Thanks for your interest in our work and for sharing the nice results! As we have been switching between projects, things have been greatly delayed. Regarding the plan, I will be more available in late September and October, and I will work on it then!

dmitry-mli · 2024-08-28T18:54:37Z

Looking forward to it, thank you!

christincha · 2024-10-24T22:22:39Z

Following up here: is there an updated plan for incorporating the aligner from Huang et al., "Less Peaky and More Accurate CTC Forced Alignment by Label Priors", into the current PyTorch audio aligner?

huangruizhe · 2024-10-25T08:33:46Z

Hi @christincha, I am still working on it. Before it becomes official, if you want to run any experiments, you can check this out: https://colab.research.google.com/drive/1xciHB1Twi7VFutACrv94-Ejff1-VrjzL?usp=sharing
It includes different implementations of the proposed CTC loss, the training recipe, visualization tools, and a pretrained model.

sairitwik27 · 2025-05-06T10:16:57Z

Hello,
Is there any update on this front? Has the code been integrated into torch or torchaudio with the fix to CTC?

dmitry-mli changed the title ~~On-board aligner from "Huang et al., Less Peaky and More Accurate CTC Forced Alignment by Label Priors"~~ Adopt aligner from "Huang et al., Less Peaky and More Accurate CTC Forced Alignment by Label Priors" Aug 21, 2024
