aeneas is a Python/C library and a set of tools to automagically synchronize audio and text, a task known in computer science as forced alignment. Given a list of text fragments and an audio file containing the narration of that text, aeneas automatically generates a synchronization map between them.
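To make the idea of a synchronization map concrete, here is a minimal sketch in Python. The `Fragment` class, its field names, and the `fragment_at` helper are illustrative assumptions, not the actual aeneas output schema or API, and the timings are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fragment:
    """One entry of a hypothetical sync map: a text fragment and its audio interval."""
    fragment_id: str
    text: str
    begin: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


def fragment_at(sync_map: List[Fragment], t: float) -> Optional[Fragment]:
    """Return the fragment being narrated at time t, or None if there is none."""
    for frag in sync_map:
        if frag.begin <= t < frag.end:
            return frag
    return None


# Made-up timings, for illustration only.
SYNC_MAP = [
    Fragment("f001", "Sonnet I", 0.0, 2.5),
    Fragment("f002", "From fairest creatures we desire increase,", 2.5, 5.9),
    Fragment("f003", "That thereby beauty's rose might never die,", 5.9, 9.2),
]

if __name__ == "__main__":
    current = fragment_at(SYNC_MAP, 3.0)
    print(current.text if current else "(no fragment)")
```

A player that highlights text during playback could call a lookup like `fragment_at` with the current playback position; the half-open interval `[begin, end)` keeps adjacent fragments from overlapping at their shared boundary.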
Features
- aeneas has been developed and tested on 64-bit Debian with Python 2.7 and Python 3.5, which are currently the only supported platforms
- Documentation available
- All-in-one installers are available for Mac OS X and Windows
- Input text files in parsed, plain, subtitles, or unparsed (XML) format
- Multilevel input text files in mplain and munparsed (XML) format
- Text extraction from XML (e.g., XHTML) files using id and class attributes
- Arbitrary text fragment granularity (single word, subphrase, phrase, paragraph, etc.)
- Input audio file formats: all those readable by ffmpeg
- MFCC and DTW computed via Python C extensions to reduce the processing time
- Robust against misspelled/mispronounced words, local rearrangements of words, background noise/sporadic spikes
- Adjustable splitting times, including a max character/second constraint for closed captioning (CC) applications
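The feature list mentions DTW (dynamic time warping), a technique for matching two sequences that progress at different speeds. The sketch below is a textbook DTW on one-dimensional sequences in pure Python. It is not the aeneas implementation, which according to the list above operates on MFCC features inside Python C extensions; the function name `dtw_path` and the toy inputs are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def dtw_path(a: Sequence[float], b: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (total cost, warping path) aligning sequence a to sequence b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end to recover which indices were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # advance both sequences
            (cost[i - 1][j], i - 1, j),          # advance a only
            (cost[i][j - 1], i, j - 1),          # advance b only
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    total, path = dtw_path([0.0, 1.0, 2.0], [0.0, 0.0, 1.0, 2.0, 2.0])
    print(total, path)
```

Each pair `(i, j)` in the returned path says that element `i` of the first sequence was matched to element `j` of the second. In a forced-alignment setting the two sequences would be feature frames rather than single numbers, and this quadratic-time version would likely be too slow for long recordings, which is one reason to move the computation into compiled code.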
Categories
- Libraries
License
- GNU Affero General Public License (AGPL)