Tacotron is an end-to-end speech synthesis model, first introduced in *Tacotron: Towards End-to-End Speech Synthesis*. It takes character-level text as input and predicts mel filterbanks and the linear spectrogram as targets. Although it is a generative model, I wanted to test how well it can be applied to the speech recognition task.
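As a rough illustration of what "character-level input" means here, below is a minimal text-to-index mapping. The vocabulary string and helper names are hypothetical, not the ones actually defined in this repo:

```python
# Hypothetical character vocabulary; the real one lives in hyperparams.py.
# "P" marks padding and "E" marks end-of-sentence, a common Tacotron convention.
vocab = "PE abcdefghijklmnopqrstuvwxyz'.?"
char2idx = {c: i for i, c in enumerate(vocab)}
idx2char = {i: c for i, c in enumerate(vocab)}

def text_to_ids(text):
    """Map a transcript to integer IDs, dropping out-of-vocabulary characters."""
    return [char2idx[c] for c in text.lower() if c in char2idx]

def ids_to_text(ids):
    """Inverse mapping, useful for reading model outputs during evaluation."""
    return "".join(idx2char[i] for i in ids)
```

For speech recognition these IDs become the decoder targets rather than the encoder inputs, the reverse of the original synthesis setup.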
## Requirements

- NumPy >= 1.11.1
- TensorFlow == 1.1
- librosa
## Data

I use the VCTK Corpus, one of the most popular speech corpora, for this experiment. Because it has no pre-defined train/evaluation split, 10 * (mini-batch size) samples that do not appear in the training set are reserved for evaluation.
## File description

- `hyperparams.py` includes all hyperparameters.
- `prepro.py` creates training and evaluation data in the `data/` folder.
- `data_load.py` loads data and puts it in queues so multiple mini-batches are generated in parallel.
- `utils.py` has some operational functions.
- `modules.py` contains building blocks for the encoding and decoding networks.
- `networks.py` defines the encoding and decoding networks.
- `train.py` executes training.
- `eval.py` executes evaluation.
## Training

- STEP 1. Download and extract the VCTK Corpus, and adjust the value of `vctk` in `hyperparams.py`.
- STEP 2. Adjust other hyperparameters in `hyperparams.py` if necessary.
- STEP 3. Run `train_multiple_gpus.py` if you want to use more than one GPU, otherwise `train.py`.
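To make STEP 1 concrete, `hyperparams.py` holds settings like the corpus path referenced above. The sketch below is purely illustrative; apart from `vctk`, the attribute names and values are hypothetical, not the repo's actual ones:

```python
# Hypothetical sketch of the kind of module hyperparams.py is;
# only the `vctk` name is taken from the instructions above.
class Hyperparams:
    vctk = "/path/to/VCTK-Corpus"  # STEP 1: point this at the extracted corpus
    batch_size = 32                # illustrative value
    lr = 0.001                     # illustrative value
```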
## Evaluation

- Run `eval.py` to get speech recognition results for the test set.
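A common way to score such recognition results is the character error rate (CER), i.e. edit distance divided by reference length. This is a minimal, self-contained sketch, not the repo's actual scoring code:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, via dynamic programming
    over a single row of the edit matrix."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev = dp[0]
        dp[0] = i
        for j, h in enumerate(hyp, 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,         # deletion
                        dp[j - 1] + 1,     # insertion
                        prev + (r != h))   # substitution (free if chars match)
            prev = cur
    return dp[-1]

def cer(ref, hyp):
    """Character error rate: edits needed to turn hyp into ref, per ref char."""
    return edit_distance(ref, hyp) / max(len(ref), 1)
```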