Here is a simple Colab example for reference: [](https://colab.research.google.com/drive/1R1BJTqMsTXZzYAVx3j1BiemFXog9pbQG?usp=sharing)
README.md (4 additions, 1 deletion)
@@ -112,6 +112,7 @@ Fine-tuning on custom data allow the model to clone someone's voice more accurat
 A Colab Tutorial is here for you to follow: [](https://colab.research.google.com/drive/1R1BJTqMsTXZzYAVx3j1BiemFXog9pbQG?usp=sharing)
 1. Prepare your own dataset. It has to satisfy the following:
 - File structure does not matter
+- Each audio file should be between 1 and 30 seconds long; otherwise it will be ignored
 - All audio files should be in one of the following formats: `.wav` `.flac` `.mp3` `.m4a` `.opus` `.ogg`
 - Speaker label is not required, but make sure that each speaker has at least 1 utterance
 - Of course, the more data you have, the better the model will perform
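The dataset rules in this hunk (allowed formats, 1 to 30 seconds per file, any directory layout) can be checked before training. The following is a minimal sketch, not the project's own loader: `is_usable`, `find_audio_files`, and the constant names are hypothetical, and treating the bounds as inclusive and extensions as case-insensitive are assumptions. The duration is expected to be measured by the caller with whatever audio library is available.

```python
from pathlib import Path

# Formats listed in the README; matching is case-insensitive (assumption).
ALLOWED_EXTENSIONS = {".wav", ".flac", ".mp3", ".m4a", ".opus", ".ogg"}

# Length limits from the README; inclusive bounds are an assumption.
MIN_SECONDS = 1.0
MAX_SECONDS = 30.0


def is_usable(path, duration_seconds):
    """Return True if a file passes the format and length rules."""
    ext = Path(path).suffix.lower()
    return ext in ALLOWED_EXTENSIONS and MIN_SECONDS <= duration_seconds <= MAX_SECONDS


def find_audio_files(root):
    """Recursively collect candidate audio files; file structure does not matter."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in ALLOWED_EXTENSIONS
    )
```

For example, `is_usable("speaker/clip.mp3", 45.0)` is false because the clip is longer than 30 seconds, so under the rule above it would be ignored during training.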
@@ -143,7 +144,9 @@ where:
 - `save-every` is the number of steps between model checkpoint saves
 - `num-workers` is the number of workers for data loading, set to 0 for Windows
 
-4. After training, you can use the trained model for inference by specifying the path to the checkpoint and config file.
+4. If training accidentally stops, you can resume it by running the same command again; training will continue from the last checkpoint. (Make sure the `run-name` and `config` arguments are the same so that the latest checkpoint can be found.)
+
+5. After training, you can use the trained model for inference by specifying the path to the checkpoint and config file.
 - They should be under `./runs/<run-name>/`, with the checkpoint named `ft_model.pth` and config file with the same name as the training config file.
 - You still have to specify a reference audio file of the speaker you'd like to use during inference, similar to zero-shot usage.
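The output layout described here (`./runs/<run-name>/ft_model.pth` plus a config file named after the training config) can be turned into paths with a few lines. This is a minimal sketch under that stated layout: `finetuned_paths` and its parameters are hypothetical names, not part of the project, and the helper only builds paths without checking that the files exist.

```python
from pathlib import Path


def finetuned_paths(run_name, training_config, runs_dir="./runs"):
    """Build the expected checkpoint and config paths for a fine-tuning run."""
    run_dir = Path(runs_dir) / run_name
    # The README states the checkpoint is named ft_model.pth.
    checkpoint = run_dir / "ft_model.pth"
    # The saved config keeps the file name of the training config.
    config = run_dir / Path(training_config).name
    return checkpoint, config
```

Both returned paths are what you would pass to inference, together with a reference audio file of the target speaker.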