@@ -93,7 +99,8 @@
 - `checkpoint` is the path to the model checkpoint if you have trained or fine-tuned your own model, leave to blank to auto-download default model from huggingface. (`seed-uvit-tat-xlsr-tiny`)
 - `config` is the path to the model config if you have trained or fine-tuned your own model, leave to blank to auto-download default config from huggingface
 
-IMPORTANT: It is strongly recommended to use a GPU for real-time voice conversion.
-Some performance testing has been done on a NVIDIA RTX 3060 Laptop GPU, results and recommended parameter settings are listed below:
+> [!IMPORTANT]
+> It is strongly recommended to use a GPU for real-time voice conversion.
+> Some performance testing has been done on a NVIDIA RTX 3060 Laptop GPU, results and recommended parameter settings are listed below:
 
 | Model Configuration | Diffusion Steps | Inference CFG Rate | Max Prompt Length | Block Time (s) | Crossfade Length (s) | Extra context (left) (s) | Extra context (right) (s) | Latency (ms) | Inference Time per Chunk (ms) |
@@ -187,7 +194,14 @@
 - [ ] Replace whisper with more advanced linguistic content extractor
 - [ ] More to be added
+- [x] Add Apple Silicon support
+
+## Known Issues
+- On Mac - running `real-time-gui.py` might raise an error `ModuleNotFoundError: No module named '_tkinter'`, in this case a new Python version **with Tkinter support** should be installed. Refer to [This Guide on stack overflow](https://stackoverflow.com/questions/76105218/why-does-tkinter-or-turtle-seem-to-be-missing-or-broken-shouldnt-it-be-part) for explanation of the problem and a detailed fix.
+
 
 ## CHANGELOGS🗒️
+- 2025-03-03:
+    - Added Mac M Series (Apple Silicon) support
 - 2024-11-26:
     - Updated v1.0 tiny version pretrained model, optimized for real-time voice conversion
     - Support one-shot/few-shot single/multi speaker fine-tuning
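
Reviewer note on the Known Issues entry in this change: the `_tkinter` error can be detected before launching `real-time-gui.py`. The following is a minimal sketch, not part of the commit; the helper name `tkinter_available` is hypothetical, and it uses only the standard-library `importlib.util.find_spec` to check whether the current interpreter can locate both `tkinter` and its C extension `_tkinter`.

```python
import importlib.util
import sys


def tkinter_available() -> bool:
    """Return True if both `tkinter` and its C extension `_tkinter` can be located.

    Hypothetical helper for illustration; it only looks the modules up
    and does not import them or open any window.
    """
    return (
        importlib.util.find_spec("tkinter") is not None
        and importlib.util.find_spec("_tkinter") is not None
    )


if __name__ == "__main__":
    if tkinter_available():
        print("tkinter found: the GUI should be able to start.")
    else:
        # Matches the situation described in Known Issues: this interpreter
        # was built without Tkinter support.
        print(f"tkinter not found for interpreter: {sys.executable}")
        print("Install a Python build with Tkinter support before running the GUI.")
```

Running this with the same interpreter that will launch the GUI shows whether the `ModuleNotFoundError` described above is likely to occur.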