Commit 9604029

Update README.md
1 parent e3ed784 commit 9604029

1 file changed: +17 -3 lines changed


README.md

Lines changed: 17 additions & 3 deletions
@@ -18,11 +18,17 @@ We are keeping on improving the model quality and adding more features.
 ## Evaluation📊
 See [EVAL.md](EVAL.md) for objective evaluation results and comparisons with other baselines.
 ## Installation📥
-Suggested python 3.10 on Windows or Linux.
+Suggested python 3.10 on Windows, Mac M Series (Apple Silicon) or Linux.
+Windows and Linux:
 ```bash
 pip install -r requirements.txt
 ```
 
+Mac M Series:
+```bash
+pip install -r requirements-mac.txt
+```
+
 ## Usage🛠️
 We have released 3 models for different purposes:
 
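Note on the Installation hunk: the choice between the two requirements files can be sketched in Python as below. This is only an illustration, not part of the repository. The helper name `requirements_file` is hypothetical, and the sketch assumes Apple Silicon is detected as system `Darwin` with machine `arm64` (a Python build running under Rosetta would report `x86_64` instead). The README does not say which file Intel Macs should use, so falling back to `requirements.txt` there is an assumption.

```python
import platform


def requirements_file(system=None, machine=None):
    """Return the requirements file the README names for a platform.

    Both arguments default to the values reported by the running interpreter.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    # Assumption: Apple Silicon reports system "Darwin" and machine "arm64".
    if system == "Darwin" and machine == "arm64":
        return "requirements-mac.txt"
    # Windows and Linux share the default file; other platforms fall back
    # to it as well, which the README does not confirm.
    return "requirements.txt"


if __name__ == "__main__":
    # Print the install command for the current machine.
    print(f"pip install -r {requirements_file()}")
```

Passing the platform values explicitly, for example `requirements_file("Darwin", "arm64")`, makes the helper easy to check without access to each operating system.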
@@ -93,8 +99,9 @@ python real-time-gui.py --checkpoint <path-to-checkpoint> --config <path-to-conf
 - `checkpoint` is the path to the model checkpoint if you have trained or fine-tuned your own model, leave to blank to auto-download default model from huggingface. (`seed-uvit-tat-xlsr-tiny`)
 - `config` is the path to the model config if you have trained or fine-tuned your own model, leave to blank to auto-download default config from huggingface
 
-IMPORTANT: It is strongly recommended to use a GPU for real-time voice conversion.
-Some performance testing has been done on a NVIDIA RTX 3060 Laptop GPU, results and recommended parameter settings are listed below:
+> [!IMPORTANT]
+> It is strongly recommended to use a GPU for real-time voice conversion.
+> Some performance testing has been done on a NVIDIA RTX 3060 Laptop GPU, results and recommended parameter settings are listed below:
 
 | Model Configuration | Diffusion Steps | Inference CFG Rate | Max Prompt Length | Block Time (s) | Crossfade Length (s) | Extra context (left) (s) | Extra context (right) (s) | Latency (ms) | Inference Time per Chunk (ms) |
 |---------------------------------|-----------------|--------------------|-------------------|----------------|----------------------|--------------------------|---------------------------|--------------|-------------------------------|
@@ -186,8 +193,15 @@ where:
 - [x] Colab Notebook for fine-tuning example
 - [ ] Replace whisper with more advanced linguistic content extractor
 - [ ] More to be added
+- [x] Add Apple Silicon support
+
+## Known Issues
+- On Mac - running `real-time-gui.py` might raise an error `ModuleNotFoundError: No module named '_tkinter'`, in this case a new Python version **with Tkinter support** should be installed. Refer to [This Guide on stack overflow](https://stackoverflow.com/questions/76105218/why-does-tkinter-or-turtle-seem-to-be-missing-or-broken-shouldnt-it-be-part) for explanation of the problem and a detailed fix.
+
 
 ## CHANGELOGS🗒️
+- 2025-03-03:
+- Added Mac M Series (Apple Silicon) support
 - 2024-11-26:
 - Updated v1.0 tiny version pretrained model, optimized for real-time voice conversion
 - Support one-shot/few-shot single/multi speaker fine-tuning
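
Note on the Known Issues entry: whether a given Python installation is affected can be checked before launching `real-time-gui.py`. The following is a minimal sketch using only the standard library. The function name `tkinter_available` is hypothetical and not part of the repository. It relies on the fact that importing `tkinter` loads the `_tkinter` extension module, so a build without Tk support raises `ModuleNotFoundError`, which is a subclass of `ImportError`.

```python
def tkinter_available():
    """Return True if this interpreter can import tkinter (and thus _tkinter)."""
    try:
        import tkinter  # noqa: F401  (loads the _tkinter extension module)
    except ImportError:
        # Covers ModuleNotFoundError: No module named '_tkinter'.
        return False
    return True


if __name__ == "__main__":
    if tkinter_available():
        print("Tkinter is available in this Python installation.")
    else:
        print("Tkinter is missing; install a Python build with Tkinter support.")
```

If the check reports that Tkinter is missing, the Stack Overflow guide linked in the Known Issues entry describes the fix.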
