Commit 3e152dc

update V2 model parameter count
1 parent 194d28b commit 3e152dc

1 file changed

+6
-6
lines changed

README.md

Lines changed: 6 additions & 6 deletions
@@ -37,12 +37,12 @@ pip install triton-windows==3.2.0.post13
 ## Usage🛠️
 We have released 4 models for different purposes:
 
-| Version | Name | Purpose | Sampling Rate | Content Encoder | Vocoder | Hidden Dim | N Layers | Params | Remarks |
-|---------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------|---------------|------------------------------------------------------------------------|---------|------------|----------|--------|--------------------------------------------------------|
-| v1.0 | seed-uvit-tat-xlsr-tiny ([🤗](https://huggingface.co/Plachta/Seed-VC/blob/main/DiT_uvit_tat_xlsr_ema.pth)[📄](configs/presets/config_dit_mel_seed_uvit_xlsr_tiny.yml)) | Voice Conversion (VC) | 22050 | XLSR-large | HIFT | 384 | 9 | 25M | suitable for real-time voice conversion |
-| v1.0 | seed-uvit-whisper-small-wavenet ([🤗](https://huggingface.co/Plachta/Seed-VC/blob/main/DiT_seed_v2_uvit_whisper_small_wavenet_bigvgan_pruned.pth)[📄](configs/presets/config_dit_mel_seed_uvit_whisper_small_wavenet.yml)) | Voice Conversion (VC) | 22050 | Whisper-small | BigVGAN | 512 | 13 | 98M | suitable for offline voice conversion |
-| v1.0 | seed-uvit-whisper-base ([🤗](https://huggingface.co/Plachta/Seed-VC/blob/main/DiT_seed_v2_uvit_whisper_base_f0_44k_bigvgan_pruned_ft_ema.pth)[📄](configs/presets/config_dit_mel_seed_uvit_whisper_base_f0_44k.yml)) | Singing Voice Conversion (SVC) | 44100 | Whisper-small | BigVGAN | 768 | 17 | 200M | strong zero-shot performance, singing voice conversion |
-| v2.0 | hubert-bsqvae-small ([🤗](https://huggingface.co/Plachta/Seed-VC/blob/main/v2)[📄](configs/v2/vc_wrapper.yaml)) | Voice & Accent Conversion (VC) | 22050 | [ASTRAL-Quantization](https://github.com/Plachtaa/ASTRAL-quantization) | BigVGAN | 512 | 13 | 67M | Best in suppressing source speaker traits |
+| Version | Name | Purpose | Sampling Rate | Content Encoder | Vocoder | Hidden Dim | N Layers | Params | Remarks |
+|---------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------|---------------|------------------------------------------------------------------------|---------|------------|----------|--------------------|--------------------------------------------------------|
+| v1.0 | seed-uvit-tat-xlsr-tiny ([🤗](https://huggingface.co/Plachta/Seed-VC/blob/main/DiT_uvit_tat_xlsr_ema.pth)[📄](configs/presets/config_dit_mel_seed_uvit_xlsr_tiny.yml)) | Voice Conversion (VC) | 22050 | XLSR-large | HIFT | 384 | 9 | 25M | suitable for real-time voice conversion |
+| v1.0 | seed-uvit-whisper-small-wavenet ([🤗](https://huggingface.co/Plachta/Seed-VC/blob/main/DiT_seed_v2_uvit_whisper_small_wavenet_bigvgan_pruned.pth)[📄](configs/presets/config_dit_mel_seed_uvit_whisper_small_wavenet.yml)) | Voice Conversion (VC) | 22050 | Whisper-small | BigVGAN | 512 | 13 | 98M | suitable for offline voice conversion |
+| v1.0 | seed-uvit-whisper-base ([🤗](https://huggingface.co/Plachta/Seed-VC/blob/main/DiT_seed_v2_uvit_whisper_base_f0_44k_bigvgan_pruned_ft_ema.pth)[📄](configs/presets/config_dit_mel_seed_uvit_whisper_base_f0_44k.yml)) | Singing Voice Conversion (SVC) | 44100 | Whisper-small | BigVGAN | 768 | 17 | 200M | strong zero-shot performance, singing voice conversion |
+| v2.0 | hubert-bsqvae-small ([🤗](https://huggingface.co/Plachta/Seed-VC/blob/main/v2)[📄](configs/v2/vc_wrapper.yaml)) | Voice & Accent Conversion (VC) | 22050 | [ASTRAL-Quantization](https://github.com/Plachtaa/ASTRAL-quantization) | BigVGAN | 512 | 13 | 67M(CFM) + 90M(AR) | Best in suppressing source speaker traits |
 
 Checkpoints of the latest model release will be downloaded automatically when first run inference.
 If you are unable to access huggingface for network reason, try using mirror by adding `HF_ENDPOINT=https://hf-mirror.com` before every command.
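
The last context line of the hunk describes prefixing each command with `HF_ENDPOINT=https://hf-mirror.com`. As a rough sketch of the same idea from Python, the snippet below sets that variable only for a child process. The helper names `env_with_mirror` and `run_with_mirror` are hypothetical and not part of this repository, and the command being run is a stand-in that just prints the variable; substitute whatever inference command you actually use.

```python
import os
import subprocess
import sys

# Mirror URL quoted in the README text above.
HF_MIRROR = "https://hf-mirror.com"


def env_with_mirror(base_env=None):
    """Return a copy of the environment with HF_ENDPOINT set to the mirror.

    Hypothetical helper for illustration; the caller's environment is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HF_ENDPOINT"] = HF_MIRROR
    return env


def run_with_mirror(command):
    """Run `command` (a list of arguments) with HF_ENDPOINT set for that process only."""
    return subprocess.run(
        command,
        env=env_with_mirror(),
        check=True,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Stand-in command: print the variable as the child process sees it.
    result = run_with_mirror(
        [sys.executable, "-c", "import os; print(os.environ['HF_ENDPOINT'])"]
    )
    print(result.stdout.strip())
```

Setting the variable per process mirrors the "before every command" wording and leaves the parent shell's environment untouched.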
