ChatterboxToolkitUI 🎙️🧠

ChatterboxToolkitUI is a comprehensive web application built with Gradio that provides a user-friendly interface for advanced audio generation and workflow management using resemble-ai's Chatterbox model.

Key Features

Centralized Project Management: Create, select, and manage dedicated workspaces. All inputs, processed files, and generated outputs are automatically organized into a clean folder structure within the active project or a base directory.
Single Generation:
- Text-to-Speech (TTS): Generate high-quality speech from text using a reference audio file to clone the speaker's voice.
- Voice Conversion (VC): Transfer the vocal characteristics of a reference speaker onto a source audio recording.
- Parameter Sweeping: Generate multiple variations of a single output at once by sweeping across a range of values for any generation parameter (e.g., Temperature, Pace, etc.).
Batch Processing:
- Process entire folders of prepared text or audio files in bulk with a single click.
- Optionally concatenate all generated files from a batch run into a single, continuous audio file.
Data Preparation Suite:
- Text Splitter: Automatically chunk long text documents into smaller, model-friendly segments based on sentence boundaries and a configurable character limit.
- Audio Splitter: Intelligently split long audio recordings into shorter clips by making cuts during periods of silence, with configurable duration and silence detection parameters.
Workflow-Integrated Editing & Refinement:
- Regenerate Audio: The "Regenerate" workflow allows you to review individual audio files from a batch run, send them back to the Single TTS tab with their original text and voice pre-loaded, tweak parameters, and replace the old file with the new-and-improved version.
- Live Text Editor: Directly edit the content of your processed text files within the UI and save the changes, perfect for making small script adjustments without leaving the application.
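The Text Splitter feature above packs sentences into chunks under a character limit. The toolkit's actual implementation is not shown here, so the following is only a minimal sketch of that idea. The function name `split_text`, the `max_chars` parameter, and the simple punctuation-based sentence rule are all assumptions made for illustration.

```python
import re


def split_text(text: str, max_chars: int = 300) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters."""
    # Naive sentence boundary (an assumption): whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single sentence longer than the limit is kept whole in this sketch.
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

A real splitter would likely also need to handle abbreviations, quotations, and sentences that exceed the limit on their own.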

[Full Feature Rundown Video]

Prerequisites

Before you begin, ensure you have the following installed on your system:

Python: Version 3.11 is required.
Git: For cloning the repository.
CUDA-Compatible GPU: Highly recommended for acceptable performance. The underlying models are extremely slow on a CPU.
FFmpeg: A critical dependency used for audio processing tasks.
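The list above can be checked quickly before installing. The sketch below is not part of the project; the function name `check_prerequisites` is hypothetical. It verifies only the Python version and that `git` and `ffmpeg` are on the PATH. It does not check for a GPU.

```python
import shutil
import sys


def check_prerequisites(required=(3, 11), tools=("git", "ffmpeg")) -> dict[str, bool]:
    """Report whether the running Python matches and each tool is on the PATH."""
    results = {"python": sys.version_info[:2] == tuple(required)}
    for tool in tools:
        # shutil.which returns None when the executable cannot be found.
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```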

Installation Instructions

Follow these steps to set up and run ChatterboxToolkitUI on your local machine, or use the ChatterboxToolkitUI.ipynb notebook to run it in a Colab environment.

1. Clone the Repository

Open your terminal or command prompt and clone the repository.

git clone https://github.com/dasjoms/ChatterboxToolkitUI.git
cd ChatterboxToolkitUI

2. Set Up a Python Virtual Environment

Create a virtual environment using Python 3.11 to avoid dependency conflicts. The commands in this step use the Windows py launcher and Windows paths.

py -3.11 -m venv toolkit

Activate the virtual environment.

.\toolkit\Scripts\activate

3. Install the Project and Dependencies

Users with NVIDIA 10-series cards or AMD GPUs need to manually install the appropriate torch 2.6.0 build for their hardware. Everyone else can install directly from requirements.txt.

pip install -r requirements.txt

Running the Application

With your virtual environment still active, run the script:

python ChatterboxToolkitUI.py

Once running, you will see output in your terminal like this:

* Running on local URL:  http://127.0.0.1:7860

Open the local URL in your web browser to use the application.

A Typical Workflow

Create a Project: Navigate to the "Projects" tab and create a new project.
Prepare Data:
- Upload a long text file to the project's input_files folder using the "Project Utilities" uploader or manually move it there.
- Go to the "Data Preparation" tab to split the text into manageable chunks. The outputs will be saved to the processed_text/ folders.
Generate Audio:
- Go to the "Batch Generation" tab.
- Load your processed files from the project.
- Select a reference voice and your desired parameters.
- Run the batch generation.
Review & Refine:
- Go to the "Edit Project Data" tab.
- Use the "Regenerate" sub-tab to listen to your outputs. If one is imperfect, send it to the Single TTS tab, tweak the parameters, and replace it.
- Use the "Edit Text" sub-tab to fix any pronunciation issues you find in your source text chunks.
- Concatenate all audio files into one.
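The final concatenation step can also be done outside the UI. The sketch below uses only Python's standard `wave` module and assumes that the generated files are uncompressed PCM WAV files sharing the same channel count, sample width, and sample rate. The function name `concatenate_wavs` is hypothetical, and this is not necessarily how the toolkit itself joins files.

```python
import wave
from pathlib import Path


def concatenate_wavs(inputs: list[Path], output: Path) -> None:
    """Append the frames of each WAV file, in the given order, into one output file."""
    if not inputs:
        raise ValueError("no input files given")
    with wave.open(str(inputs[0]), "rb") as first:
        params = first.getparams()
    with wave.open(str(output), "wb") as out:
        out.setparams(params)
        for path in inputs:
            with wave.open(str(path), "rb") as src:
                # Channels, sample width and sample rate must all match.
                if src.getparams()[:3] != params[:3]:
                    raise ValueError(f"{path} has a different audio format")
                out.writeframes(src.readframes(src.getnframes()))
```

The order of `inputs` determines the order of the audio, so sort the file list the way the chunks were numbered before calling it.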
