A web application that allows users to upload audio files and get transcriptions with timestamps and speaker identification.
- Upload audio files
- Automatic transcription using OpenAI's Whisper
- Display transcriptions with timestamps and speaker identification
- Modern and responsive UI
- Create a `.env` file in the project root with your Hugging Face token:
  HUGGING_FACE_TOKEN=your_hugging_face_token
- Install Python dependencies:
  pip install -r requirements.txt
- Run the application:
  python app.py
- Open your browser and navigate to http://localhost:5000
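
To check that the token from the first step is readable before starting the app, a small script like the one below can help. It is a minimal sketch using only the standard library; `load_env_file` and `get_hugging_face_token` are hypothetical helper names, and `app.py` may well load the `.env` file differently (for example through a dotenv library).

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines into a dict, skipping blanks and comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_hugging_face_token(path=".env"):
    """Return the token from the environment, falling back to the .env file."""
    token = os.environ.get("HUGGING_FACE_TOKEN") or load_env_file(path).get(
        "HUGGING_FACE_TOKEN"
    )
    # Assumption: the placeholder from the setup step is never a valid token.
    if not token or token == "your_hugging_face_token":
        raise RuntimeError(
            "HUGGING_FACE_TOKEN is missing or still set to the placeholder value"
        )
    return token
```

Calling `get_hugging_face_token()` from the project root raises an error if the variable is absent or still holds the placeholder value.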
- Choose an input method:
  - Click "Choose File" to select an existing audio file
  - Click "Start Recording" to record audio from your microphone
- Click "Upload and Transcribe"
- Wait for processing to complete
- View the transcript with timestamps and speaker identification
- Customize speaker names if needed
- Click "Update Speaker Names" to apply changes
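
The last steps (viewing a timestamped transcript and renaming speakers) can be pictured with a short sketch. The segment layout used here (`start`, `end`, `speaker`, `text`) and the `SPEAKER_00`-style labels are assumptions made for illustration, not the app's actual data model, and the function names are hypothetical.

```python
def format_timestamp(seconds):
    """Format a time offset in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def rename_speakers(segments, names):
    """Return new segments with speaker labels replaced per the `names` mapping.

    Labels that are not in the mapping are left unchanged.
    """
    return [
        {**seg, "speaker": names.get(seg["speaker"], seg["speaker"])}
        for seg in segments
    ]


def render_transcript(segments):
    """Render one line per segment: [timestamp] speaker: text."""
    return "\n".join(
        f"[{format_timestamp(seg['start'])}] {seg['speaker']}: {seg['text']}"
        for seg in segments
    )


# Hypothetical transcript data, for illustration only.
segments = [
    {"start": 0.0, "end": 4.2, "speaker": "SPEAKER_00", "text": "Hello, thanks for joining."},
    {"start": 4.2, "end": 65.5, "speaker": "SPEAKER_01", "text": "Happy to be here."},
]
renamed = rename_speakers(segments, {"SPEAKER_00": "Alice"})
print(render_transcript(renamed))
```

Because `rename_speakers` returns new dictionaries instead of editing in place, the original labels remain available if a name needs to be changed again later.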
- M4A
- WAV
- MP3
- Any other format supported by pydub
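
For files outside the formats named above, one option is to convert them to WAV before uploading. The sketch below is an illustration, not necessarily what the app does internally; `is_listed_format` and `convert_to_wav` are hypothetical helpers. It assumes pydub is installed and, for non-WAV input, that ffmpeg is available on the system.

```python
from pathlib import Path

# Extensions named explicitly in the list above.
LISTED_EXTENSIONS = {".m4a", ".wav", ".mp3"}


def is_listed_format(filename):
    """Return True if the file extension is one of the explicitly listed formats."""
    return Path(filename).suffix.lower() in LISTED_EXTENSIONS


def convert_to_wav(src_path, dst_path):
    """Decode an audio file with pydub and write it back out as WAV."""
    # Imported lazily so the extension check works without pydub installed.
    from pydub import AudioSegment

    audio = AudioSegment.from_file(src_path)  # format inferred from the extension
    audio.export(dst_path, format="wav")
    return dst_path
```

A typical use would be to call `convert_to_wav("interview.ogg", "interview.wav")` when `is_listed_format` returns `False`, then upload the resulting WAV file.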