A pure Python web application for transcribing audio and video files using Whisper.
- Upload audio/video files (MP3, WAV, M4A, MP4, MOV, etc.)
- Transcribe content using OpenAI's Whisper API
- Copy or download transcription results
- Clean, responsive UI
- No JavaScript dependencies - pure server-side rendering
The app supports the following audio and video formats:
- Audio: .mp3, .wav, .ogg, .m4a, .flac, .aac
- Video: .mp4, .mov, .avi, .mkv, .webm, .mpeg, .mpg
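A check against this list could look like the sketch below. It is only an illustration: `classify_upload` and the two set names are hypothetical, and the app's real validation code is not shown here and may differ.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported-formats list above.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".ogg", ".m4a", ".flac", ".aac"}
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm", ".mpeg", ".mpg"}


def classify_upload(filename: str) -> Optional[str]:
    """Return "audio", "video", or None for an unsupported extension."""
    # Compare case-insensitively so "TALK.MP3" is treated like "talk.mp3".
    suffix = Path(filename).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None
```

Only the final extension is inspected, so this checks the file name, not the actual file contents.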
The app is built with:
- Backend: FastAPI + Uvicorn
- Frontend: HTML + CSS (No JavaScript)
- API: OpenAI Whisper API
- Templates: Jinja2
Running the app requires:
- Python 3.11+
- An OpenAI API key
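Both requirements can be checked before starting the server. The sketch below assumes the key is supplied through the `OPENAI_API_KEY` environment variable, as in the setup steps; the helper name `missing_prerequisites` is hypothetical and not part of the app.

```python
import os
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in the requirements


def missing_prerequisites(version_info=None, environ=None) -> list:
    """Return a list of human-readable problems; an empty list means ready."""
    # Both arguments default to the running interpreter and real environment,
    # but can be overridden to try the check with other values.
    version = tuple((sys.version_info if version_info is None else version_info)[:2])
    env = os.environ if environ is None else environ

    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, "
            f"found {version[0]}.{version[1]}"
        )
    if not env.get("OPENAI_API_KEY", "").strip():
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```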
- Clone the repository:
  git clone https://github.com/yourusername/whisper-transcription-app.git
  cd whisper-transcription-app
- Install dependencies:
  python3 -m pip install -r requirements.txt
- Set your OpenAI API key:
  export OPENAI_API_KEY=your_api_key_here
- Run the application:
  python3 -m uvicorn app:app --reload
- Open your browser and go to:
  http://localhost:8000
You can also run the application with Docker:
docker build -t whisper-transcription-app .
docker run -p 8000:8000 -e OPENAI_API_KEY=your_api_key_here whisper-transcription-app
- Open the application in your web browser
- Select an audio or video file using the upload form
- Click "Transcribe" to start the transcription process
- Once complete, view, copy, or download the transcription result
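Behind the "Transcribe" step, the server presumably forwards the uploaded file to the Whisper API. The sketch below shows one way that call could be made with the official `openai` Python package (version 1.x); the function name `transcribe_file`, the `client` parameter, and the choice of the `whisper-1` model are assumptions for illustration, not the app's actual code.

```python
from pathlib import Path


def transcribe_file(path, client=None, model="whisper-1"):
    """Send one media file to the transcription endpoint and return its text."""
    if client is None:
        # Imported lazily so the function can also be tried with a stand-in
        # client. OpenAI() reads OPENAI_API_KEY from the environment.
        from openai import OpenAI

        client = OpenAI()
    # The file is opened in binary mode and closed once the request returns.
    with Path(path).open("rb") as media:
        result = client.audio.transcriptions.create(model=model, file=media)
    return result.text
```

Accepting the client as a parameter keeps the network call replaceable, so the function can be exercised without an API key.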