Whisper ASR for daily dialogue, exposed through a standardized OpenAI-compatible speech API. Whisper is a general-purpose speech recognition toolkit. Whisper models are trained on a large dataset of diverse audio and are multitask models that can perform multilingual speech recognition, speech translation, and language identification.
Run with Docker Compose (if the repository provides a compose file):

```shell
docker-compose up -d
```

Or run the container directly:
- CPU

```shell
docker run -d -p 9000:9000 \
  -e ASR_MODEL=base \
  -e ASR_ENGINE=openai_whisper \
  onerahmet/openai-whisper-asr-webservice:latest
```
- GPU

```shell
docker run -d --gpus all -p 9000:9000 \
  -e ASR_MODEL=base \
  -e ASR_ENGINE=openai_whisper \
  onerahmet/openai-whisper-asr-webservice:latest-gpu
```
To reduce container startup time by avoiding repeated downloads, you can persist the cache directory:
```shell
docker run -d -p 9000:9000 \
  -v $PWD/cache:/root/.cache/ \
  onerahmet/openai-whisper-asr-webservice:latest
```
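As a rough check that caching works, you can inspect the host directory after the first transcription. The sketch below only checks that the bind-mounted directory exists and is non-empty; the internal layout of `~/.cache` is an assumption about where the engine stores its downloads.

```shell
# After the first transcription, the persisted cache should contain model files.
# Exact subdirectory names depend on the engine, so we only check for non-emptiness.
if [ -d "$PWD/cache" ] && [ -n "$(ls -A "$PWD/cache" 2>/dev/null)" ]; then
  echo "cache populated"
else
  echo "cache empty (models will be re-downloaded on next start)"
fi
```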
- Download an English speech sample:

```shell
wget -O test.mp3 https://www.cambridgeenglish.org/images/153149-movers-sample-listening-test-vol2.mp3
```
- Test the ASR API with the sample (the service is reachable on `localhost:9000` via the published port):

```shell
curl http://localhost:9000/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F '[email protected]' \
  -F model="whisper-1"
```
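The curl invocation above can be wrapped in a small helper so you can transcribe arbitrary files against any host. The function name `transcribe` and the `ASR_HOST` variable are conveniences introduced here, not part of the service itself.

```shell
# Hypothetical wrapper around the transcription endpoint shown above.
# ASR_HOST defaults to the port published by the docker run commands.
transcribe() {
  host="${ASR_HOST:-http://localhost:9000}"
  curl -s "$host/audio/transcriptions" \
    -H "Content-Type: multipart/form-data" \
    -F "file=@$1" \
    -F model="whisper-1"
}
```

Usage: `transcribe test.mp3`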
Features:

- Multiple ASR engines support (OpenAI Whisper, Faster Whisper, WhisperX)
- Multiple output formats (text, JSON, VTT, SRT, TSV)
- Word-level timestamps support
- Voice activity detection (VAD) filtering
- Speaker diarization (with WhisperX)
- FFmpeg integration for broad audio/video format support
- GPU acceleration support
- Configurable model loading/unloading
- REST API with Swagger documentation
Key configuration options:

- `ASR_ENGINE`: engine selection (`openai_whisper`, `faster_whisper`, `whisperx`)
- `ASR_MODEL`: model selection (`tiny`, `base`, `small`, `medium`, `large-v3`, etc.)
- `ASR_MODEL_PATH`: custom path to store/load models
- `ASR_DEVICE`: device selection (`cuda`, `cpu`)
- `MODEL_IDLE_TIMEOUT`: timeout for model unloading
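These options compose on the Docker command line. For example, a GPU deployment that switches to Faster Whisper and unloads the idle model might look like the sketch below; the specific engine, model, and timeout values are illustrative (and the timeout is assumed to be in seconds), not recommendations.

```shell
# Illustrative combination of the configuration options above.
docker run -d --gpus all -p 9000:9000 \
  -e ASR_ENGINE=faster_whisper \
  -e ASR_MODEL=medium \
  -e ASR_DEVICE=cuda \
  -e MODEL_IDLE_TIMEOUT=300 \
  onerahmet/openai-whisper-asr-webservice:latest-gpu
```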
To run the service from source:

```shell
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync --upgrade

# Add the bundled ffmpeg to PATH
export PATH="${PATH}:${PWD}/bin"

# Run the service
uv run webservice.py --host 0.0.0.0 --port 9000
```
After starting the service, visit http://localhost:9000 (or http://0.0.0.0:9000) in your browser to access the Swagger UI documentation and try out the API endpoints.