Whisper ASR OpenAI API

Whisper ASR for daily dialogue with standardized OpenAI API speech interface. Whisper ASR is a general-purpose speech recognition toolkit. Whisper Models are trained on a large dataset of diverse audio and is also a multitask model that can perform multilingual speech recognition as well as speech translation and language identification.

Quick Usage

Docker

Build by yourself

docker-compose up -d

Run with image

CPU

docker run -d -p 9000:9000 \
  -e ASR_MODEL=base \
  -e ASR_ENGINE=openai_whisper \
  onerahmet/openai-whisper-asr-webservice:latest

GPU

docker run -d --gpus all -p 9000:9000 \
  -e ASR_MODEL=base \
  -e ASR_ENGINE=openai_whisper \
  onerahmet/openai-whisper-asr-webservice:latest-gpu

Cache

To reduce container startup time by avoiding repeated downloads, you can persist the cache directory:

docker run -d -p 9000:9000 \
  -v $PWD/cache:/root/.cache/ \
  onerahmet/openai-whisper-asr-webservice:latest

Test

Download english speech media sample

wget -O test.map3 https://www.cambridgeenglish.org/images/153149-movers-sample-listening-test-vol2.mp3

Test Asr Api by media sample

curl http://172.25.0.2:9000/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F '[email protected]' \
  -F model="whisper-1"

Key Features

Multiple ASR engines support (OpenAI Whisper, Faster Whisper, WhisperX)
Multiple output formats (text, JSON, VTT, SRT, TSV)
Word-level timestamps support
Voice activity detection (VAD) filtering
Speaker diarization (with WhisperX)
FFmpeg integration for broad audio/video format support
GPU acceleration support
Configurable model loading/unloading
REST API with Swagger documentation

Environment Variables

Key configuration options:

ASR_ENGINE: Engine selection (openai_whisper, faster_whisper, whisperx)
ASR_MODEL: Model selection (tiny, base, small, medium, large-v3, etc.)
ASR_MODEL_PATH: Custom path to store/load models
ASR_DEVICE: Device selection (cuda, cpu)
MODEL_IDLE_TIMEOUT: Timeout for model unloading

Development

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync --upgrade

# Export ffmpeg
export PATH="${PATH}:${PWD}/bin"

# Run service
uv run webservice.py --host 0.0.0.0 --port 9000

After starting the service, visit http://localhost:9000 or http://0.0.0.0:9000 in your browser to access the Swagger UI documentation and try out the API endpoints.

Credits

This software uses libraries from the FFmpeg project under the LGPLv2.1

Name		Name	Last commit message	Last commit date
Latest commit History 294 Commits
.github		.github
app		app
bin		bin
.dockerignore		.dockerignore
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
Dockerfile		Dockerfile
Dockerfile.gpu		Dockerfile.gpu
LICENCE		LICENCE
README.md		README.md
docker-compose.gpu.yml		docker-compose.gpu.yml
docker-compose.yml		docker-compose.yml
mkdocs.yml		mkdocs.yml
pyproject.toml		pyproject.toml
uv.lock		uv.lock
webservice.py		webservice.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Whisper ASR OpenAI API

Quick Usage

Docker

Build by yourself

Run with image

Cache

Test

Key Features

Environment Variables

Development

Credits

About

Uh oh!

Releases

Packages

Languages

License

RavenMuse/whisper-openapi

Folders and files

Latest commit

History

Repository files navigation

Whisper ASR OpenAI API

Quick Usage

Docker

Build by yourself

Run with image

Cache

Test

Key Features

Environment Variables

Development

Credits

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages