
MurmurAI

🎙️ Drop-in replacement for paid transcription APIs. Self-hosted, GPU-powered, speaker diarization. Free forever.

PyPI · CI · Python 3.12 · MIT License

Features · Quick Start · API · Config · Security · Development


Turn any audio into text with speaker labels. No cloud. No limits. Just run:

uvx murmurai

MurmurAI wraps murmurai-core (our WhisperX fork) in a REST API with speaker diarization, word-level timestamps, and multiple export formats. Self-hosted alternative to AssemblyAI, Deepgram, and Rev.ai.

Features

  • Speaker Diarization - Identify who said what with pyannote
  • Word-Level Timestamps - Precise alignment for every word
  • Multiple Export Formats - SRT, WebVTT, TXT, JSON
  • Webhook Callbacks - Get notified when transcription completes
  • GPU Model Caching - Fast subsequent transcriptions
  • Background Processing - Non-blocking async jobs
  • Progress Tracking - Poll for real-time status

🔮 What's Next

We're a research lab. Stars tell us what the community wants — help us prioritize!

Star to unlock

🔒 250 ⭐ Desktop App   │   🔒 500 ⭐ MCP Server   │   🔒 750 ⭐ Native Apple Silicon (MLX)   │   🔒 1000 ⭐ Real-time Streaming

Quick Start

Prerequisites

  • NVIDIA GPU with 6GB+ VRAM (or CPU mode for testing)
  • CUDA 12.x drivers installed

Option A: One-Liner Install (Recommended)

curl -fsSL https://install.namastex.ai/get-murmurai.sh | bash

This installs Python 3.12 and uv, checks for CUDA, and sets up murmurai.

Option B: Direct Run (if dependencies met)

uvx murmurai

Option C: pip install

pip install murmurai
murmurai

Option D: Docker (GPU required)

# Clone and run with docker compose
git clone https://github.com/namastexlabs/murmurai.git
cd murmurai
docker compose up

Requires NVIDIA Container Toolkit. Set MURMURAI_API_KEY in environment for production.
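For production, the key can be set without editing the repo's compose file via an override. A minimal sketch (the service name `murmurai` is an assumption; match it to the one in the repo's docker-compose.yml):

```yaml
# docker-compose.override.yml — illustrative sketch.
# The service name below must match the repo's docker-compose.yml.
services:
  murmurai:
    environment:
      # Fail fast if no key is provided instead of falling back to the default.
      - MURMURAI_API_KEY=${MURMURAI_API_KEY:?set a non-default API key}
```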

Windows Install

Windows requires PyTorch with CUDA from PyTorch's index (PyPI only has CPU wheels for Windows).

# One command (auto-detects CUDA):
uv pip install murmurai --torch-backend=auto

# Or manually:
uv pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu128
uv pip install murmurai

If you see "PyTorch is CPU-only", reinstall with --torch-backend=auto or use the manual method above.

The API starts at http://localhost:8880. Swagger docs at /docs.

First Transcription

# Default API key is "namastex888" - works out of the box
curl -X POST http://localhost:8880/v1/transcript \
  -H "Authorization: namastex888" \
  -F "[email protected]"

# Check status (replace {id} with returned transcript ID)
curl http://localhost:8880/v1/transcript/{id} \
  -H "Authorization: namastex888"
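The same submit-then-poll flow can be sketched as a small Python client using only the standard library. The endpoint paths, the `Authorization` header, and the `audio` form field come from the curl examples above; the assumption that the submit response already contains `id` and `status` follows the response sample later in this README.

```python
# Sketch of a polling client for the MurmurAI API (stdlib only).
import json
import time
import urllib.request
import uuid

BASE = "http://localhost:8880"
API_KEY = "namastex888"  # default key; change for real deployments

def is_terminal(status: str) -> bool:
    # A job stops changing once it completes or errors.
    return status in ("completed", "error")

def encode_multipart(filename: str, data: bytes) -> tuple[bytes, str]:
    # Build a minimal multipart/form-data body with one "audio" file field.
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + data + f"\r\n--{boundary}--\r\n".encode()
    return body, f"multipart/form-data; boundary={boundary}"

def transcribe(path: str, poll_seconds: float = 2.0) -> dict:
    # Submit the audio file, then poll until the job finishes.
    with open(path, "rb") as f:
        body, content_type = encode_multipart(path, f.read())
    req = urllib.request.Request(
        f"{BASE}/v1/transcript", data=body,
        headers={"Authorization": API_KEY, "Content-Type": content_type},
    )
    job = json.load(urllib.request.urlopen(req))
    while not is_terminal(job["status"]):
        time.sleep(poll_seconds)
        req = urllib.request.Request(
            f"{BASE}/v1/transcript/{job['id']}",
            headers={"Authorization": API_KEY},
        )
        job = json.load(urllib.request.urlopen(req))
    return job

# Usage: result = transcribe("audio.mp3"); print(result["text"])
```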

API Reference

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /v1/transcript | Submit transcription job |
| GET | /v1/transcript/{id} | Get transcript status/result |
| GET | /v1/transcript/{id}/srt | Export as SRT subtitles |
| GET | /v1/transcript/{id}/vtt | Export as WebVTT |
| GET | /v1/transcript/{id}/txt | Export as plain text |
| GET | /v1/transcript/{id}/json | Export as JSON |
| DELETE | /v1/transcript/{id} | Delete transcript |
| GET | /health | Health check (no auth) |

Submit Transcription

File upload:

curl -X POST http://localhost:8880/v1/transcript \
  -H "Authorization: namastex888" \
  -F "[email protected]"

URL download:

curl -X POST http://localhost:8880/v1/transcript \
  -H "Authorization: namastex888" \
  -F "audio_url=https://example.com/audio.mp3"

With speaker diarization:

curl -X POST http://localhost:8880/v1/transcript \
  -H "Authorization: namastex888" \
  -F "[email protected]" \
  -F "speaker_labels=true" \
  -F "speakers_expected=2"

Response Format

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "text": "Hello world, this is a transcription.",
  "words": [
    {"text": "Hello", "start": 0, "end": 500, "confidence": 0.98, "speaker": "A"}
  ],
  "utterances": [
    {"speaker": "A", "text": "Hello world...", "start": 0, "end": 3000}
  ],
  "language_code": "en"
}
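The server already exports SRT via the /srt endpoint; as an illustration of the units above, here is a sketch that turns the `utterances` array into SRT cues, assuming `start` and `end` are milliseconds (as the sample suggests: "Hello" spans 0 to 500):

```python
# Convert MurmurAI utterances (millisecond timestamps) into SRT cues.
def ms_to_srt(ms: int) -> str:
    # SRT timestamps look like HH:MM:SS,mmm
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, milli = divmod(rem, 1_000)
    return f"{h:02}:{m:02}:{s:02},{milli:03}"

def utterances_to_srt(utterances: list[dict]) -> str:
    cues = []
    for i, u in enumerate(utterances, start=1):
        cues.append(
            f"{i}\n"
            f"{ms_to_srt(u['start'])} --> {ms_to_srt(u['end'])}\n"
            f"{u['speaker']}: {u['text']}\n"
        )
    return "\n".join(cues)

# Example with the response sample above:
# utterances_to_srt([{"speaker": "A", "text": "Hello world...",
#                     "start": 0, "end": 3000}])
```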

Status values: queued → processing → completed (or error)

Configuration

All settings are read from environment variables with the MURMURAI_ prefix. Everything has sensible defaults, so no .env file is needed for local use.

| Variable | Default | Description |
|----------|---------|-------------|
| MURMURAI_API_KEY | namastex888 | API authentication key |
| MURMURAI_HOST | 0.0.0.0 | Server bind address |
| MURMURAI_PORT | 8880 | Server port |
| MURMURAI_MODEL | large-v3-turbo | Whisper model |
| MURMURAI_DATA_DIR | ./data | SQLite database location |
| MURMURAI_HF_TOKEN | - | HuggingFace token (for diarization) |
| MURMURAI_DEVICE | 0 | GPU device index |
| MURMURAI_LOG_FORMAT | text | Logging format (text or json) |
| MURMURAI_LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR) |
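The override behavior is simple: an environment variable with the prefix wins, otherwise the default from the table applies. A purely illustrative sketch (the real loader lives in src/murmurai/config.py):

```python
# Illustrative resolution of MURMURAI_-prefixed settings:
# environment variables override table defaults.
import os

DEFAULTS = {
    "API_KEY": "namastex888",
    "HOST": "0.0.0.0",
    "PORT": "8880",
    "MODEL": "large-v3-turbo",
}

def setting(name: str) -> str:
    # e.g. setting("PORT") reads MURMURAI_PORT, falling back to "8880".
    return os.environ.get(f"MURMURAI_{name}", DEFAULTS[name])
```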

Speaker Diarization Setup

To enable speaker_labels=true:

  1. Accept license at pyannote/speaker-diarization
  2. Get token at huggingface.co/settings/tokens
  3. Add to config:
    echo "MURMURAI_HF_TOKEN=hf_xxx" >> ~/.config/murmurai/.env

Security

Default API Key Warning

MurmurAI ships with a default API key (namastex888) for zero-config local use. This key is publicly known.

For any network-exposed deployment, set a secure key:

# Generate a secure random key
export MURMURAI_API_KEY=$(openssl rand -hex 32)

# Or add to your .env file
echo "MURMURAI_API_KEY=$(openssl rand -hex 32)" >> .env

The server will display a security warning at startup if using the default key.

Network Exposure

  • Local-only (default): Safe to use default key for localhost testing
  • LAN/Docker: Change the API key before exposing to your network
  • Internet: Always use a strong API key + consider a reverse proxy with HTTPS

SSRF Protection

The API validates all audio_url parameters to prevent Server-Side Request Forgery:

  • Blocks internal IPs (127.0.0.1, 10.x.x.x, 192.168.x.x, etc.)
  • Blocks cloud metadata endpoints (169.254.169.254)
  • Only allows HTTP/HTTPS schemes
  • Resolves DNS and validates the resolved IP
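The checks above can be sketched in a few lines of pure Python; this is illustrative only, not the server's actual validation code:

```python
# Sketch of SSRF validation for audio_url: allow only http/https,
# resolve DNS, and reject any private, loopback, link-local, or
# reserved address (which covers 169.254.169.254 metadata endpoints).
import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_audio_url(url: str) -> bool:
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        # Resolve DNS and validate every resolved address.
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True
```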

Troubleshooting

CUDA not available:

# Check NVIDIA driver
nvidia-smi

# Check PyTorch CUDA
python -c "import torch; print(torch.cuda.is_available())"

Out of VRAM:

  • Use smaller model: MURMURAI_MODEL=medium
  • Reduce batch size: MURMURAI_BATCH_SIZE=8

Diarization fails:

  • Verify HF token: echo $MURMURAI_HF_TOKEN
  • Accept license at HuggingFace (link above)

Built On

This project uses murmurai-core - our maintained fork of WhisperX with modern dependency support (PyTorch 2.6+, Pyannote 4.x, Python 3.10-3.13).


Development

Setup

git clone https://github.com/namastexlabs/murmurai.git
cd murmurai
uv sync

Run Tests

uv run pytest tests/ -v

Code Quality

uv run ruff check .
uv run ruff format .
uv run mypy src/

Project Structure

murmurai/
├── src/murmurai/
│   ├── server.py          # FastAPI application
│   ├── transcriber.py     # Transcription pipeline
│   ├── model_manager.py   # GPU model caching
│   ├── database.py        # SQLite persistence
│   ├── config.py          # Settings management
│   ├── auth.py            # API authentication
│   ├── models.py          # Pydantic schemas
│   ├── deps.py            # Dependency checks
│   └── main.py            # CLI entry point
├── tests/                 # Test suite
├── get-murmurai.sh        # One-liner installer
└── pyproject.toml         # Project config

CI/CD

  • CI: Runs on every push (lint, typecheck, test)

Performance Notes

  • First request: ~60-90s (model loading)
  • Subsequent: ~same as audio duration
  • VRAM usage: ~5-6GB for large-v3-turbo

Made with ❤️ by Namastex Labs

Star us on GitHub
