Note: The name "datahead" is inspired by the band Radiohead - a playful nod to processing music data! 🎸
A FastAPI wrapper for a music embedding pipeline that lets you upload music files and find similar music using vector similarity search. Now with MCP (Model Context Protocol) support for LLM integration!
- 🎵 Music File Upload: Upload music files (MP3, WAV, FLAC, M4A, etc.) and automatically generate embeddings
- 🔍 Similarity Search: Find similar music using uploaded files or existing file IDs
- 🐳 Docker Support: Complete containerized setup with ChromaDB
- 📊 Vector Database: Uses ChromaDB for efficient similarity search
- 🎯 Audio Processing: Advanced audio segmentation and feature extraction
- 🔧 RESTful API: Clean REST API with automatic documentation
- 🤖 MCP Server: Model Context Protocol server for LLM integration
datahead/
├── src/ # Source code
│ ├── api/ # API components
│ │ ├── api.py # FastAPI application
│ │ ├── mcp_server.py # MCP server
│ │ └── mcp_config.json
│ ├── core/ # Core pipeline components
│ │ ├── audio_processor.py
│ │ ├── embedding_generator.py
│ │ ├── ingestion_pipeline.py
│ │ └── vector_store.py
│ ├── utils/ # Utilities and configuration
│ │ ├── config.py
│ │ └── start_api.py
│ ├── main.py # Main entry point
│ ├── requirements.txt
│ ├── Dockerfile
│ └── setup.py
├── scripts/ # Utility scripts
│ ├── main.py # Pipeline CLI
│ └── example_usage.py
├── tests/ # Test files
│ ├── test_pipeline.py
│ └── test_mcp.py
├── docker-compose.yml # Docker setup
├── start_docker.sh # Docker startup script
└── README.md # This file
- Docker and Docker Compose installed
- At least 4GB of available RAM
git clone <your-repo>
cd datahead
chmod +x start_docker.sh
./start_docker.sh
This will start:
- ChromaDB on port 8000
- Datahead on port 8080
- ChromaDB Web UI on port 3000
- API Documentation: http://localhost:8080/docs
- Health Check: http://localhost:8080/health
- ChromaDB Web UI: http://localhost:3000
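The containers can take a while to become ready after startup, so a script that talks to the API may want to wait for the health check first. The following is a minimal sketch that polls http://localhost:8080/health and only checks for an HTTP 200; the function names, retry count, and delay are illustrative assumptions, and the response body of /health is not inspected because its format is not documented here.

```python
import time
import urllib.error
import urllib.request


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET on the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as "not ready".
        return False


def wait_until_healthy(
    url: str = "http://localhost:8080/health",
    attempts: int = 30,   # assumption: arbitrary retry budget
    delay: float = 2.0,   # assumption: seconds between attempts
    probe=http_ok,
    sleep=time.sleep,
) -> bool:
    """Poll the health endpoint until it responds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling wait_until_healthy() after ./start_docker.sh returns True once the API answers, or False after roughly a minute with the defaults above.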
- Python 3.11+
- FFmpeg installed
- ChromaDB (can be run via Docker)
pip install -r src/requirements.txt
docker-compose up -d chromadb
# Run FastAPI server
python src/main.py --mode api
# Run MCP server
python src/main.py --mode mcp
# Run pipeline directly
python src/main.py --mode pipeline
# FastAPI server
python src/utils/start_api.py
# MCP server
python src/api/mcp_server.py
# Pipeline CLI
python scripts/main.py
Datahead also includes an MCP server that allows LLMs to use music similarity search as tools. This enables natural language interactions with your music database.
- upload_music_file: Upload and process music files
- search_similar_music: Find similar music using a query file
- search_by_file_id: Search similar music using a database file ID
- get_file_info: Get detailed information about a file
- list_all_files: List all files in the database
- delete_file: Delete a file from the database
python src/main.py --mode mcp
Use the provided src/api/mcp_config.json to configure your MCP client:
{
"mcpServers": {
"music-embedding": {
"command": "python",
"args": ["src/api/mcp_server.py"],
"env": {
"PYTHONPATH": "."
}
}
}
}
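The configuration above uses relative paths, which resolve only if the MCP client launches the server from the project root. Some clients may start the process from a different working directory, so one option is to generate the same configuration with absolute paths. The sketch below does that; build_mcp_config is a hypothetical helper, not part of the project.

```python
import json
from pathlib import Path


def build_mcp_config(project_root: str) -> dict:
    """Return the documented MCP config with paths made absolute."""
    root = Path(project_root).resolve()
    return {
        "mcpServers": {
            "music-embedding": {
                "command": "python",
                # Same script as in mcp_config.json, but as an absolute path.
                "args": [str(root / "src" / "api" / "mcp_server.py")],
                "env": {"PYTHONPATH": str(root)},
            }
        }
    }


if __name__ == "__main__":
    # Print the config for the current directory, ready to paste into a client.
    print(json.dumps(build_mcp_config("."), indent=2))
```

Running the script from the project root prints a configuration that should work regardless of the client's working directory, assuming python on the client's PATH is the interpreter with the dependencies installed.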
Example 1: Upload and Search
User: "I have a song called song.mp3, can you find similar music?"
LLM: I'll help you find similar music! Let me first upload your song and then search for similar pieces.
[Calls upload_music_file with song.mp3]
[Calls search_similar_music with the uploaded file]
Example 2: Get File Information
User: "What's the tempo of the song with ID song_segment_0?"
LLM: Let me get the detailed information about that song for you.
[Calls get_file_info with song_segment_0]
Example 3: Database Overview
User: "Show me all the music in the database"
LLM: I'll show you an overview of all the music files in the database.
[Calls list_all_files]
python tests/test_mcp.py
This will show you all available tools and example usage scenarios.
POST /upload
Content-Type: multipart/form-data
file: [music file]
Response:
{
"success": true,
"file_id": "song_segment_0",
"segments_created": 3,
"message": "Successfully processed song.mp3",
"metadata": {
"filename": "song.mp3",
"file_size": 5242880,
"duration": 180.5,
"segments": [
{
"segment_id": "song_segment_0",
"duration": 30.0,
"tempo": 120.5
}
]
}
}
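To call the upload endpoint without extra dependencies, the multipart body can be assembled by hand. The sketch below builds (but does not send) a POST request for /upload with the form field named file, as in the request shown above. The helper name build_upload_request and the default audio/mpeg content type are assumptions for illustration.

```python
import urllib.request
import uuid


def build_upload_request(
    base_url: str,
    filename: str,
    content: bytes,
    content_type: str = "audio/mpeg",  # assumption: adjust for non-MP3 files
) -> urllib.request.Request:
    """Build a multipart/form-data POST for the /upload endpoint."""
    boundary = uuid.uuid4().hex
    # Note: filename is inserted as-is, so it should not contain double quotes.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        f"{base_url}/upload",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

Passing the returned request to urllib.request.urlopen sends it; the JSON response can then be decoded with json.loads and checked for "success": true.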
POST /search?n_results=10
Content-Type: multipart/form-data
file: [query music file]
Response:
{
"success": true,
"query_file_id": "query.mp3",
"results": [
{
"id": "uuid-123",
"distance": 0.234,
"metadata": {
"filename": "similar_song.mp3",
"file_path": "/path/to/similar_song.mp3",
"segment_index": 0,
"segment_duration": 30.0,
"tempo": 118.2,
"num_beats": 60,
"file_duration": 180.5
}
}
],
"total_results": 10,
"message": "Found 10 similar music pieces"
}
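A client will often want only the closest matches from a search response. The following sketch filters and sorts the results array by distance. It assumes that a smaller distance means a more similar segment, which is the usual convention for vector stores but is not stated above; the max_distance threshold of 0.5 is an arbitrary illustrative value, and closest_matches is a hypothetical helper.

```python
import json


def closest_matches(payload: str, max_distance: float = 0.5) -> list:
    """Return (filename, distance) pairs sorted from most to least similar."""
    data = json.loads(payload)
    # Keep only results at or below the (assumed) distance threshold.
    hits = [r for r in data.get("results", []) if r["distance"] <= max_distance]
    hits.sort(key=lambda r: r["distance"])
    return [(r["metadata"]["filename"], r["distance"]) for r in hits]
```

Because results are returned per segment, the same filename may appear more than once when several segments of one file match the query.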
GET /search/{file_id}?n_results=10
GET /files/{file_id}
GET /files
DELETE /files/{file_id}
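The remaining endpoints take the file ID as a path segment, so IDs with special characters need to be URL-encoded. Here is a small sketch of URL helpers for these routes; the function names are hypothetical, and the base URL assumes the default Docker setup on port 8080.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8080"  # assumption: default port from the Docker setup


def search_by_id_url(file_id: str, n_results: int = 10, base_url: str = BASE_URL) -> str:
    """URL for GET /search/{file_id}?n_results=N."""
    query = urlencode({"n_results": n_results})
    return f"{base_url}/search/{quote(file_id, safe='')}?{query}"


def file_url(file_id: str, base_url: str = BASE_URL) -> str:
    """URL for GET or DELETE /files/{file_id}."""
    return f"{base_url}/files/{quote(file_id, safe='')}"


def list_files_url(base_url: str = BASE_URL) -> str:
    """URL for GET /files."""
    return f"{base_url}/files"
```

The same file_url result serves both GET (file details) and DELETE (removal); only the HTTP method differs.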
The application can be configured via environment variables:
| Variable | Default | Description |
|---|---|---|
| CHROMA_HOST | localhost | ChromaDB host |
| CHROMA_PORT | 8000 | ChromaDB port |
| VECTOR_DB_PATH | ./vector_db | Local vector database path |
| MUSIC_FILES_PATH | ./music_files | Music files directory |
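As an illustration of how these variables and their defaults fit together, the sketch below reads them into a small settings object. It is not the project's actual src/utils/config.py, whose contents are not shown here; the Settings class and load_settings function are assumptions made for the example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    chroma_host: str
    chroma_port: int
    vector_db_path: str
    music_files_path: str


def load_settings(env=None) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        chroma_host=env.get("CHROMA_HOST", "localhost"),
        chroma_port=int(env.get("CHROMA_PORT", "8000")),
        vector_db_path=env.get("VECTOR_DB_PATH", "./vector_db"),
        music_files_path=env.get("MUSIC_FILES_PATH", "./music_files"),
    )
```

For example, starting a process with CHROMA_PORT=9000 set in its environment would yield a settings object with chroma_port equal to 9000 and the other three fields at their defaults.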
docker-compose up -d
docker-compose logs -f music-api
docker-compose logs -f chromadb
docker-compose down
docker-compose up --build -d
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ FastAPI App │ │ ChromaDB │ │ Audio Files │
│ (Port 8080) │◄──►│ (Port 8000) │ │ │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
│ │ │
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Upload │ │ Vector │ │ Audio │
│ Endpoint │ │ Storage │ │ Processing │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
│ │ │
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Search │ │ Embedding │ │ Feature │
│ Endpoint │ │ Generation │ │ Extraction │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
│ │ │
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ MCP Server │ │ LLM Tools │ │ Natural │
│ (stdio) │ │ Interface │ │ Language │
└─────────────────┘ └─────────────────┘ └─────────────────┘
- MP3 (.mp3)
- WAV (.wav)
- FLAC (.flac)
- M4A (.m4a)
- AAC (.aac)
- OGG (.ogg)
- WMA (.wma)
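A client can reject unsupported files before uploading by checking the extension against the list above. This is a minimal sketch; is_supported is a hypothetical helper, and the server may apply its own, possibly different, validation.

```python
from pathlib import Path

# Extensions taken from the supported formats list above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".aac", ".ogg", ".wma"}


def is_supported(filename: str) -> bool:
    """Check the file extension case-insensitively against the supported set."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

The check looks only at the final extension, so it says nothing about whether the file contents are valid audio.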
- Port Already in Use
  # Check what's using the port
  lsof -i :8080
  # Kill the process or change the port in docker-compose.yml
- ChromaDB Connection Issues
  # Check if ChromaDB is running
  docker-compose ps
  # Restart ChromaDB
  docker-compose restart chromadb
- Audio Processing Errors
  - Ensure FFmpeg is installed in the Docker container
  - Check audio file format support
  - Verify file is not corrupted
- MCP Server Issues
  - Ensure MCP dependency is installed: pip install mcp
  - Check that the server is running: python src/main.py --mode mcp
  - Verify MCP client configuration
- Import Errors
  - Make sure you're running from the project root
  - Check that all __init__.py files are present
  - Verify Python path includes the src directory
View detailed logs:
# All services
docker-compose logs -f
# Specific service
docker-compose logs -f music-api
docker-compose logs -f chromadb
- Edit src/api/api.py
- Add new route handlers
- Update Pydantic models if needed
- Test with the interactive docs at /docs
- Edit src/api/mcp_server.py
- Add new tool definition in list_tools()
- Add corresponding handler method
- Update test script tests/test_mcp.py
- Edit src/core/audio_processor.py
- Update feature extraction in src/core/embedding_generator.py
- Test with sample audio files
- Update metadata structure in src/core/ingestion_pipeline.py
- Consider migration strategy for existing data
- Test with sample data
This project is licensed under the MIT License - see the LICENSE file for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request