adhit-r/RagaSense-Data

RagaSense-Data

A comprehensive, unified dataset for Indian Classical Music research, combining data from multiple sources including Ramanarunachalam, Saraga, and Carnatic Varnam datasets.

🎵 Overview

RagaSense-Data is the largest unified dataset for Indian Classical Music, containing:

  • 1,340+ unique ragas from both Carnatic and Hindustani traditions
  • 100,000+ songs with metadata and cross-tradition mappings
  • Audio features extracted for machine learning applications
  • Cross-tradition mappings validated by musicological experts

📊 Dataset Statistics

  • Ramanarunachalam: 868 ragas, 105,339 songs
  • Saraga 1.5 Carnatic: 1,982 audio files, 498 metadata files
  • Saraga 1.5 Hindustani: Processing in progress
  • Saraga Melody Synth: 339 audio files
  • Carnatic Varnam: Processing in progress
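File counts like the ones above can be checked against a local copy by tallying files per subdirectory and extension. The sketch below is only an illustration: the function name `count_files_by_extension` is hypothetical, and the README does not say which extensions count as audio or metadata, so the code reports raw extensions and leaves that grouping to the reader.

```python
from collections import Counter
from pathlib import Path


def count_files_by_extension(base_dir):
    """Return {subdirectory name: {extension: count}} for base_dir.

    Hypothetical helper, not part of the repository's scripts.
    """
    base = Path(base_dir)
    summary = {}
    for sub in sorted(p for p in base.iterdir() if p.is_dir()):
        # Count every regular file below this subdirectory, keyed by its
        # lower-cased suffix; files without a suffix are grouped as "(none)".
        counts = Counter(
            f.suffix.lower() or "(none)" for f in sub.rglob("*") if f.is_file()
        )
        summary[sub.name] = dict(counts)
    return summary
```

Calling `count_files_by_extension("data/raw")` from the repository root would then print one tally per source folder, assuming the raw sources are stored in separate subdirectories there.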

🚀 Quick Start

Data Access

# Explore the dataset
python3 scripts/exploration/explore_ragasense_data.py

# Web interface
python3 scripts/exploration/web_explorer.py

ML Training

# Extract audio features
python3 scripts/data_processing/extract_audio_features.py

# Train raga detection model
python3 ml_models/training/gpu_optimized_trainer.py
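
The README does not say which features `extract_audio_features.py` computes. As an illustration only, the sketch below builds a tonic-relative pitch-class histogram, a feature often used in raga recognition work. It assumes a pitch track in Hz and a known tonic are already available; the function name and parameters are hypothetical and are not taken from the repository.

```python
import math


def pitch_class_histogram(f0_hz, tonic_hz, bins=12):
    """Fold pitches into one octave relative to the tonic and normalise.

    f0_hz    : iterable of pitch estimates in Hz (values <= 0 are skipped)
    tonic_hz : tonic frequency in Hz (assumed known)
    bins     : number of equal divisions of the octave
    """
    hist = [0.0] * bins
    bin_width = 1200.0 / bins  # width of one bin in cents
    for f in f0_hz:
        if f <= 0:  # unvoiced frames are commonly marked with 0 Hz
            continue
        cents = 1200.0 * math.log2(f / tonic_hz)
        # Round to the nearest bin, then wrap into a single octave.
        idx = int(round(cents / bin_width)) % bins
        hist[idx] += 1.0
    total = sum(hist)
    # Return relative frequencies; an empty track yields all zeros.
    return [h / total for h in hist] if total else hist
```

With a tonic of 220 Hz, pitches at 110, 220, and 440 Hz all land in bin 0, and 330 Hz (a fifth above the tonic) lands in bin 7.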

Validation

# Run full validation (defaults: CPU only, Weights & Biases logging off)
python3 tools/validation/data_validator.py

# Validate a specific tradition
python3 tools/validation/data_validator.py --tradition carnatic

# Validate mappings only
python3 tools/validation/data_validator.py --mappings

# Optional: set the base path via RAGASENSE_BASE_PATH and enable GPU and W&B logging
RAGASENSE_BASE_PATH=$PWD python3 tools/validation/data_validator.py --gpu --wandb
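
The commands above combine four flags with one environment variable. The sketch below shows one way such a command line could be parsed with `argparse`. It is a reconstruction from the commands shown here, not the code in `data_validator.py`: the function name, the `hindustani` choice, and the fallback to the current directory when `RAGASENSE_BASE_PATH` is unset are all assumptions.

```python
import argparse
import os
from pathlib import Path


def parse_validator_args(argv=None, environ=None):
    """Parse the flags shown above; defaults mirror the plain CPU run."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="Validate RagaSense-Data (sketch)")
    # "carnatic" appears in the README; "hindustani" is an assumed second choice.
    parser.add_argument("--tradition", choices=["carnatic", "hindustani"],
                        help="validate a single tradition")
    parser.add_argument("--mappings", action="store_true",
                        help="validate cross-tradition mappings only")
    parser.add_argument("--gpu", action="store_true", help="opt in to GPU use")
    parser.add_argument("--wandb", action="store_true",
                        help="opt in to W&B logging")
    args = parser.parse_args(argv)
    # Assumption: fall back to the current directory when the variable is unset.
    args.base_path = Path(environ.get("RAGASENSE_BASE_PATH", os.getcwd()))
    return args
```

Keeping GPU use and W&B logging as opt-in `store_true` flags means a bare invocation runs on CPU with logging off, which matches the first command above.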

📁 Project Structure

RagaSense-Data/
├── data/                    # Main dataset
│   ├── raw/                # Original data sources
│   ├── processed/          # Cleaned and processed data
│   ├── ml_ready/          # ML training datasets
│   └── exports/           # Community export formats
├── scripts/               # Processing and analysis scripts
│   ├── data_processing/   # Data processing pipelines
│   ├── analysis/          # Data analysis tools
│   ├── exploration/       # Data exploration tools
│   ├── integration/       # Dataset integration
│   └── utilities/         # Utility scripts
├── ml_models/            # Machine learning models
├── tools/                # Development tools
├── docs/                 # Documentation
├── schemas/              # Database schemas
└── website/              # Web interface
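
Before running the scripts, it can help to confirm that a checkout matches the layout above. The following is a minimal sketch: the directory list is copied from the tree in this README, while the names `EXPECTED_DIRS` and `missing_directories` are hypothetical and do not refer to anything in the repository.

```python
from pathlib import Path

# Directories taken from the project structure listed in this README.
EXPECTED_DIRS = [
    "data/raw", "data/processed", "data/ml_ready", "data/exports",
    "scripts/data_processing", "scripts/analysis", "scripts/exploration",
    "scripts/integration", "scripts/utilities",
    "ml_models", "tools", "docs", "schemas", "website",
]


def missing_directories(repo_root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    missing = missing_directories(".")
    print("Layout OK" if not missing else f"Missing: {missing}")
```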

🌐 Web Interface

Visit our live website: RagaSense-Data

📚 Documentation

🤝 Contributing

We welcome contributions! Please see our Contribution Guide for details.

📄 License

This dataset is released under [License Type] for research and educational purposes.

📞 Contact

For questions or collaboration, please contact [Contact Information].

🙏 Acknowledgments

  • Ramanarunachalam Music Repository
  • Saraga Dataset Team
  • Carnatic Varnam Dataset Contributors
  • Musicological experts who validated cross-tradition mappings