Machine Learning • Big Data • Scala • ETL

I am learning to build reliable data/ML pipelines end to end, from ingestion and transformation to training, serving, and monitoring.
- 🔭 Current focus: Spark (Scala/Python), Airflow, Kafka, MLflow, feature stores
- 🌱 Learning: scalable model serving, data contracts, lakehouse patterns
- 🤝 Open-source: contributing docs, tests, and small features to data + ML tools
- Languages: Python, Scala, SQL
- Data/Compute: Apache Spark, Kafka, Airflow, Delta Lake, Hadoop
- ML: scikit-learn, PyTorch, MLflow
- Infra: Docker, GitHub Actions, AWS/GCP

📫 Reach me: https://www.linkedin.com/in/shubhangipanda/