📧 [email protected] | 🌍 Bangalore, India | 💼 LinkedIn
📝 Summary • 🎯 Current Role • 🛠️ Skills • 💼 Experience • 🎓 Education • 🌟 Achievements • 📞 Contact
🚀 Results-oriented Lead SRE Engineer with 16+ years of experience specializing in DevOps Release Engineering, Kubernetes (SRE), and Data Engineering across cloud platforms including Azure and GCP.
💡 Expert in building scalable data product deployment workflows, managing production Kubernetes clusters, and leading cross-functional teams with a proven track record of delivering 100+ projects and conducting technical sessions for diverse audiences.
🎯 Currently driving data mesh initiatives at H&M Group, focusing on BigQuery-based data platforms and GitHub pipeline automation for enhanced team productivity.
🏢 Lead SRE Engineer (AIAD – Data Stream) | H&M Group | Feb 2024 - Present
📍 Location: Bangalore, India
🔧 Technologies: BigQuery • GCP IAM • GitHub Actions • PowerBI • ServiceNow • SSAS • Data Mesh • dbt Cloud
- 🚀 Data Product Deployment: Building and assisting data product teams with scalable deployment workflows in cloud environments
- 🔄 GitHub Pipeline Integration: Supporting teams in onboarding reusable GitHub pipelines to enhance development velocity and standardization
- 📊 Data Mesh Architecture: Implementing BigQuery as the warehouse platform for Data Mesh setup, enabling decentralized data ownership
- 🔐 GCP IAM Management: Set up GCP IAM for projects with custom roles based on a least-privilege access policy, and handled user onboarding
- 📈 Infrastructure Monitoring: Built PowerBI change-tracking reports for infrastructure monitoring and detailed ServiceNow dashboards for metrics reporting and incident management
- 🤖 Automation & Logging: Developed GitHub Actions-based reporting and logging for SSAS jobs to support environment monitoring
- 🎯 Successfully onboarded multiple data product teams to standardized deployment workflows
- 🔧 Implemented comprehensive IAM governance across GCP projects ensuring security compliance
- 📊 Established robust monitoring and incident management processes using ServiceNow and PowerBI
| ☁️ Cloud Platforms | 🐳 Container & Orchestration | 🔧 DevOps & CI/CD |
|---|---|---|
| GCP, Azure, Vertex AI | Kubernetes, Docker, Helm | GitHub Actions, Jenkins, Terraform |
| Azure Data Factory, AKS, GKE | Kubectl, Rancher, ArgoCD | Azure DevOps, Ansible, Git |
| 💻 Programming | 📊 Data & Analytics | 📈 Monitoring |
|---|---|---|
| Python, SQL | BigQuery, PowerBI, MongoDB, Tableau | Splunk, DataDog, ServiceNow |
| HiveQL, Google App Scripts | PowerBI, dbt | Solarwinds, Monitoring Tools |
| AI/ML Tools and Skills |
|---|
| MLflow, Kubeflow, Vertex AI, Azure AI Foundry, NLTK (basic) |
pandas • NumPy • Matplotlib • Seaborn • Django • sklearn • SciPy • Jupyter Notebook • requests
Postman • RStudio • pgAdmin • Linux (Ubuntu) • Alteryx • Excel PowerQuery • SonarQube • Snyk
🏢 Composable AI to Kubernetes Migration | Cognizant | Apr 2023 - Jan 2024
📍 Location: Bangalore, India (North America – Insurance Client)
🔧 Technologies: Azure Kubernetes Service (AKS) • Jenkins • TeamCity • SonarQube • Snyk • Docker • Microservices
- 🔄 ML Model Migration: Migrated 8 machine learning models from Composable Analytics environment and deployed as endpoints in Azure Kubernetes Service
- 🏗️ Microservice Architecture: Adapted the code to a microservice architecture and prepared AKS manifests along with build and deployment pipelines
- 🤝 Stakeholder Management: Liaised with key business stakeholders and worked with various technical teams to accomplish MLOps deployment
- 🔒 Security Implementation: Addressed image vulnerabilities and packaged distroless images for enhanced security
- 🔧 CI/CD Pipeline Management: Built on the organization's existing Jenkins and TeamCity pipelines
- 🛡️ Code Quality Assurance: Performed code scanning using SonarQube and Snyk for vulnerability assessment
- 🎯 Successfully migrated 8 ML models to production-ready Kubernetes endpoints
- 🔐 Implemented security best practices with distroless container images
- 📈 Established robust CI/CD pipelines for MLOps deployment
🏢 MLOps Engineer | Cognizant | Nov 2022 - Mar 2023
📍 Location: Bangalore, India (Internal Accelerator Program)
🔧 Technologies: Azure Databricks • Azure DevOps • MLflow • Machine Learning Pipelines
- 🤖 MLOps Pipeline Automation: Worked as an MLOps engineer in Azure Databricks using Azure DevOps to automate machine learning pipeline builds
- 🚀 Production Deployment: Automated machine learning pipeline deployment from development to production environments
- 📊 Model Management: Used MLflow for model tracking, registry management, and endpoint serving
- 🔧 Platform Integration: Gained expertise in Databricks functionality and its integration with Azure DevOps for pipeline management
- 🎯 Successfully automated ML pipeline builds, significantly reducing deployment time
- 📈 Implemented a model tracking and registry system using MLflow
- 🔄 Established end-to-end MLOps workflow from development to production
🏢 Build and Release Engineer Lead | TCS | Jan 2021 - Nov 2022
📍 Location: Bangalore, India (US - Professional Services)
🔧 Technologies: Azure DevOps • AKS • GKE • Rancher • Kubernetes • Helm • Monitoring Tools
- 🖥️ Infrastructure Management: Supervised 6 in-house agent pool servers to run Azure build jobs, managing dependencies, upgrades, and patching
- ☸️ Kubernetes Cluster Management: Maintained 8 AKS and GKE clusters, troubleshooting and fixing build and release pipeline issues
- 🔧 Rancher Administration: Maintained the Rancher cluster that served as the central management point for all Kubernetes clusters in the environment
- 📊 Application Deployment: Installed and maintained various monitoring, telemetry, and stateful applications using Helm charts
- 🛠️ Pipeline Optimization: Managed upgrade processes and Kubernetes cluster maintenance for optimal performance
- 🎯 Successfully managed 8 production Kubernetes clusters with 99.9% uptime
- 🔧 Implemented centralized cluster management using Rancher reducing operational overhead
- 📈 Established comprehensive monitoring and telemetry solutions across all environments
🏢 Data Engineer | TCS | Sep 2019 - Dec 2020
📍 Location: Bangalore, India (PwC US - Professional Services)
🔧 Technologies: TIBCO Data Virtualization • HiveQL • Apache Zeppelin • Snaplogic • MongoDB • Tableau • PowerBI
- 👥 Scrum Team Leadership: Led a Scrum team adopting DevOps practices, managing and supporting the Data Platform/Lake for PwC US
- 🔧 Application Enhancement: Worked on bug fixes and enhancements to applications using TDV Studio, HiveQL, Apache Zeppelin, and Snaplogic
- 🐛 Issue Resolution: Analyzed and troubleshot issues in TIBCO Data Virtualization, Snaplogic, Apache Hive, and MongoDB
- 📊 Analytics & Visualization: Used analytics tools like Tableau, Alteryx, Excel PowerQuery, and PowerBI for data troubleshooting and dashboard development
- 🚀 Release Management: Planned, scheduled, and implemented releases on a biweekly cycle
- 🎯 Successfully led DevOps adoption in Scrum team improving delivery efficiency
- 📈 Developed and published multiple Tableau dashboards meeting client requirements
- 🔄 Established biweekly release cycle ensuring consistent delivery cadence
🏢 Technical Lead Manager | TCS | Nov 2015 - Nov 2016 & Nov 2017 - Sep 2019
📍 Location: London, UK (PwC UK - Professional Services)
🔧 Technologies: Lotus Notes • Application Support • Project Management
- 👥 Team Leadership: Led a team of 10-12 members as part of Lotus Notes Support division
- 🤝 Client Coordination: Coordinated with clients on more than 80 mini maintenance projects across all technologies in the UK Application Support landscape
- 📋 Project Management: Managed multiple concurrent maintenance projects ensuring timely delivery and client satisfaction
- 🔧 Technical Support: Provided technical guidance and support across diverse technology stack
- 🎯 Successfully managed 80+ maintenance projects with high client satisfaction
- 👥 Led and mentored team of 10-12 technical professionals
- 🌍 Gained international experience working directly with UK-based clients
🏢 Team Lead | TCS | Nov 2014 - Nov 2015 & Nov 2016 - Nov 2017
📍 Location: Bangalore, India (PwC UK - Professional Services)
🔧 Technologies: Lotus Notes • Python Django • Application Support
- 👥 Team Mentoring: Mentored Lotus Notes team of 10 members, providing technical guidance and career development
- 🖥️ Application Maintenance: Supported and maintained over 600 applications including 10 Python Django applications
- 🔧 Technical Leadership: Provided technical expertise and problem-solving for complex application issues
- 📊 Performance Management: Managed team performance and ensured service level agreement compliance
- 🎯 Successfully maintained 600+ applications with minimal downtime
- 👥 Mentored 10 team members contributing to their professional growth
- 🔧 Managed diverse technology stack including Lotus Notes and Python Django applications
🏢 Technical Lead | TCS | Nov 2013 - Nov 2014
📍 Location: London, UK (PwC UK - Professional Services)
🔧 Technologies: Project Management • Incident Management • Stakeholder Management
- 🤝 Client Coordination: Coordinated with clients on support and maintenance projects ensuring clear communication and expectations
- 🚨 Incident Management: Managed major incidents, providing timely resolution and stakeholder communication
- 📊 Stakeholder Presentations: Presented project details to various stakeholders and obtained necessary approvals for project execution
- 🔧 Technical Leadership: Provided technical guidance for support and maintenance activities
- 🎯 Successfully managed major incidents with minimal business impact
- 🤝 Established strong client relationships through effective communication and project delivery
- 📈 Obtained stakeholder approvals for multiple critical projects
🏢 Application Support Specialist | TCS | Apr 2011 - Nov 2012
📍 Location: Bangalore, India (PwC UK - Professional Services)
🔧 Technologies: Lotus Notes • Application Support
- 🖥️ Application Support: Worked extensively on Lotus Notes, providing support to 400+ applications used across the organization
- 🔧 Issue Resolution: Troubleshot and resolved application issues ensuring minimal business disruption
- 📊 System Maintenance: Performed regular maintenance activities to ensure optimal application performance
- 📋 Documentation: Maintained comprehensive documentation for supported applications and processes
- 🎯 Successfully supported 400+ Lotus Notes applications with high availability
- 🔧 Developed expertise in Lotus Notes administration and troubleshooting
- 📈 Contributed to improved application stability and user satisfaction
🏢 Infrastructure Production Support | TCS | Sep 2008 - Mar 2011
📍 Location: Bangalore, India (Ameriprise Financial - US Banking)
🔧 Technologies: IBM Domino • Messaging Servers • Collaboration Platforms
- 🖥️ Production Support: Played key role in production support and maintained end-to-end IBM Domino Messaging Servers and collaboration platforms
- 🔧 Platform Management: Managed 6 collaboration tools ensuring high availability and performance
- 🛠️ System Administration: Performed system administration tasks including monitoring, maintenance, and troubleshooting
- 📊 Performance Optimization: Optimized messaging servers and collaboration platforms for enhanced performance
- 🎯 Maintained high availability of critical messaging and collaboration infrastructure
- 🔧 Gained foundational expertise in IBM Domino and messaging technologies
- 📈 Started career with strong foundation in production support and system administration
🎓 Bachelor of Engineering (B.E.) in Mechatronics - Thiagarajar College of Engineering, Madurai (2004-2008)
📊 CGPA: 8.55/10
🌩️ HashiCorp Terraform Associate - HashiCorp
☁️ Google Cloud Digital Leader - Google Cloud Platform
⚙️ KCNA: Kubernetes and Cloud Native Associate - The Linux Foundation (Expired June 2025)
🐳 Certified Kubernetes Administrator (Edureka)
🧠 MLOps | Machine Learning Operations Specialization - Duke University (Coursera)
📊 IBM Data Science - IBM (Coursera)
💾 Meta Database Engineer - Meta (Coursera)
🟢 Lean Six Sigma Green Belt - TCS (Process Improvement)
🔧 ITIL V3 Foundation - ITIL
- 🚀 100+ Projects Delivered: Delivered more than 100 small projects, alongside ongoing maintenance work, across various technologies and platforms
- 🎤 10+ Technical Sessions: Conducted more than 10 technical sessions and tech talks for audiences of over 100 people on topics including Python, Cloud Computing, Tableau, Apache Zeppelin, Snaplogic, and Azure DevOps
- 👥 Team Leadership Excellence: Led teams of up to 12 members, managing scrum teams with a strong DevOps culture while following strict ITIL processes for enterprise-grade delivery
- 🏆 Industry Recognition: Recognized as one of the early Contextual Masters in TCS for understanding both TCS and client processes and streamlining various work items for improved efficiency
- 🟢 Process Improvement Impact: Achieved Green Belt certification in Process Improvement and Cost Savings, contributing to organizational efficiency and cost optimization
- ☸️ Kubernetes Expertise: Successfully managed 8 production AKS and GKE clusters with 99.9% uptime while supervising 6 in-house agent pool servers
- 🔄 ML Model Migration: Migrated 8 machine learning models from legacy Composable Analytics environment to modern Azure Kubernetes Service endpoints
- 📊 Large-Scale Application Support: Supported and maintained over 600 applications including 10 Python Django applications across multiple technology stacks
- 🌍 International Experience: Gained valuable international experience working directly with UK-based clients on 80+ maintenance projects
- 🎯 Multi-Platform Expertise: Demonstrated expertise across diverse platforms including Kubernetes (AKS, GKE), Rancher, Azure DevOps, and various cloud technologies
- 🏗️ Architecture & Design: Expert in designing scalable cloud-native solutions, microservice architectures, and data mesh implementations across Azure and GCP platforms
- ☸️ Kubernetes & Container Orchestration: Advanced expertise in managing production AKS and GKE clusters, Rancher administration, and container deployment strategies with 99.9% uptime
- 🔄 DevOps & Release Engineering: Proficient in CI/CD pipeline automation, GitHub Actions, Azure DevOps, Jenkins, and TeamCity with experience delivering 100+ projects
- 📊 Data Engineering & Analytics: Skilled in BigQuery, data platform management, ETL processes, and analytics tools including Tableau, PowerBI, and Apache Zeppelin
- 🤖 MLOps & Machine Learning: Experienced in ML model deployment, Azure Databricks, MLflow, and migrating ML models to production Kubernetes environments
- 🛡️ Security & Compliance: Expert in GCP IAM management, least-privilege access policies, vulnerability assessment using SonarQube and Snyk, and ITIL process implementation
- 👥 Team Leadership & Management: Proven track record leading teams of up to 12 members, mentoring professionals, and managing cross-functional stakeholder relationships
- 🔧 Production Support & Troubleshooting: Extensive experience in production system maintenance, incident management, and supporting 600+ applications across diverse technology stacks
📧 Email: [email protected]
💼 LinkedIn: linkedin.com/in/karthick-elangovan-6a440715
📍 Current Location: Bangalore, India
🗣️ Languages: English • Tamil
💼 Current Role: Lead SRE Engineer (AIAD – Data Stream) at H&M Group
🎯 Specialization: DevOps Release Engineering, Kubernetes (SRE), Data Engineering, Platform Engineering
☁️ Cloud Expertise: Azure • GCP • 16+ Years Experience



