Cloud Operations Manager/Lead (Multi‑Cloud & SRE)
Location: Mumbai / Hybrid
Experience: 10–15+ Years
Function: Cloud Platforms | Infrastructure | SRE | FinOps
Role Overview
We are seeking a Cloud Operations Manager/Lead to own the end‑to‑end operational excellence of our shared cloud services estate spanning Azure, AWS, GCP, and hybrid environments.
This is a senior leadership role responsible for building and running secure, reliable, highly available, and cost‑optimized cloud platforms at scale. The role combines deep hands‑on cloud expertise, strong operational leadership, and the ability to define modern cloud operating models aligned to business growth, digital transformation, and regulatory requirements.
You will own 24x7 cloud operations, drive SRE and automation maturity, partner closely with Architecture, Security, DevOps, and Application teams, and ensure cloud services consistently meet SLAs, reliability targets, and compliance standards.
Key Responsibilities
1. Cloud Operations Leadership
- Own and lead end‑to‑end cloud operations across Azure, AWS, GCP, and hybrid infrastructure.
- Ensure high availability, performance, scalability, and security of all shared cloud platforms.
- Establish and mature 24x7 operations, including monitoring, incident response, problem management, and change management.
- Define and enforce cloud operational SLAs, OLAs, and KPIs.
2. Reliability, Resilience & SRE
- Implement and scale Site Reliability Engineering (SRE) practices, including:
  - Error budgets
  - SLIs, SLOs, and SLAs
  - Blameless post‑mortems
- Drive observability standards using tools such as CloudWatch, Azure Monitor, Datadog, Grafana, Dynatrace, and App Insights.
- Continuously improve uptime, fault tolerance, failover, backup, and disaster recovery capabilities.
- Lead resilience testing, DR drills, and chaos engineering initiatives where applicable.
3. Automation, IaC & Platform Maturity
- Drive automation‑first operations using Infrastructure as Code (IaC) and policy‑as‑code.
- Standardize provisioning, configuration, and lifecycle management using Terraform, ARM, and CloudFormation.
- Partner with DevOps teams to ensure CI/CD pipelines are reliable, secure, and scalable.
- Reduce operational toil through self‑service platforms and automation.
4. Cloud Governance, Security & Compliance
- Define and enforce cloud governance frameworks, including:
  - Resource standards
  - Tagging strategies
  - Guardrails and policies
- Ensure compliance with ISO 27001, SOC 2, PCI‑DSS, and applicable regulatory requirements.
- Partner closely with Security, Risk, and Compliance teams to maintain a strong cloud security posture.
- Oversee cloud security tooling, including CSPM, CWPP, SIEM, and SOC integrations.
5. Cloud Financial Management (FinOps)
- Own and lead cloud cost management and optimization across all platforms.
- Implement FinOps best practices:
  - Budgeting and forecasting
  - Cost allocation and chargeback/showback
  - Reserved instances, savings plans, and consumption optimization
- Provide regular cost visibility and insights to senior leadership.
- Balance cost optimization with performance, scalability, and reliability.
6. Team Leadership & Vendor Management
- Build, lead, and mentor high‑performing teams of cloud operations engineers, SREs, and platform engineers.
- Establish clear career paths, skills development, and succession planning.
- Manage MSPs, cloud service providers, and strategic vendors.
- Negotiate and govern SLAs, contracts, and service quality.
7. Deployment, Lifecycle & Service Management
- Oversee:
  - Deployment pipelines
  - Patch and vulnerability management
  - Scaling and capacity planning
  - Backup and disaster recovery
- Ensure standardization and lifecycle governance of all cloud resources.
- Embed ITIL practices for Incident, Problem, Change, and Service Continuity.
8. Architecture & Transformation Partnership
- Partner with Solution Architecture, DevOps, Security, and Application teams to ensure seamless handoffs from build to run.
- Contribute to:
  - Cloud roadmap and strategy
  - Modernization and migration programs
  - Platform evolution and future‑state operating models
- Act as a trusted advisor to business and technology leadership.
Required Skills & Experience
Experience
- 10–15+ years of experience in cloud, infrastructure, or platform operations, with leadership responsibility.
- Proven experience running large‑scale, enterprise, 24x7 cloud operations.
- Hands‑on leadership across multi‑cloud and hybrid environments.
Technical Expertise
- Deep expertise in Azure, AWS, and/or GCP (minimum two platforms preferred).
- Strong experience with:
  - Containers & orchestration (Kubernetes)
  - Virtualization & networking
  - Infrastructure as Code (Terraform, ARM, CloudFormation)
  - CI/CD (Azure DevOps, GitHub Actions, Jenkins)
  - Observability & monitoring tools (Dynatrace, Splunk, App Insights, Grafana)
  - Cloud security tooling (CSPM, CWPP, SIEM)
Interested? Send your application to atharvakadam@orientindia.net along with your current CTC, expected CTC, notice period, reason for change, and availability for interview.