Job Title: DevOps Operations Head - Kubernetes & Database Environments
Location: Gurgaon
Employment Type: Full-Time
Reports To: Director DevOps
Job Summary
We are seeking a highly skilled and experienced DevOps Operations Head to oversee the 24x7 operations of our Kubernetes and database environments across 14 geographic regions. The ideal candidate will have a strong background in managing large-scale, distributed systems and ensuring their high availability, scalability, and performance. The candidate will lead a global team of DevOps engineers, database administrators, and SREs responsible for seamless operations, incident management, and continuous improvement of our infrastructure.
- DevOps Strategy & Leadership
- Contribute to defining, shaping and executing the organization's DevOps vision, strategy, and best practices.
- Lead, mentor, and grow a team of DevOps engineers, DBAs and SREs.
- Collaborate with engineering, IT, developers, and security teams to ensure DevOps practices align with business goals.
- DevOps Operational Management
- Oversee 24x7 DevOps operations for Kubernetes clusters, databases, and tooling environments across 14 geographic regions.
- Ensure high availability, scalability, and performance of production systems.
- Implement and maintain monitoring, alerting, and incident response processes to minimize downtime.
- Manage on-call rotations and ensure timely resolution of incidents.
- Infrastructure Management and Optimization
- Manage the on-premises DevOps infrastructure, ensuring scalability, reliability, security, and cost optimization.
- Ensure high availability, disaster recovery, and failover strategies are in place and working at all times, and conduct regular disaster recovery drills to verify failover readiness.
- Implement automation to streamline deployment, scaling, and management processes.
- Collaborate with development teams to ensure infrastructure meets application requirements.
- CI/CD & Automation
- Develop, implement, and maintain CI/CD pipelines for fast and reliable software releases.
- Drive automation of infrastructure provisioning, configuration management, and deployment processes.
- Improve software delivery speed and reliability by enforcing DevOps best practices.
- Security & Compliance
- Ensure DevOps processes follow security best practices, including vulnerability scanning, compliance checks, and incident response.
- Implement and manage access controls, encryption, and vulnerability management processes.
- Collaborate with security teams to implement DevSecOps strategies.
- Maintain compliance with industry standards such as ISO 27001, SOC2, or HIPAA.
- Vendor & Tool Management
- Evaluate and manage third-party tools and services for Kubernetes and database operations.
- Maintain relationships with cloud providers and other vendors.
- Monitoring, Observability & Performance
- Implement and optimize monitoring, logging, and alerting solutions (e.g., Prometheus, Grafana, ELK stack, Datadog).
- Ensure system reliability, uptime, and performance through proactive observability practices.
- Establish and maintain SLAs, SLOs, and error budgets to measure service performance.
- Collaboration & Stakeholder Management
- Work closely with software development, IT operations, and security teams to streamline development and deployment workflows.
- Engage with business stakeholders to align DevOps practices with company objectives.
- Provide executive-level reports on DevOps performance, key metrics, and improvement initiatives.
- Reporting & Metrics
- Provide regular reports on system performance, uptime, and incident trends to senior leadership.
- Define and track key performance indicators (KPIs) for operational excellence.
Key Qualifications
- Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent experience).
- 10+ years of experience in DevOps, SRE, or infrastructure management.
- 3+ years of experience managing Kubernetes clusters in production environments.
- 3+ years of experience managing distributed database systems (e.g., PostgreSQL, MySQL, MongoDB, Cassandra).
- Proficiency in infrastructure-as-code tools (e.g., Terraform, Ansible, Helm).
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog).
- Strong leadership and team management skills.
- Excellent problem-solving and communication skills.
Preferred Skills
- Master’s degree in a related field.
- Certifications in Kubernetes (e.g., CKA, CKAD) or cloud platforms (e.g., AWS Certified DevOps Engineer).
- Experience managing multi-region, multi-cloud environments.
- Knowledge of database performance tuning and optimization.
- Familiarity with CI/CD pipelines and GitOps practices.
Key Competencies
- Strategic thinking and ability to align operational goals with business objectives.
- Strong organizational and time management skills.
- Ability to work under pressure and manage multiple priorities.
- Collaborative mindset with a focus on building strong cross-functional relationships.