DevOps Manager

Rakuten India • Full-time • Bengaluru, IN • 3w ago

Engineering Manager, DevOps

Job Summary:

We are looking for a passionate and experienced Engineering Manager to lead our engineering team in the Data Platform Department. You will be responsible for building and managing a high-performing engineering team, ensuring efficient deployment, operation, and scaling of our Conversational AI platform.

Responsibilities:

Technical Leadership:

Provide strong technical leadership to the engineering/DevOps team, ensuring the highest standards of excellence in development, infrastructure management, automation, and deployment processes.

Infrastructure:

Kubernetes
CentOS, Ubuntu
Hypervisor/bare metal
Load balancers, proxies, API gateways

Programming Languages: Python, Golang, Java

Team Leadership:

Lead, mentor, and inspire a team of engineers, fostering a collaborative and innovative work environment.
Set clear expectations, provide regular feedback, and support the professional development of team members.

Infrastructure Management:

Oversee the design and maintenance of scalable and resilient infrastructure to support Conversational AI systems.
Ensure high availability, reliability, and performance of the infrastructure.

Automation and Tooling:

Drive the automation of manual processes, emphasizing the use of tools to enhance efficiency in development and deployment pipelines. Evaluate, select, and implement cutting-edge DevOps tools to improve workflow automation.

Continuous Integration/Continuous Deployment (CI/CD):

Implement and manage CI/CD pipelines, working closely with development teams to optimize build and deployment processes. Foster a culture of continuous improvement, identifying opportunities to enhance the CI/CD pipeline.

Collaboration:

Collaborate effectively with cross-functional teams, including software engineers, QA, and product managers, to integrate DevOps practices throughout the development lifecycle.

Cloud and On-Prem Experience:

Possess hands-on experience with both GCP and on-premises environments.
Ensure seamless integration and collaboration between cloud and on-premises infrastructure.

SRE (Site Reliability Engineering):

Apply SRE principles to enhance system reliability, performance, and availability. Collaborate with the team to implement best practices for monitoring, incident response, and reliability engineering.
Establish and maintain robust monitoring systems to proactively identify and address issues. Develop and implement incident response plans, participating in troubleshooting and resolution efforts.
Proactively troubleshoot and resolve production issues, minimizing downtime and ensuring platform stability.

Metrics:

Track and analyze key DevOps metrics, identifying areas for improvement and optimizing performance.

Minimum Qualifications:

5+ years of hands-on experience with Linux.
5+ years of programming experience with at least two of the following languages: Java, Scala, Python, Bash.
3+ years of working experience with Kubernetes.
4+ years of experience managing or leading a team.
Hands-on experience with DevOps tools and technologies such as Jenkins, Git, and Chef.

Qualifications:

Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
Minimum 12 years of experience in engineering, with at least 3 years in a leadership role.
Proven experience building and managing high-performing DevOps teams.
Expertise in Kubernetes is a must.
Strong understanding of DevOps principles and practices, including CI/CD, infrastructure as code, containerization, and monitoring.
Experience with cloud platforms (GCP) and containerization technologies (e.g., Docker, Kubernetes).
Excellent communication, collaboration, and interpersonal skills.
Additional experience in SRE and high availability architectures is a plus.
Ability to motivate and inspire team members to achieve ambitious goals.
Passion for innovation and continuous improvement.
Familiarity with Conversational AI technologies is a plus.