Job Summary:
We are seeking a highly skilled and motivated DevOps Engineer to join our team at Rakuten India. The ideal candidate will have expertise in Kubernetes and cloud platforms, along with a strong background in automating, managing, and scaling infrastructure and deployment pipelines. As a DevOps Engineer, you will play a crucial role in designing, implementing, and maintaining our DevOps processes to ensure the reliability, scalability, and security of our cloud-based applications and services. You will also help apply AI/ML capabilities to improve operational intelligence and automation within our DevOps practices.
Responsibilities:
- Design, implement, and maintain highly available, scalable, and secure cloud infrastructure using Kubernetes and other cloud-native technologies.
- Develop and maintain CI/CD pipelines for automating the deployment, testing, and monitoring of applications and microservices.
- Collaborate with software development teams to optimize application performance, troubleshoot issues, and ensure seamless integration of new features.
- Implement best practices for infrastructure as code (IaC) using tools such as Terraform, Ansible, or similar technologies.
- Contribute to the implementation and optimization of AIOps solutions for predictive analytics, anomaly detection, and intelligent incident management.
- Implement security best practices and compliance standards across the infrastructure and deployment pipelines.
- Research and experiment with AI/ML techniques to automate operational tasks, optimize resource allocation, and improve system resilience.
- Provide technical guidance and mentorship to junior members of the DevOps team.
- Work with data scientists or ML engineers to integrate AI models into operational workflows for automated insights and decision-making.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field. A Master's degree is a plus.
- Minimum of 3 years of experience in DevOps or related roles, with a focus on cloud infrastructure and Kubernetes.
- Proficiency in cloud platforms such as AWS, GCP, or Azure. Certification is a plus.
- Strong experience with containerization technologies, especially Docker and Kubernetes.
- Solid understanding of CI/CD principles and experience with Jenkins, GitLab CI, or similar tools.
- Hands-on experience with infrastructure as code (IaC) using Terraform, Ansible, or similar tools.
- Familiarity with monitoring and logging tools such as Prometheus, Grafana, ELK Stack, or similar technologies.
- Exposure to or interest in AIOps platforms or tools for log analytics and anomaly detection (e.g., Splunk, Dynatrace, Datadog's AI features).
- Proficiency in Python is highly desirable, especially for data manipulation and integration with AI/ML tools.
- Strong problem-solving skills and ability to troubleshoot complex issues in a distributed environment.
- Excellent communication skills and ability to collaborate effectively with cross-functional teams.