Job ID: Dev-Eng-Pun-1279
Location: Pune
We are seeking a DevOps / Cloud Support Engineer with strong L2/L3 production support experience to manage and troubleshoot cloud-native infrastructure running on AWS and Kubernetes. This role is ideal for someone who thrives on incident response, root cause analysis, and keeping distributed systems highly available.
## Key Responsibilities
- Provide L2/L3 support for production Kubernetes clusters and AWS infrastructure
- Troubleshoot complex issues across networking, compute, storage, IAM, and containerized workloads
- Lead incident response, perform root cause analysis (RCA), and document post-mortems
- Monitor system health, respond to alerts, and ensure SLA/SLO compliance
- Manage and maintain Infrastructure as Code (Terraform)
- Develop automation and operational scripts using Bash, Python, or Go
- Support CI/CD pipelines and deployment workflows
- Collaborate with engineering teams to resolve systemic issues and improve reliability
- Participate in the on-call rotation as needed
## Required Qualifications
- Hands-on experience supporting production Kubernetes environments
- Strong knowledge of AWS services (EKS, EC2, IAM, VPC, S3, RDS, CloudWatch, etc.)
- Experience with Terraform for infrastructure provisioning and maintenance
- Proficiency in Bash scripting and basic programming in Python and/or Go
- Strong Linux systems administration and networking fundamentals
- Proven troubleshooting experience in distributed/cloud-native environments
## Preferred Qualifications
- Experience in 24/7 production support environments
- Familiarity with monitoring tools (Prometheus, Grafana, ELK, Datadog, etc.)
- Knowledge of Docker and containerization best practices
- Understanding of cloud security and access control
- Some experience with Python
If you are passionate about solving complex infrastructure problems, improving system reliability, and supporting mission-critical cloud platforms, we’d love to hear from you.