We are seeking a highly skilled Lead DevOps Engineer for one of our clients, a B2B SaaS company serving customers across the globe. The ideal candidate will architect and maintain a highly available global infrastructure that supports high-QPS systems with 99.99% uptime. The role requires expertise in managing deployments across multiple regions, building fault-tolerant systems, and driving scalability for mission-critical applications.
Responsibilities
- Architect, manage, and scale Kubernetes clusters for high throughput and low latency across multiple global regions.
- Design and maintain Infrastructure as Code (IaC) to support a fault-tolerant, globally distributed architecture.
- Build and optimize CI/CD pipelines to ensure smooth, zero-downtime deployments.
- Ensure 99.99% availability for high-QPS applications by implementing robust monitoring, incident management, and failover strategies.
- Manage multi-region deployments to enable low-latency, geo-redundant infrastructure.
- Collaborate with cross-functional teams to ensure security, scalability, and operational efficiency.
- Lead and mentor a high-performing DevOps team, fostering a culture of excellence and innovation.
Qualifications
- 7–10 years of experience managing large-scale, high-availability systems.
- Proven expertise in Kubernetes administration, including multi-region deployments and scaling for high QPS.
- Deep experience with IaC tools like Terraform or CloudFormation.
- Hands-on experience with CI/CD pipelines for global, multi-region deployments.
- Strong understanding of cloud platforms (AWS, GCP, or Azure) and geo-redundant architecture.
- Proficiency in Linux, scripting (Bash, Python), and troubleshooting large-scale distributed systems.
- Experience leading teams and solving complex, production-grade system challenges.
Skills: Linux, GCP, Azure, DevOps, Kubernetes, Bash, Python, CI/CD pipelines, architecture, AWS, Infrastructure as Code (IaC)