We are seeking a DevOps Engineer to manage our high-availability AWS environment and our on-premises Kubernetes infrastructure. You will be the primary engineer responsible for resolving issues from our backlog. This role is ideal for someone who enjoys the flexibility of the cloud but also understands the grit required to manage bare-metal or virtualized on-prem systems.
Key Responsibilities
- AWS Infrastructure & Automated Operations
  - Patching at Scale: Design and implement automated patching for a fleet of mutable EC2 instances using AWS Systems Manager (SSM) Patch Manager.
  - Infrastructure as Code: Maintain and modularize our cloud footprint using Terraform, ensuring all AWS resources follow the principle of least privilege.
  - EKS: Manage our globally distributed EKS clusters.
  - Automation: Automate operational jobs using Python and Terraform.
- On-Premises Kubernetes Management & Setup
  - Cluster Orchestration: Lead the setup and lifecycle management of on-premises Kubernetes clusters running on Nutanix or VMware.
  - Storage (CSI): Configure and manage persistent storage for stateful workloads using Rook-Ceph or Longhorn, ensuring high data durability.
  - Backup and Restore: Back up and restore clusters and PersistentVolumeClaims (PVCs) as needed.
- Python
  - Use Python to develop and upgrade the existing scripts and tools we rely on for day-to-day operations.
Technical Requirements
- Experience: 4+ years in DevOps, specifically managing hybrid environments.
- AWS Tools: Deep proficiency in SSM (Patch Manager, Run Command, Inventory), EC2, S3, IAM, and VPC. Must have implemented end-to-end hub-and-spoke deployments.
- Kubernetes: Hands-on experience with on-prem setup (not just EKS/GKE). Understanding of etcd management and control plane hardening.
- CI/CD: Expertise in GitLab CI or GitHub Actions, specifically for deploying to both cloud and on-prem targets.
- Security: Strong understanding of DevSecOps—specifically vulnerability scanning and automated compliance.
- Coding/Scripting: Proficiency in Python for development and scripting tasks.