Role Summary
We are seeking a hands-on DevOps leader to drive cloud architecture, infrastructure operations, and modern engineering practices while managing a small team. This role combines technical leadership with people management, partnering closely with senior leadership to shape strategy, improve system reliability, and enhance developer productivity through automation and AI-assisted development.
Key Responsibilities
- Lead and mentor a DevOps team, setting technical direction and fostering a culture of reliability and continuous improvement.
- Architect and manage scalable AWS infrastructure using infrastructure-as-code tools such as Terraform, CloudFormation, or AWS CDK.
- Build and optimize CI/CD pipelines and containerized environments (Docker, Kubernetes/EKS).
- Oversee cloud security, networking, observability (Datadog), and incident response processes.
- Drive cost optimization, disaster recovery planning, and infrastructure reliability initiatives.
- Develop automation, internal tooling, and documentation to improve operational efficiency and developer experience.
- Collaborate cross-functionally to influence architecture decisions and establish DevOps best practices at scale.
Requirements
- 5+ years in a senior or lead DevOps/SRE role, including experience mentoring or managing engineers.
- Strong expertise in AWS (EC2, EKS/ECS, Lambda, RDS, DynamoDB, S3, CloudFront, IAM, VPC) and infrastructure as code (Terraform preferred).
- Solid understanding of networking (TCP/IP, DNS, routing, load balancing, VPNs) and cloud security best practices (IAM, encryption, secrets management).
- Hands-on experience with Kubernetes (EKS), Docker, and CI/CD pipelines (GitHub Actions).
- Proficiency in scripting or programming (Python, Go, or Bash).
- Experience with observability tools such as Datadog (metrics, logs, tracing, alerting).
- Ability to lead complex initiatives, drive architecture decisions, and collaborate cross-functionally.
- Familiarity with AI-assisted development tools and modern DevOps practices.