We are seeking an experienced DevOps Engineer who thrives in a fast-paced, cloud-native environment. You will play a critical role in designing, implementing, and maintaining our cloud infrastructure using Infrastructure as Code (IaC) principles, ensuring scalability, security, and reliability. This role demands strong expertise in Kubernetes, Python, AWS, and Terraform to drive automation, CI/CD, and cloud infrastructure optimization.
Key Responsibilities
Infrastructure as Code (IaC): Design and implement infrastructure automation using Terraform to manage cloud environments efficiently
Kubernetes Management: Deploy, scale, and manage containerized applications on Kubernetes clusters, ensuring high availability and performance
Cloud Operations: Build and maintain scalable, secure, and cost-efficient cloud solutions on AWS, leveraging best practices for networking, security, and storage
Automation & Scripting: Develop automation scripts using Python to streamline infrastructure management, monitoring, and deployments
CI/CD Pipelines: Implement and optimize CI/CD pipelines to enable seamless application deployment and integration
Monitoring & Logging: Set up monitoring, logging, and alerting solutions using tools such as Prometheus, Grafana, the ELK stack, or CloudWatch to ensure system reliability
Security & Compliance: Implement security best practices across infrastructure, ensuring compliance with industry standards
Collaboration: Work closely with software engineering teams to optimize deployments and support development workflows
Key Requirements
5+ years of experience in DevOps, Site Reliability Engineering (SRE), or Cloud Infrastructure roles
Strong expertise in Terraform for IaC, including provisioning and managing cloud infrastructure
Hands-on experience managing and scaling Kubernetes clusters in production
Proficiency in Python for scripting, automation, and infrastructure management
Deep understanding of AWS services such as EC2, S3, RDS, Lambda, IAM, and VPC
Experience with CI/CD tools like Jenkins, GitHub Actions, GitLab CI/CD, or ArgoCD
Strong understanding of networking, security, and container orchestration best practices
Experience with monitoring and logging tools such as Prometheus, Grafana, the ELK stack, or CloudWatch
Solid understanding of Linux administration and shell scripting
Knowledge of other cloud platforms (GCP, Azure) is a plus
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Engineering and Information Technology
Industries
IT Services and IT Consulting