Role Overview (Core DevOps Engineer)
This is a core DevOps engineering role focused on building, automating, and maintaining scalable infrastructure and deployment systems.
We are looking for a DevOps Engineer with strong hands-on experience in cloud platforms, CI/CD pipelines, and containerized environments. You will work on infrastructure automation, system reliability, and performance optimization across production environments.
The role also offers exposure to GPU-enabled infrastructure for high-performance workloads.
Key Responsibilities
- Build and manage CI/CD pipelines for automated build, test, and deployment
- Design and maintain scalable cloud infrastructure (AWS / Azure / GCP)
- Implement Infrastructure as Code (IaC) using tools like Terraform or CloudFormation
- Manage and optimize containerized environments (Docker, Kubernetes)
- Ensure high availability of systems and maintain their monitoring and logging
- Automate operational tasks to improve efficiency and reduce manual effort
- Troubleshoot production issues and improve system reliability (SRE practices)
- Manage version control and deployment workflows on Git-based systems
- Support and manage GPU-enabled infrastructure for high-performance workloads
Mandatory Requirements
- 3+ years of hands-on experience in DevOps or Cloud Engineering
- Strong experience with CI/CD tools (Jenkins, GitHub Actions, GitLab CI, etc.)
- Experience with cloud platforms (AWS / Azure / GCP)
- Hands-on experience with Docker and Kubernetes
- Strong scripting/programming skills (Python / Bash)
- Understanding of networking, Linux systems, and system architecture
Good to Have
- Experience with Infrastructure as Code (Terraform, Ansible)
- Familiarity with monitoring tools (Prometheus, Grafana, ELK stack)
- Knowledge of security best practices (DevSecOps)
- Experience with microservices architecture
- Exposure to SRE principles and incident management
- Exposure to GPU infrastructure (NVIDIA GPUs, CUDA environments)
- Familiarity with GPU scheduling in Kubernetes