Role: DevOps Lead
Job Description
We are seeking a DevOps Technical Lead with a strong background in infrastructure automation, cloud architecture, and a keen interest in Generative AI technologies. The ideal candidate will lead the development of an Infrastructure Agent powered by GenAI and capable of intelligent provisioning, configuration, observability, and self-healing.
Key Responsibilities
- Lead architecture & design of an intelligent Infra Agent leveraging GenAI capabilities.
- Integrate LLMs and agent frameworks (e.g., OpenAI models, Hugging Face, LangChain) to enhance DevOps workflows.
- Build solutions that automate infrastructure provisioning, CI/CD, incident remediation, and drift detection.
- Develop reusable components and frameworks using IaC (Terraform, Pulumi, CloudFormation) and configuration management tools (Ansible, Chef, etc.).
- Partner with AI/ML engineers and SREs to design intelligent infrastructure decision-making logic.
- Implement secure and scalable infrastructure on cloud platforms (AWS, Azure, GCP).
- Continuously improve agent performance through feedback loops, telemetry, and fine-tuning of models.
- Drive DevSecOps best practices, compliance, and observability.
- Mentor DevOps engineers and collaborate with cross-functional teams (AI/ML, Platform, and Product).
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 8+ years of experience in DevOps, SRE, or Infrastructure Engineering.
- Proven experience in leading infrastructure automation projects and technical teams.
- Expertise with one or more cloud platforms: AWS, Azure, or GCP.
- Deep knowledge of tools such as Terraform, Kubernetes, Helm, Docker, and Jenkins, and of GitOps practices.
- Hands-on experience integrating or building with LLMs / GenAI APIs (e.g., OpenAI, Anthropic, Cohere).
- Familiarity with LangChain, AutoGen, or custom agent frameworks.
- Experience with programming/scripting languages: Python, Go, or Bash.
- Understanding of cloud security, policy as code, and monitoring tools (Prometheus, Grafana, Datadog).
Preferred Qualifications
- Experience building or fine-tuning LLM-based agents for operations or automation tasks.
- Contributions to open-source GenAI or DevOps projects.
- Understanding of MLOps pipelines and AI infrastructure.
(ref:hirist.tech)