Role: DevOps / MLOps Engineer
Experience: 5-8 years
Location: Chennai
Full-time - Hybrid
Note: Looking for candidates with DevOps + MLOps experience only
Key Responsibilities:
- Maintain and support machine learning applications running on Windows and Linux servers in on-premises environments.
- Manage and troubleshoot Kubernetes clusters hosting ML workloads.
- Collaborate with data scientists and engineers to deploy machine learning models reliably and efficiently.
- Implement and maintain monitoring and alerting solutions using Datadog to ensure system health and performance.
- Debug and resolve issues in production environments using Python and monitoring tools.
- Automate operational tasks to improve system reliability and scalability.
- Ensure best practices in security, performance, and availability for ML applications.
- Document system architecture, deployment processes, and troubleshooting guides.
Required Qualifications:
- Proven experience working with Windows and Linux operating systems in production environments.
- Hands-on experience managing on-premises servers, Kubernetes clusters, and Docker containers.
- Strong proficiency in Python programming.
- Solid understanding of machine learning concepts and workflows.
- Experience with machine learning model deployment and lifecycle management.
- Familiarity with monitoring and debugging tools such as Datadog.
- Ability to troubleshoot complex issues in distributed systems.
- Experience with CI/CD pipelines for ML applications.
- Familiarity with the AWS cloud platform.
- Background in Site Reliability Engineering or DevOps practices.
- Strong problem-solving skills and attention to detail.
- Excellent communication and collaboration skills.
📧 Apply now via LinkedIn Easy Apply or share your resume at naveena@intellistaff.in or mahesh@intellistaff.in